public class PDFText2Markdown extends PDFTextStripper
Fields inherited from class PDFTextStripper: charactersByArticle, document, LINE_SEPARATOR, output

| Constructor and Description |
|---|
| PDFText2Markdown() - Constructor. |

| Modifier and Type | Method and Description |
|---|---|
| protected float | computeFontHeight(PDFont font) - Compute the font height. |
| protected void | endArticle() - Write out the article separator. |
| protected void | showGlyph(Matrix textRenderingMatrix, PDFont font, int code, String unicode, Vector displacement) - Called when a glyph is to be processed. |
| protected void | startArticle(boolean isLTR) - Write out the article separator with proper text direction information. |
| protected void | writeParagraphEnd() - Writes the Markdown paragraph end to the output. |
| protected void | writeString(String chars) - Write a string to the output stream and escape some Markdown characters. |
| protected void | writeString(String text, List<TextPosition> textPositions) - Write a string to the output stream, maintain font state, and escape some Markdown characters. |
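
The two writeString methods are described as escaping some Markdown characters, and the two-argument form as also maintaining font state. The following is a minimal, self-contained sketch of those two ideas and does not reproduce the class's implementation. The class `MarkdownWriteSketch`, the `Run` type, the `render` helper, and the set of escaped characters are all hypothetical; this page does not list which characters PDFText2Markdown escapes or which font attributes it tracks.

```java
import java.util.Arrays;
import java.util.List;

final class MarkdownWriteSketch {

    // Assumption: an illustrative set of characters to escape; the set that
    // PDFText2Markdown actually escapes is not listed on this page.
    private static final String SPECIAL = "\\`*_[]#";

    /** One run of text plus a flag saying whether its font is bold (hypothetical type). */
    static final class Run {
        final String text;
        final boolean bold;

        Run(String text, boolean bold) {
            this.text = text;
            this.bold = bold;
        }
    }

    /** Prefixes every character found in SPECIAL with a backslash. */
    static String escape(String chars) {
        StringBuilder sb = new StringBuilder(chars.length());
        for (int i = 0; i < chars.length(); i++) {
            char c = chars.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    /** Emits "**" whenever the bold state changes and escapes the text of each run. */
    static String render(List<Run> runs) {
        StringBuilder out = new StringBuilder();
        boolean bold = false; // font state carried from one run to the next
        for (Run run : runs) {
            if (run.bold != bold) {
                out.append("**");
                bold = run.bold;
            }
            out.append(escape(run.text));
        }
        if (bold) {
            out.append("**"); // close a bold span left open by the last run
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<Run> runs = Arrays.asList(
                new Run("Total_sum: ", false),
                new Run("42", true),
                new Run(" items", false));
        System.out.println(render(runs)); // prints Total\_sum: **42** items
    }
}
```

Keeping the bold flag outside the loop is what lets the marker be written only when the state changes, rather than once per run; a fuller version would presumably track italics as well and take care to close markers in nesting order.
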
Methods inherited from class PDFTextStripper: beginMarkedContentSequence, endDocument, endMarkedContentSequence, endPage, getAddMoreFormatting, getArticleEnd, getArticleStart, getAverageCharTolerance, getCharactersByArticle, getCurrentPageNo, getDropThreshold, getEndBookmark, getEndPage, getIgnoreContentStreamSpaceGlyphs, getIndentThreshold, getLineSeparator, getListItemPatterns, getOutput, getPageEnd, getPageStart, getParagraphEnd, getParagraphStart, getSeparateByBeads, getSortByPosition, getSpacingTolerance, getStartBookmark, getStartPage, getSuppressDuplicateOverlappingText, getText, getWordSeparator, matchPattern, processPage, processPages, processTextPosition, setAddMoreFormatting, setArticleEnd, setArticleStart, setAverageCharTolerance, setDropThreshold, setEndBookmark, setEndPage, setIgnoreContentStreamSpaceGlyphs, setIndentThreshold, setLineSeparator, setListItemPatterns, setPageEnd, setPageStart, setParagraphEnd, setParagraphStart, setShouldSeparateByBeads, setSortByPosition, setSpacingTolerance, setStartBookmark, setStartPage, setSuppressDuplicateOverlappingText, setWordSeparator, startArticle, startDocument, startPage, writeCharacters, writeLineSeparator, writePage, writePageEnd, writePageStart, writeParagraphSeparator, writeParagraphStart, writeText, writeWordSeparator

Methods inherited from class PDFStreamEngine: addOperator, applyTextAdjustment, beginText, decreaseLevel, endText, getAppearance, getCurrentPage, getGraphicsStackSize, getGraphicsState, getInitialMatrix, getLevel, getResources, getTextLineMatrix, getTextMatrix, increaseLevel, isShouldProcessColorOperators, operatorException, processAnnotation, processChildStream, processOperator, processOperator, processSoftMask, processTilingPattern, processTilingPattern, processTransparencyGroup, processType3Stream, registerOperatorProcessor, restoreGraphicsStack, restoreGraphicsState, saveGraphicsStack, saveGraphicsState, setLineDashPattern, setTextLineMatrix, setTextMatrix, showAnnotation, showFontGlyph, showFontGlyph, showForm, showGlyph, showText, showTextString, showTextStrings, showTransparencyGroup, showType3Glyph, showType3Glyph, transformedPoint, transformWidth, unsupportedOperator

public PDFText2Markdown()
throws IOException
Throws:
IOException - If there is an error during initialization.

protected void startArticle(boolean isLTR)
throws IOException
Overrides:
startArticle in class PDFTextStripper
Parameters:
isLTR - true if direction of text is left to right
Throws:
IOException - If there is an error writing to the stream.

protected void endArticle()
throws IOException
Overrides:
endArticle in class PDFTextStripper
Throws:
IOException - If there is an error writing to the stream.

protected void writeString(String text, List<TextPosition> textPositions) throws IOException
Overrides:
writeString in class PDFTextStripper
Parameters:
text - The text to write to the stream.
textPositions - The corresponding text positions.
Throws:
IOException - If there is an error writing to the stream.

protected void writeString(String chars) throws IOException
Overrides:
writeString in class PDFTextStripper
Parameters:
chars - String to be written to the stream.
Throws:
IOException - If there is an error writing to the stream.

protected void writeParagraphEnd()
throws IOException
Write something (if defined) at the end of a paragraph.
Overrides:
writeParagraphEnd in class PDFTextStripper
Throws:
IOException - if something went wrong

protected void showGlyph(Matrix textRenderingMatrix, PDFont font, int code, String unicode, Vector displacement) throws IOException
Overrides:
showGlyph in class PDFStreamEngine
Parameters:
textRenderingMatrix - the current text rendering matrix, Trm
font - the current font
code - internal PDF character code for the glyph
unicode - the Unicode text for this glyph, or null if the PDF does not provide it
displacement - the displacement (i.e. advance) of the glyph in text space
Throws:
IOException - if the glyph cannot be processed

protected float computeFontHeight(PDFont font) throws IOException
Parameters:
font - the font.
Throws:
IOException - if there is an error while getting the font bounding box.

Copyright © 2002–2025 The Apache Software Foundation. All rights reserved.