java.lang.Object
com.lowagie.text.pdf.parser.ParsedTextImpl
com.lowagie.text.pdf.parser.ParsedText
- All Implemented Interfaces:
TextAssemblyBuffer
- Author:
- dgd
-
Method Summary
Modifier and Type / Method / Description
void accumulate(TextAssembler textAssembler, String contextName)
- We pass ourselves to the assembler, which is a visitor, so that it can accumulate information on this text depending on its type.
void assemble(TextAssembler textAssembler)
boolean breakBefore()
protected String decode(PdfString pdfString)
- Decodes a PdfString (which will contain glyph ids encoded in the font's encoding) based on the active font, and determines the Unicode equivalent.
protected String decode(String in)
- Decodes a Java String containing glyph ids encoded in the font's encoding, and determines the Unicode equivalent.
List<Word> getAsPartialWords()
- Break this string if there are spaces within it.
FinalText getFinalText(PdfReader reader, int page, TextAssembler assembler, boolean useMarkup)
String getFontCodes()
String getText()
- When returning the text from this item, we need to decode the code points we have.
float getUnscaledTextWidth(GraphicsState gs)
boolean shouldNotSplit()
String toString()
Methods inherited from class com.lowagie.text.pdf.parser.ParsedTextImpl
getAscent, getBaseline, getDescent, getEndPoint, getSingleSpaceWidth, getStartPoint, getWidth
-
Method Details
-
decode
Decodes a Java String containing glyph ids encoded in the font's encoding, and determines the Unicode equivalent.
- Parameters:
in - the String that needs to be decoded
- Returns:
- the decoded String
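
As a rough illustration of what "decoding glyph ids to Unicode" means, the sketch below walks a string of font codes and looks each one up in a code-to-Unicode table. It does not use the library: `DecodeSketch`, the `toUnicode` map, and the fallback of keeping unmapped codes unchanged are all assumptions made for this example, not the actual behaviour of `decode`.

```java
import java.util.Map;

// Hypothetical, library-independent sketch of glyph-code decoding.
final class DecodeSketch {

    // Maps each char of fontCodes (treated as a glyph code) to its Unicode
    // text via the supplied table. Codes missing from the table are kept
    // as-is; that fallback is an assumption of this sketch.
    static String decode(String fontCodes, Map<Integer, String> toUnicode) {
        StringBuilder decoded = new StringBuilder();
        for (int i = 0; i < fontCodes.length(); i++) {
            int code = fontCodes.charAt(i);
            String mapped = toUnicode.get(code);
            decoded.append(mapped != null ? mapped : String.valueOf((char) code));
        }
        return decoded.toString();
    }

    public static void main(String[] args) {
        // Made-up encoding: glyph 1 is "H", glyph 2 is "i".
        Map<Integer, String> table = Map.of(1, "H", 2, "i");
        String fontCodes = new String(new char[] {1, 2, '!'});
        System.out.println(decode(fontCodes, table));
    }
}
```

A single glyph code may map to more than one character (a ligature, for example), which is why the table in this sketch maps to `String` rather than `char`.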
-
decode
Decodes a PdfString (which will contain glyph ids encoded in the font's encoding) based on the active font, and determines the Unicode equivalent.
Note (refers to the constructor): it should only be called when the origin for text display is at (0,0) and the graphical state reflects all transformations of the baseline. This is in text space units.
- Parameters:
pdfString - the PdfString that needs to be decoded
- Returns:
- the decoded String
- Since:
- 2.1.7
-
getAsPartialWords
Break this string if there are spaces within it. If so, we mark the new Words appropriately for later assembly.
We are guaranteed that every space (internal word break) in this parsed text object will create a new word in the result of this method. We are not guaranteed that these Word objects are actually words until they have been assembled.
The word following any space preserves that space in its string value, so that the assembler will not erroneously merge words that should be separate, regardless of the spacing.
- Returns:
- list of Word objects.
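
The splitting rule described above can be sketched without the library. The example below works on plain strings instead of Word objects; `PartialWordsSketch` and `splitKeepingLeadingSpace` are hypothetical names, and the handling of consecutive spaces is an assumption of this sketch rather than documented behaviour.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: break text at spaces, keeping each space at the
// start of the fragment that follows it.
final class PartialWordsSketch {

    static List<String> splitKeepingLeadingSpace(String text) {
        List<String> fragments = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            // A space closes the fragment collected so far (if any) ...
            if (c == ' ' && current.length() > 0) {
                fragments.add(current.toString());
                current.setLength(0);
            }
            // ... and becomes the first character of the next fragment.
            current.append(c);
        }
        if (current.length() > 0) {
            fragments.add(current.toString());
        }
        return fragments;
    }

    public static void main(String[] args) {
        for (String fragment : splitKeepingLeadingSpace("quick brown fox")) {
            System.out.println("[" + fragment + "]");
        }
    }
}
```

Because the second and third fragments start with a space, a later assembly step can tell that they must not be merged with the fragment before them, whatever the measured spacing is.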
-
getUnscaledTextWidth
- Parameters:
gs - graphics state including current transformation to page coordinates from text measurement
- Returns:
- the unscaled (i.e. in Text space) width of our text
-
accumulate
We pass ourselves to the assembler, which is a visitor, so that it can accumulate information on this text depending on its type. The result is calculated by a final "assembly" phase, after accumulation is done. This is because we may have non-contiguous items in a PDF text stream.
- Parameters:
textAssembler - the assembler that is visiting us.
contextName - name of the surrounding markup element/"context" if we're generating tagged output.
-
assemble
- Parameters:
textAssembler - we may pass ourselves to this assembler again during the final assembly process.
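
The two-phase visitor flow behind accumulate and assemble can be sketched as follows. This is a minimal stand-alone illustration, not the library's code: `SketchAssembler`, `SketchText`, `CollectingAssembler`, and the method names `process`, `renderText`, and `result` are hypothetical stand-ins chosen for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical driver showing accumulation first, assembly afterwards.
final class VisitorSketch {
    public static void main(String[] args) {
        CollectingAssembler assembler = new CollectingAssembler();
        new SketchText("Hello").accumulate(assembler, "p");
        new SketchText(", world").accumulate(assembler, "p");
        System.out.println(assembler.result());
    }
}

// Hypothetical visitor interface standing in for a text assembler.
interface SketchAssembler {
    void process(SketchText text, String contextName);
    void renderText(SketchText text);
}

// Hypothetical text item that passes itself to the visitor.
final class SketchText {
    private final String text;

    SketchText(String text) {
        this.text = text;
    }

    String getText() {
        return text;
    }

    // Phase 1: hand ourselves to the visitor so it can collect us.
    void accumulate(SketchAssembler assembler, String contextName) {
        assembler.process(this, contextName);
    }

    // Phase 2: called once accumulation is finished.
    void assemble(SketchAssembler assembler) {
        assembler.renderText(this);
    }
}

// Collects items during accumulation and builds the output only at the end.
final class CollectingAssembler implements SketchAssembler {
    private final List<SketchText> pending = new ArrayList<>();
    private final StringBuilder out = new StringBuilder();

    @Override
    public void process(SketchText text, String contextName) {
        pending.add(text); // contextName is ignored in this plain-text sketch
    }

    @Override
    public void renderText(SketchText text) {
        out.append(text.getText());
    }

    // Final assembly: every accumulated item is visited a second time.
    String result() {
        for (SketchText item : pending) {
            item.assemble(this);
        }
        pending.clear();
        return out.toString();
    }
}
```

Deferring the output to the second pass is what lets an assembler reorder or merge items that arrived non-contiguously; this sketch simply keeps arrival order.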
-
getText
When returning the text from this item, we need to decode the code points we have.
- Specified by:
getText in interface TextAssemblyBuffer
- Overrides:
getText in class ParsedTextImpl
- Returns:
- the text to render
-
getFontCodes
- Returns:
- a string whose characters represent code points in a possibly two-byte font
-
getFinalText
public FinalText getFinalText(PdfReader reader, int page, TextAssembler assembler, boolean useMarkup)
- Parameters:
reader - PdfReader that knows about our document (size, etc. available here).
page - which page we are extracting text from.
assembler - builds the result by accepting content from text components of various sorts.
useMarkup - whether to generate tagged text or just plain text.
- Returns:
- the final text ready to concatenate into result string.
-
toString
-
shouldNotSplit
public boolean shouldNotSplit()
- Specified by:
shouldNotSplit in class ParsedTextImpl
- Returns:
- true if this was extracted from a string containing spaces, in which case we assume further splitting is not needed.
-
breakBefore
public boolean breakBefore()
- Specified by:
breakBefore in class ParsedTextImpl
- Returns:
- a boolean value
-