org.icepdf.core.util
Class ContentParser

java.lang.Object
  extended by org.icepdf.core.util.ContentParser

public class ContentParser
extends java.lang.Object

The ContentParser is responsible for parsing a page's content streams. The parsed text, images, and other PDF object types are added to the page's Shapes object for later drawing and display.


Field Summary
static float OVERPAINT_ALPHA
           
 
Constructor Summary
ContentParser(Library l, Resources r)
           
 
Method Summary
 GraphicsState getGraphicsState()
          Returns the current graphics state object being used by this content stream.
 Shapes parse(java.io.InputStream source)
          Parses a page's content stream.
 Shapes parseTextBlocks(java.io.InputStream source)
          Specialized method for extracting text from documents.
 void setGraphicsState(GraphicsState graphicState)
          Sets the graphics state object which will be used for the current content parsing.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

OVERPAINT_ALPHA

public static final float OVERPAINT_ALPHA
See Also:
Constant Field Values
Constructor Detail

ContentParser

public ContentParser(Library l,
                     Resources r)
Parameters:
l - PDF library master object.
r - resources
Method Detail

getGraphicsState

public GraphicsState getGraphicsState()
Returns the current graphics state object being used by this content stream.

Returns:
the current graphics state of the content stream. May be null if the parse method has not been called yet.

setGraphicsState

public void setGraphicsState(GraphicsState graphicState)
Sets the graphics state object that will be used for the current content parsing. This method must be called before the parse method; otherwise it will have no effect on the state of the draw operands.

Parameters:
graphicState - graphics state of this content stream
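
The call-order requirement described above can be illustrated with a minimal sketch. It does not use ICEpdf itself: StateStub and ParserStub are hypothetical stand-ins that mimic only the documented contract (the state is null until it is set or until parse runs, and a state set after parse does not change a result that has already been produced). The fallback to a "default" state inside parse is an assumption made for the sketch, not confirmed library behavior.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;

final class GraphicsStateOrderSketch {

    /** Hypothetical stand-in for GraphicsState; it only carries a label. */
    static final class StateStub {
        final String label;

        StateStub(String label) {
            this.label = label;
        }
    }

    /** Hypothetical stand-in for ContentParser; models the call-order contract only. */
    static final class ParserStub {
        private StateStub graphicState; // null until set or until parse runs

        void setGraphicsState(StateStub graphicState) {
            this.graphicState = graphicState;
        }

        StateStub getGraphicsState() {
            return graphicState;
        }

        /** Returns the label of the state that was in effect when parsing started. */
        String parse(InputStream source) {
            if (graphicState == null) {
                // Assumption: a parser with no state supplied falls back to a default one.
                graphicState = new StateStub("default");
            }
            return graphicState.label;
        }
    }

    public static void main(String[] args) {
        InputStream content = new ByteArrayInputStream(new byte[0]);
        ParserStub parser = new ParserStub();

        // No state has been set and parse has not run, so the getter returns null.
        System.out.println(parser.getGraphicsState() == null); // prints true

        // Set the state first, then parse: the custom state is the one used.
        parser.setGraphicsState(new StateStub("custom"));
        String used = parser.parse(content);

        // Setting a state after parse cannot change the result already produced.
        parser.setGraphicsState(new StateStub("too late"));
        System.out.println(used); // prints custom
    }
}
```

With the real classes, the same ordering would apply: call setGraphicsState on the ContentParser instance before passing the content stream to parse.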

parse

public Shapes parse(java.io.InputStream source)
             throws java.lang.InterruptedException
Parses a page's content stream.

Parameters:
source - byte stream containing page content
Returns:
a Shapes object containing all of the page's text and image shapes.
Throws:
java.lang.InterruptedException - if the current parse thread is interrupted.
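
Because parse declares InterruptedException, callers need a policy for interruption. The sketch below shows one common Java idiom: catch the exception, restore the thread's interrupt flag, and return null. StreamParser and parseOrNull are hypothetical names introduced only so the example compiles without ICEpdf; in real code the lambda would be replaced by a call to ContentParser.parse.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

final class InterruptibleParseSketch {

    /** Hypothetical stand-in for the shape of ContentParser.parse; not an ICEpdf type. */
    interface StreamParser<T> {
        T parse(InputStream source) throws InterruptedException;
    }

    /**
     * Runs the parser and returns its result. If the parse is interrupted,
     * returns null and restores the interrupt flag so that code further up
     * the call stack can still observe the interruption.
     */
    static <T> T parseOrNull(StreamParser<T> parser, InputStream source) {
        try {
            return parser.parse(source);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
    }

    public static void main(String[] args) {
        InputStream content = new ByteArrayInputStream(
                "BT (Hello) Tj ET".getBytes(StandardCharsets.US_ASCII));

        // A stub parser that completes normally.
        String result = parseOrNull(in -> "parsed", content);
        System.out.println(result); // prints parsed

        // A stub parser that simulates an interrupted parse thread.
        Object cancelled = parseOrNull(in -> {
            throw new InterruptedException("cancelled");
        }, content);
        System.out.println(cancelled); // prints null

        // Thread.interrupted() reports and clears the flag restored above.
        System.out.println(Thread.interrupted()); // prints true
    }
}
```

Returning null is only one possible policy; a caller could equally rethrow the exception or wrap it, depending on how the surrounding application cancels page loading.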

parseTextBlocks

public Shapes parseTextBlocks(java.io.InputStream source)
Specialized method for extracting text from documents.

Parameters:
source - content stream source.
Returns:
a Shapes object in which each entry is the text extracted from a text block.