org.icepdf.core.util.content
Class OContentParser

java.lang.Object
  extended by org.icepdf.core.util.content.AbstractContentParser
      extended by org.icepdf.core.util.content.OContentParser
All Implemented Interfaces:
ContentParser

public class OContentParser
extends AbstractContentParser

The ContentParser is responsible for parsing a page's content streams. The parsed text, images, and other PDF object types are added to the page's Shapes object for later drawing and display.
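
The consume_* methods inherited from AbstractContentParser appear to be named after the PDF content stream operators they handle, with characters that are not legal in Java identifiers spelled out (for example, consume_T_star for the T* operator). The sketch below illustrates that naming scheme with a small lookup table. It is not ICEpdf code: the class OperatorNames and the method handlerFor are hypothetical names invented for this example, and the table only contains handler names that appear in the inherited method list on this page.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of ICEpdf: maps a PDF content stream
// operator token to the name of the inherited consume_* method that
// appears to handle it.
final class OperatorNames {
    private static final Map<String, String> HANDLERS = new LinkedHashMap<>();

    static {
        // Graphics state operators.
        HANDLERS.put("q", "consume_q");
        HANDLERS.put("Q", "consume_Q");
        HANDLERS.put("cm", "consume_cm");
        // Text operators.
        HANDLERS.put("Tf", "consume_Tf");
        HANDLERS.put("Tj", "consume_Tj");
        HANDLERS.put("TJ", "consume_TJ");
        // Operators whose symbols cannot appear in a Java identifier.
        HANDLERS.put("T*", "consume_T_star");
        HANDLERS.put("f*", "consume_f_star");
        HANDLERS.put("'", "consume_single_quote");
        HANDLERS.put("\"", "consume_double_quote");
    }

    // Returns the handler name, or null if the operator is not in this table.
    static String handlerFor(String operator) {
        return HANDLERS.get(operator);
    }

    public static void main(String[] args) {
        for (String op : new String[] {"q", "cm", "Tf", "Tj", "T*", "'", "Q"}) {
            System.out.println(op + " -> " + handlerFor(op));
        }
    }
}
```

The table is deliberately partial; operators that are not listed return null rather than a guessed method name.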


Field Summary
 
Fields inherited from class org.icepdf.core.util.content.AbstractContentParser
geometricPath, glyph2UserSpaceScale, graphicState, inTextBlock, library, oCGs, OVERPAINT_ALPHA, resources, shapes, stack, textBlockBase
 
Constructor Summary
OContentParser(Library l, Resources r)
           
 
Method Summary
 ContentParser parse(byte[][] streamBytes)
          Parses a page's content stream.
 Shapes parseTextBlocks(byte[][] source)
          Specialized method for extracting text from documents.
 
Methods inherited from class org.icepdf.core.util.content.AbstractContentParser
applyTextScaling, commonFill, commonOverPrintAlpha, commonStroke, consume_b_star, consume_B_star, consume_b, consume_B, consume_BDC, consume_BMC, consume_c, consume_cm, consume_cs, consume_CS, consume_d, consume_d0, consume_d1, consume_Do, consume_double_quote, consume_DP, consume_EMC, consume_f_star, consume_f, consume_F, consume_g, consume_G, consume_gs, consume_h, consume_i, consume_j, consume_J, consume_k, consume_K, consume_L, consume_M, consume_m, consume_MP, consume_n, consume_q, consume_Q, consume_re, consume_rg, consume_RG, consume_ri, consume_s, consume_S, consume_sc, consume_SC, consume_sh, consume_single_quote, consume_T_star, consume_Tc, consume_Td, consume_TD, consume_Tf, consume_Tj, consume_TJ, consume_TL, consume_tm, consume_Tr, consume_Ts, consume_Tw, consume_Tz, consume_v, consume_W_star, consume_W, consume_w, consume_y, drawModeFill, drawModeFillStroke, drawModeStroke, drawString, getGraphicsState, getShapes, getStack, setAlpha, setGlyph2UserSpaceScale, setGraphicsState, setStroke
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

OContentParser

public OContentParser(Library l,
                      Resources r)
Parameters:
l - PDF library master object.
r - resources
Method Detail

parse

public ContentParser parse(byte[][] streamBytes)
                    throws java.lang.InterruptedException,
                           java.io.IOException
Parses a page's content stream.

Specified by:
parse in interface ContentParser
Specified by:
parse in class AbstractContentParser
Parameters:
streamBytes - byte arrays containing the page's content streams
Returns:
a ContentParser whose Shapes object (see getShapes) contains all of the page's text and image shapes.
Throws:
java.lang.InterruptedException - if the current parsing thread is interrupted.
java.io.IOException - if the content stream ends unexpectedly.
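
Because parse declares both InterruptedException and IOException, a caller has to decide how to handle each. The following is a minimal sketch of one common approach, written against a hypothetical stand-in interface (StreamParser) so that it compiles without ICEpdf on the classpath. The names StreamParser, ParseCallSketch, and parseQuietly are assumptions made for this example and are not part of the library.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in with the same method shape as parse(byte[][]);
// declared here only so the sketch compiles without ICEpdf.
interface StreamParser {
    Object parse(byte[][] streamBytes) throws InterruptedException, IOException;
}

final class ParseCallSketch {
    // Returns the parse result, or null if parsing was interrupted or failed.
    static Object parseQuietly(StreamParser parser, byte[][] streams) {
        try {
            return parser.parse(streams);
        } catch (InterruptedException e) {
            // Restore the interrupt flag so code further up can still see it.
            Thread.currentThread().interrupt();
            return null;
        } catch (IOException e) {
            // For example, the content stream ended unexpectedly.
            return null;
        }
    }

    public static void main(String[] args) {
        // A page may carry several content streams, hence the byte[][].
        byte[][] streams = {
            "q 1 0 0 1 10 10 cm".getBytes(StandardCharsets.ISO_8859_1),
            "Q".getBytes(StandardCharsets.ISO_8859_1)
        };
        // A trivial lambda stands in for a real parser.
        Object result = parseQuietly(s -> "parsed " + s.length + " streams", streams);
        System.out.println(result);
    }
}
```

Re-setting the interrupt flag rather than swallowing the exception keeps cancellation visible to whatever code scheduled the parse; whether returning null is appropriate depends on the caller.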

parseTextBlocks

public Shapes parseTextBlocks(byte[][] source)
                       throws java.io.UnsupportedEncodingException
Specialized method for extracting text from documents.

Specified by:
parseTextBlocks in interface ContentParser
Specified by:
parseTextBlocks in class AbstractContentParser
Parameters:
source - content stream source.
Returns:
a Shapes object in which each entry is the text extracted from a text block.
Throws:
java.io.UnsupportedEncodingException - if an encoding error occurs.
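
In a PDF content stream, text is shown between the BT and ET operators, which is what "text block" refers to here. The sketch below is a simplified illustration of that idea and makes no claim about how parseTextBlocks is implemented: it returns a list of strings rather than a Shapes object, handles only literal strings in parentheses (not hex strings), and does not decode escape sequences or font encodings. TextBlockSketch and textBlocks are hypothetical names.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Simplified illustration, not ICEpdf code: collects the literal strings
// found inside each BT ... ET text block of a page's content streams.
final class TextBlockSketch {
    static List<String> textBlocks(byte[][] source) {
        // Treat the streams as one logical stream, separated by whitespace.
        StringBuilder all = new StringBuilder();
        for (byte[] stream : source) {
            all.append(new String(stream, StandardCharsets.ISO_8859_1)).append(' ');
        }
        List<String> blocks = new ArrayList<>();
        StringBuilder current = null; // non-null while inside BT ... ET
        int i = 0;
        while (i < all.length()) {
            char c = all.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
            } else if (c == '(') {
                // Literal string: read to the matching ')', skipping over
                // backslash escapes and tracking nested parentheses.
                StringBuilder text = new StringBuilder();
                int depth = 1;
                i++;
                while (i < all.length() && depth > 0) {
                    char s = all.charAt(i++);
                    if (s == '\\' && i < all.length()) {
                        text.append(all.charAt(i++)); // escaped char kept as-is
                    } else if (s == '(') {
                        depth++;
                        text.append(s);
                    } else if (s == ')') {
                        depth--;
                        if (depth > 0) {
                            text.append(s);
                        }
                    } else {
                        text.append(s);
                    }
                }
                if (current != null) {
                    current.append(text);
                }
            } else {
                // Any other token runs until whitespace or the start of a string.
                int start = i;
                while (i < all.length() && !Character.isWhitespace(all.charAt(i))
                        && all.charAt(i) != '(') {
                    i++;
                }
                String token = all.substring(start, i);
                if (token.equals("BT")) {
                    current = new StringBuilder();
                } else if (token.equals("ET") && current != null) {
                    blocks.add(current.toString());
                    current = null;
                }
            }
        }
        return blocks;
    }

    public static void main(String[] args) {
        byte[][] source = {
            "BT /F1 12 Tf (Hello) Tj ET".getBytes(StandardCharsets.ISO_8859_1),
            "BT [(Wor) -20 (ld)] TJ ET".getBytes(StandardCharsets.ISO_8859_1)
        };
        System.out.println(textBlocks(source));
    }
}
```

Strings that appear outside a BT ... ET pair are read but discarded, so only text shown within a text block ends up in the result.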