org.icepdf.core.pobjects
Class Document

java.lang.Object
  extended by org.icepdf.core.pobjects.Document

public class Document
extends java.lang.Object

The Document class represents a PDF document and provides access to the hierarchy of objects contained in the body section of the PDF document. Most of the objects in the hierarchy are dictionaries that contain references to page content and other objects such as annotations. For more information on the document object hierarchy, see the ICEpdf Developer's Guide.

The Document class also provides access to methods responsible for rendering PDF document content. Methods are available to capture page content to a graphics context or extract image and text data on a page-by-page basis.

If your PDF rendering application will be accessing encrypted documents, it is important to implement the SecurityCallback. This interface provides methods for getting password data from a user if needed.

Since:
1.0

Constructor Summary
Document()
          Creates a new instance of a Document.
 
Method Summary
protected  long appendIncrementalUpdate(java.io.OutputStream out, long documentLength)
          If ICEpdf Pro is in use, appends an incremental update of any edits.
 void dispose()
          Dispose of Document, freeing up all used resources.
 Catalog getCatalog()
          Gets the Document's Catalog as specified by the Document hierarchy.
 java.lang.String getDocumentLocation()
          Returns the file location or URL of this Document.
 java.lang.String getDocumentOrigin()
          Returns the origin (filepath or URL) of this Document.
 PInfo getInfo()
          Gets the document's information as specified in the PTrailer in the document hierarchy.
static java.lang.String getLibraryVersion()
          Gets the version number of the ICEpdf rendering core.
 int getNumberOfPages()
          Returns the total number of pages in this document.
 PDimension getPageDimension(int pageNumber, float userRotation)
          Gets the page dimension of the indicated page number using the specified rotation factor.
 PDimension getPageDimension(int pageNumber, float userRotation, float userZoom)
          Gets the page dimension of the indicated page number using the specified rotation and zoom settings.
 java.awt.Image getPageImage(int pageNumber, int renderHintType, int pageBoundary, float userRotation, float userZoom)
          Gets an Image of the specified page.
 java.util.Vector getPageImages(int pageNumber)
          Gets a vector of Images where each index represents an image inside the specified page.
 PageText getPageText(int pageNumber)
          Exposes a page's PageText object, which can be used to get text within the PDF document.
 PageTree getPageTree()
          Gets the Document Catalog's PageTree entry as specified by the Document hierarchy.
 PageText getPageViewText(int pageNumber)
          Exposes a page's PageText object, which can be used to get text within the PDF document.
 SecurityManager getSecurityManager()
          Gets the security manager for this document.
 StateManager getStateManager()
          Gets an instance of the document state manager, which stores references to objects that need to be written to file.
 void paintPage(int pageNumber, java.awt.Graphics g, int renderHintType, int pageBoundary, float userRotation, float userZoom)
          Paints the contents of the given page number to the graphics context using the specified rotation, zoom, rendering hints and page boundary.
 long saveToOutputStream(java.io.OutputStream out)
          Copies the pre-existing PDF file, and appends an incremental update for any edits, to the specified OutputStream.
 void setByteArray(byte[] data, int offset, int length, java.lang.String pathOrURL)
          Loads a PDF file from the given byte array and initiates the document's Catalog.
 void setFile(java.lang.String filepath)
          Loads a PDF file from the given path and initiates the document's Catalog.
 void setInputStream(java.io.InputStream in, java.lang.String pathOrURL)
          Loads a PDF file from the given input stream and initiates the document's Catalog.
 void setInputStream(SeekableInput in, java.lang.String pathOrURL)
          Loads a PDF file from the given SeekableInput stream and initiates the document's Catalog.
 void setSecurityCallback(SecurityCallback securityCallback)
          Sets the security callback to be used for this document.
 void setUrl(java.net.URL url)
          Loads a PDF file from the given URL and initiates the document's Catalog.
 long writeToOutputStream(java.io.OutputStream out)
          Takes the internal PDF data, which may be in a file or in RAM, and writes it to the provided OutputStream.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

Document

public Document()
Creates a new instance of a Document. A Document instance represents one PDF document.

Method Detail

getLibraryVersion

public static java.lang.String getLibraryVersion()
Gets the version number of the ICEpdf rendering core. This is not the version number of the PDF format used to encode this document.

Returns:
version number of ICEpdf's rendering core.

setFile

public void setFile(java.lang.String filepath)
             throws PDFException,
                    PDFSecurityException,
                    java.io.IOException
Loads a PDF file from the given path and initiates the document's Catalog.

Parameters:
filepath - path of PDF document.
Throws:
PDFException - if the file encoding is invalid.
PDFSecurityException - if a security provider cannot be found or there is an error decrypting the file.
java.io.IOException - if a problem occurs while setting up or parsing the file.

setUrl

public void setUrl(java.net.URL url)
            throws PDFException,
                   PDFSecurityException,
                   java.io.IOException
Loads a PDF file from the given URL and initiates the document's Catalog. If the system property org.icepdf.core.streamcache.enabled=true, the file will be cached to a temp file; otherwise, the complete document stream will be stored in memory.

Parameters:
url - location of file.
Throws:
PDFException - if the file encoding is invalid.
PDFSecurityException - if a security provider cannot be found or there is an error decrypting the file.
java.io.IOException - if a problem occurs while downloading, setting up, or parsing the file.
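
The caching behaviour described for setUrl is controlled by a system property, so it has to be decided before the document is loaded. The following is a minimal sketch of setting and reading back that property with plain JDK calls. Only the property name comes from this page; the class and method names are hypothetical, and the sketch assumes the library reads the property at load time (passing -Dorg.icepdf.core.streamcache.enabled=true on the JVM command line avoids relying on that assumption).

```java
// Sketch only: StreamCacheConfig and its methods are hypothetical helpers,
// not part of ICEpdf. The property name is taken from the setUrl description.
final class StreamCacheConfig {
    static final String STREAM_CACHE_PROPERTY = "org.icepdf.core.streamcache.enabled";

    /** Requests temp-file caching (true) or in-memory storage (false) for later loads. */
    static void setStreamCacheEnabled(boolean enabled) {
        System.setProperty(STREAM_CACHE_PROPERTY, Boolean.toString(enabled));
    }

    /** Reads the property back; Boolean.getBoolean treats an unset property as false. */
    static boolean isStreamCacheEnabled() {
        return Boolean.getBoolean(STREAM_CACHE_PROPERTY);
    }

    public static void main(String[] args) {
        setStreamCacheEnabled(true);
        System.out.println(STREAM_CACHE_PROPERTY + "=" + isStreamCacheEnabled());
        // A call such as document.setUrl(url) or document.setInputStream(in, origin)
        // would follow here in a real application.
    }
}
```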

setInputStream

public void setInputStream(java.io.InputStream in,
                           java.lang.String pathOrURL)
                    throws PDFException,
                           PDFSecurityException,
                           java.io.IOException
Loads a PDF file from the given input stream and initiates the document's Catalog. If the system property org.icepdf.core.streamcache.enabled=true, the file will be cached to a temp file; otherwise, the complete document stream will be stored in memory.

Parameters:
in - input stream containing PDF data
pathOrURL - value assigned to document origin
Throws:
PDFException - if the stream or file encoding is invalid.
PDFSecurityException - if a security provider cannot be found or there is an error decrypting the file.
java.io.IOException - if a problem occurs while setting up or parsing the SeekableInput.

setByteArray

public void setByteArray(byte[] data,
                         int offset,
                         int length,
                         java.lang.String pathOrURL)
                  throws PDFException,
                         PDFSecurityException,
                         java.io.IOException
Loads a PDF file from the given byte array and initiates the document's Catalog. If the system property org.icepdf.core.streamcache.enabled=true, the file will be cached to a temp file; otherwise, the complete document stream will be stored in memory. The given byte array is not necessarily copied and may be used directly, so do not modify it after passing it to this method.

Parameters:
data - byte array containing PDF data
offset - the index into the byte array where the PDF data begins
length - the number of bytes in the byte array belonging to the PDF data
pathOrURL - value assigned to document origin
Throws:
PDFException - if the stream or file encoding is invalid.
PDFSecurityException - if a security provider cannot be found or there is an error decrypting the file.
java.io.IOException - if a problem occurs while setting up or parsing the SeekableInput.
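
Because setByteArray may use the supplied array directly, a caller that wants to keep reusing its own buffer can hand over a private copy of the PDF region instead. The sketch below shows one way to do that with the JDK alone. PdfBytes and copyRegion are hypothetical names, and "in-memory.pdf" is a placeholder origin string; copying costs extra memory, so it is only worthwhile when the original buffer really will be modified.

```java
import java.util.Arrays;

// Sketch only: PdfBytes is a hypothetical helper, not part of ICEpdf.
final class PdfBytes {
    /** Returns a private copy of data[offset .. offset + length - 1]. */
    static byte[] copyRegion(byte[] data, int offset, int length) {
        // Written to avoid integer overflow in offset + length.
        if (offset < 0 || length < 0 || offset > data.length - length) {
            throw new IndexOutOfBoundsException(
                "offset=" + offset + ", length=" + length + ", array length=" + data.length);
        }
        return Arrays.copyOfRange(data, offset, offset + length);
    }

    public static void main(String[] args) {
        // A reusable buffer whose PDF bytes start at index 2 and span 4 bytes.
        byte[] buffer = {0, 0, '%', 'P', 'D', 'F', 0};
        byte[] pdfOnly = copyRegion(buffer, 2, 4);

        buffer[2] = 0; // the caller reuses its buffer; the copy is unaffected
        System.out.println("first byte of copy: " + (char) pdfOnly[0]);

        // In a real application the copy would then be passed on, for example:
        // document.setByteArray(pdfOnly, 0, pdfOnly.length, "in-memory.pdf");
    }
}
```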

setInputStream

public void setInputStream(SeekableInput in,
                           java.lang.String pathOrURL)
                    throws PDFException,
                           PDFSecurityException,
                           java.io.IOException
Loads a PDF file from the given SeekableInput stream and initiates the document's Catalog.

Parameters:
in - input stream containing PDF data
pathOrURL - value assigned to document origin
Throws:
PDFException - if the stream or file encoding is invalid.
PDFSecurityException - if a security provider cannot be found or there is an error decrypting the file.
java.io.IOException - if a problem occurs while setting up or parsing the SeekableInput.

getPageDimension

public PDimension getPageDimension(int pageNumber,
                                   float userRotation)
Gets the page dimension of the indicated page number using the specified rotation factor.

Parameters:
pageNumber - Page number for the given dimension. The page number is zero-based.
userRotation - Rotation, in degrees, that has been applied to page when calculating the dimension.
Returns:
page dimension for the specified page number
See Also:
getPageDimension(int, float, float)

getPageDimension

public PDimension getPageDimension(int pageNumber,
                                   float userRotation,
                                   float userZoom)
Gets the page dimension of the indicated page number using the specified rotation and zoom settings. If the page does not exist then a zero dimension is returned.

Parameters:
pageNumber - Page number for the given dimension. The page number is zero-based.
userRotation - Rotation, in degrees, that has been applied to page when calculating the dimension.
userZoom - Any deviation from the page's actual size, by zooming in or out.
Returns:
page dimension for the specified page number.
See Also:
getPageDimension(int, float)
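
To build intuition for how rotation and zoom change a page dimension, the sketch below computes the axis-aligned bounding box of a rotated, scaled rectangle. This is ordinary geometry and not ICEpdf's implementation: PageBoxMath is a hypothetical name, the 612 x 792 page size is just an example, and the real method may also take the page's own rotation and boundary boxes into account.

```java
// Sketch only: illustrates bounding-box geometry, not the library's algorithm.
final class PageBoxMath {
    /**
     * Returns {width, height} of the axis-aligned box that encloses a
     * width x height rectangle after rotating it by rotationDegrees and
     * scaling it by zoom.
     */
    static float[] rotatedSize(float width, float height, float rotationDegrees, float zoom) {
        double radians = Math.toRadians(rotationDegrees);
        double cos = Math.abs(Math.cos(radians));
        double sin = Math.abs(Math.sin(radians));
        double w = (width * cos + height * sin) * zoom;
        double h = (width * sin + height * cos) * zoom;
        return new float[] {(float) w, (float) h};
    }

    public static void main(String[] args) {
        // Example page of 612 x 792 units, rotated a quarter turn and doubled in size:
        // width and height swap, then both are multiplied by the zoom factor.
        float[] size = rotatedSize(612f, 792f, 90f, 2f);
        System.out.println(size[0] + " x " + size[1]);
    }
}
```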

getDocumentOrigin

public java.lang.String getDocumentOrigin()
Returns the origin (filepath or URL) of this Document. This is the original location of the file, whereas the method getDocumentLocation returns the actual location of the file. The origin and location of the document will only be different if it was loaded from a URL or an input stream.

Returns:
file path or URL
See Also:
getDocumentLocation()

getDocumentLocation

public java.lang.String getDocumentLocation()
Returns the file location or URL of this Document. This location may be different from the file origin if the document was loaded from a URL or input stream. If the file was loaded from a URL or input stream the file location is the path to where the document content is cached.

Returns:
file path
See Also:
getDocumentOrigin()

getStateManager

public StateManager getStateManager()
Gets an instance of the document state manager, which stores references to objects that need to be written to file.

Returns:
stateManager instance for this document.

getNumberOfPages

public int getNumberOfPages()
Returns the total number of pages in this document.

Returns:
number of pages in the document

paintPage

public void paintPage(int pageNumber,
                      java.awt.Graphics g,
                      int renderHintType,
                      int pageBoundary,
                      float userRotation,
                      float userZoom)
Paints the contents of the given page number to the graphics context using the specified rotation, zoom, rendering hints and page boundary.

Parameters:
pageNumber - Page number to paint. The page number is zero-based.
g - graphics context to which the page content will be painted.
renderHintType - Constant specified by the GraphicsRenderingHints class. There are two possible entries, SCREEN and PRINT, each with configurable rendering hints settings.
pageBoundary - Constant specifying the page boundary to use when painting the page content.
userRotation - Rotation factor, in degrees, to be applied to the rendered page.
userZoom - Zoom factor to be applied to the rendered page.
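
One common use of a paintPage-style method is drawing into an off-screen image. The sketch below shows that pattern with a small stand-in interface so that it compiles without ICEpdf on the classpath. PagePainter and PaintToImage are hypothetical; PagePainter only mirrors the paintPage signature documented above, so a real Document could be supplied as the method reference document::paintPage. The integer constants passed in main are placeholders, and in practice the image size would come from getPageDimension.

```java
import java.awt.Color;
import java.awt.Graphics;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

// Sketch only: PaintToImage and PagePainter are hypothetical, not ICEpdf types.
final class PaintToImage {
    /** Stand-in with the same parameter list as the documented paintPage method. */
    interface PagePainter {
        void paintPage(int pageNumber, Graphics g, int renderHintType,
                       int pageBoundary, float userRotation, float userZoom);
    }

    /** Paints one page onto a white RGB image of the given size. */
    static BufferedImage render(PagePainter painter, int pageNumber, int width, int height,
                                int renderHintType, int pageBoundary,
                                float userRotation, float userZoom) {
        BufferedImage image = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = image.createGraphics();
        try {
            g.setColor(Color.WHITE);          // page background
            g.fillRect(0, 0, width, height);
            painter.paintPage(pageNumber, g, renderHintType, pageBoundary,
                              userRotation, userZoom);
        } finally {
            g.dispose();                      // always release the graphics context
        }
        return image;
    }

    public static void main(String[] args) {
        // Fake painter that draws a black square in place of real page content.
        PagePainter fake = (page, g, hint, boundary, rotation, zoom) -> {
            g.setColor(Color.BLACK);
            g.fillRect(10, 10, 50, 50);
        };
        // 0 is a placeholder for the render hint and page boundary constants.
        BufferedImage image = render(fake, 0, 200, 300, 0, 0, 0f, 1f);
        System.out.println(image.getWidth() + " x " + image.getHeight());
    }
}
```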

dispose

public void dispose()
Dispose of Document, freeing up all used resources.


writeToOutputStream

public long writeToOutputStream(java.io.OutputStream out)
                         throws java.io.IOException
Takes the internal PDF data, which may be in a file or in RAM, and writes it to the provided OutputStream. The OutputStream is not flushed or closed, in case this method's caller requires otherwise.

Parameters:
out - OutputStream to which the PDF file bytes are written.
Returns:
The length of the PDF file copied
Throws:
java.io.IOException - if there is some problem reading or writing the PDF data
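
Since the stream is neither flushed nor closed by writeToOutputStream, the caller decides when that happens. The sketch below wraps the target in a buffer, flushes it after the write, and leaves the underlying stream open. PdfSource and WriteAndFlush are hypothetical; PdfSource only mirrors the documented signature so the example compiles without ICEpdf, and a real Document could be passed as document::writeToOutputStream.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Sketch only: WriteAndFlush and PdfSource are hypothetical, not ICEpdf types.
final class WriteAndFlush {
    /** Stand-in with the same shape as long writeToOutputStream(OutputStream). */
    interface PdfSource {
        long writeToOutputStream(OutputStream out) throws IOException;
    }

    /** Writes through a buffer, flushes it, and leaves the target open for the caller. */
    static long copyTo(PdfSource source, OutputStream target) throws IOException {
        BufferedOutputStream buffered = new BufferedOutputStream(target);
        long written = source.writeToOutputStream(buffered);
        buffered.flush(); // the write itself does not flush, so do it here
        return written;
    }

    public static void main(String[] args) throws IOException {
        // Placeholder bytes standing in for real PDF data.
        byte[] fake = "%PDF-1.4 placeholder".getBytes(StandardCharsets.US_ASCII);
        PdfSource source = out -> {
            out.write(fake);
            return fake.length;
        };
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long reported = copyTo(source, sink);
        System.out.println(reported + " bytes reported, " + sink.size() + " bytes received");
    }
}
```

Without the flush, bytes could remain in the buffer and never reach the target stream.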

saveToOutputStream

public long saveToOutputStream(java.io.OutputStream out)
                        throws java.io.IOException
Copies the pre-existing PDF file, and appends an incremental update for any edits, to the specified OutputStream. For the pre-existing PDF content copying, writeToOutputStream(OutputStream out) is used.

Parameters:
out - OutputStream to which the PDF file bytes are written.
Returns:
The length of the PDF file saved
Throws:
java.io.IOException - if there is some problem reading or writing the PDF data

appendIncrementalUpdate

protected long appendIncrementalUpdate(java.io.OutputStream out,
                                       long documentLength)
                                throws java.io.IOException
If ICEpdf Pro is in use, appends an incremental update of any edits.

Parameters:
out - OutputStream to which the incremental update bytes are written.
documentLength - Length of the PDF file so far, before the incremental update.
Returns:
The number of bytes written for the incremental update.
Throws:
java.io.IOException

getPageImage

public java.awt.Image getPageImage(int pageNumber,
                                   int renderHintType,
                                   int pageBoundary,
                                   float userRotation,
                                   float userZoom)
Gets an Image of the specified page. The image size is automatically calculated given the page boundary, user rotation and zoom. The rendering quality is defined by GraphicsRenderingHints.SCREEN.

Parameters:
pageNumber - Page number of the page to capture the image rendering. The page number is zero-based.
renderHintType - Constant specified by the GraphicsRenderingHints class. There are two possible entries, SCREEN and PRINT, each with configurable rendering hints settings.
pageBoundary - Constant specifying the page boundary to use when painting the page content. Typically use Page.BOUNDARY_CROPBOX.
userRotation - Rotation factor, in degrees, to be applied to the rendered page. Arbitrary rotations are not currently supported for this method, so only the following values are valid: 0.0f, 90.0f, 180.0f, 270.0f.
userZoom - Zoom factor to be applied to the rendered page.
Returns:
an Image object of the current page.
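
Because getPageImage accepts only the four right-angle rotations listed above, it can help to check a rotation value before calling it. The sketch below is one way to do that. RightAngleRotation is a hypothetical helper, and folding values such as -90 or 450 into the accepted range is a convenience assumed here, not behaviour documented for ICEpdf.

```java
// Sketch only: RightAngleRotation is a hypothetical helper, not part of ICEpdf.
final class RightAngleRotation {
    /** True only for the values listed as valid for getPageImage. */
    static boolean isSupported(float userRotation) {
        return userRotation == 0.0f || userRotation == 90.0f
            || userRotation == 180.0f || userRotation == 270.0f;
    }

    /**
     * Folds any multiple of 90 degrees (including negative values and values
     * of 360 or more) into 0, 90, 180 or 270; rejects everything else.
     */
    static float normalize(float degrees) {
        float folded = degrees % 360.0f;
        if (folded < 0) {
            folded += 360.0f;
        }
        if (!isSupported(folded)) {
            throw new IllegalArgumentException("Not a multiple of 90 degrees: " + degrees);
        }
        return folded == 0.0f ? 0.0f : folded; // turns a possible -0.0f into 0.0f
    }

    public static void main(String[] args) {
        System.out.println(isSupported(45.0f));
        System.out.println(normalize(-90.0f));
    }
}
```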

getPageText

public PageText getPageText(int pageNumber)
Exposes a page's PageText object, which can be used to get text within the PDF document. PageText.toString() is the simplest way to get a page's text. This utility call does not parse the whole stream and is best suited for text extraction, as it is faster than getPageViewText(int).

Parameters:
pageNumber - Page number of the page on which text extraction will act. The page number is zero-based.
Returns:
the page's PageText data structure.
See Also:
getPageViewText(int)
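
Extracting the text of a whole document comes down to looping over the zero-based page numbers and calling toString() on each page's text object. The sketch below shows that loop with a generic function in place of the Document so it compiles without ICEpdf. TextDump and extractAll are hypothetical names; with a real Document the call would look like extractAll(document.getNumberOfPages(), document::getPageText). The null check is a defensive assumption, since this page does not say whether a page without text yields null.

```java
import java.util.function.IntFunction;

// Sketch only: TextDump is a hypothetical helper, not part of ICEpdf.
final class TextDump {
    /**
     * Concatenates the toString() of each page's text object, one page per line.
     * pageTextSource stands in for a method such as getPageText(int).
     */
    static String extractAll(int numberOfPages, IntFunction<?> pageTextSource) {
        StringBuilder text = new StringBuilder();
        for (int page = 0; page < numberOfPages; page++) { // page numbers are zero-based
            Object pageText = pageTextSource.apply(page);
            if (pageText != null) { // assumption: a page may have no text object
                text.append(pageText).append('\n');
            }
        }
        return text.toString();
    }

    public static void main(String[] args) {
        // Fake source that returns a label in place of real page text.
        System.out.print(extractAll(3, page -> "text of page " + page));
    }
}
```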

getPageViewText

public PageText getPageViewText(int pageNumber)
Exposes a page's PageText object, which can be used to get text within the PDF document. PageText.toString() is the simplest way to get a page's text. The PageText hierarchy can be used to search for selected text or to set text as highlighted.

Parameters:
pageNumber - Page number of the page on which text extraction will act. The page number is zero-based.
Returns:
the page's PageText data structure.

getSecurityManager

public SecurityManager getSecurityManager()
Gets the security manager for this document. If the document has no security manager, null is returned.

Returns:
security manager for document if available.

setSecurityCallback

public void setSecurityCallback(SecurityCallback securityCallback)
Sets the security callback to be used for this document. The security callback allows a mechanism for prompting a user for a password if the document is password protected.

Parameters:
securityCallback - a class which implements the SecurityCallback interface.

getInfo

public PInfo getInfo()
Gets the document's information as specified in the PTrailer in the document hierarchy.

Returns:
document information
See Also:
for more information.

getPageImages

public java.util.Vector getPageImages(int pageNumber)
Gets a vector of Images where each index represents an image inside the specified page. The images are returned in the size in which they were embedded in the PDF document, which may be different from the size displayed when the complete PDF page is rendered.

Parameters:
pageNumber - page number to act on. Zero-based page number.
Returns:
vector of Images inside the current page

getPageTree

public PageTree getPageTree()
Gets the Document Catalog's PageTree entry as specified by the Document hierarchy. The PageTree can be used to obtain detailed information about the Page objects that make up the document.

Returns:
PageTree specified by the document hierarchy.

getCatalog

public Catalog getCatalog()
Gets the Document's Catalog as specified by the Document hierarchy. The Catalog can be used to traverse the Document's hierarchy.

Returns:
document's Catalog object; null, if one does not exist.