org.xcmis.spi.utils
Class CmisPDFDocumentReader

java.lang.Object
  extended by org.exoplatform.container.component.BaseComponentPlugin
      extended by org.exoplatform.services.document.impl.BaseDocumentReader
          extended by org.xcmis.spi.utils.CmisPDFDocumentReader
All Implemented Interfaces:
org.exoplatform.container.component.ComponentPlugin, org.exoplatform.services.document.DocumentReader

public class CmisPDFDocumentReader
extends org.exoplatform.services.document.impl.BaseDocumentReader

Version:
$Id: exo-jboss-codetemplates.xml 34360 2009-07-22 23:58:59Z ksm $
Author:
Sergey Kabashnyuk

Field Summary
 
Fields inherited from class org.exoplatform.container.component.BaseComponentPlugin
desc, name
 
Constructor Summary
CmisPDFDocumentReader()
           
 
Method Summary
 String getContentAsText(InputStream is)
          Returns only the text from the PDF file content.
 String getContentAsText(InputStream is, String encoding)
           
 String[] getMimeTypes()
          Gets the application/pdf MIME type.
 Properties getProperties(InputStream is)
           
protected  Properties getPropertiesFromInfo(HashMap info)
          Extracts properties from the PDF Info hash map.
protected  Properties getPropertiesFromMetadata(byte[] metadata)
          Extracts properties from XMP XML.
 
Methods inherited from class org.exoplatform.container.component.BaseComponentPlugin
getDescription, getName, setDescription, setName
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

CmisPDFDocumentReader

public CmisPDFDocumentReader()
Method Detail

getMimeTypes

public String[] getMimeTypes()
Gets the application/pdf MIME type.

Returns:
The application/pdf MIME type.

getContentAsText

public String getContentAsText(InputStream is)
                        throws IOException,
                               org.exoplatform.services.document.DocumentReadException
Returns only the text from the PDF file content.

Parameters:
is - an input stream with the PDF file content.
Returns:
A string containing only the text of the file content.
Throws:
IOException - when an I/O error occurs
org.exoplatform.services.document.DocumentReadException - if a document reading error occurs

getContentAsText

public String getContentAsText(InputStream is,
                               String encoding)
                        throws IOException,
                               org.exoplatform.services.document.DocumentReadException
Throws:
IOException
org.exoplatform.services.document.DocumentReadException

getProperties

public Properties getProperties(InputStream is)
                         throws IOException,
                                org.exoplatform.services.document.DocumentReadException
Throws:
IOException
org.exoplatform.services.document.DocumentReadException

getPropertiesFromMetadata

protected Properties getPropertiesFromMetadata(byte[] metadata)
                                        throws IOException,
                                               org.exoplatform.services.document.DocumentReadException
Extracts properties from XMP XML.

Parameters:
metadata - the XMP XML as a byte array
Returns:
extracted properties
Throws:
org.exoplatform.services.document.DocumentReadException - if extracting fails
IOException
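
The following is a minimal sketch of how properties might be pulled out of an XMP packet passed in as a byte array. It is not the implementation of this class: the class name XmpPropertiesSketch, the method fromXmp, and the choice of Dublin Core elements (title, creator, description) are assumptions made for illustration, and only JDK XML APIs are used.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Properties;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

public class XmpPropertiesSketch {

    // Dublin Core namespace commonly used inside XMP packets.
    private static final String DC_NS = "http://purl.org/dc/elements/1.1/";

    /** Hypothetical helper: reads a few Dublin Core elements from XMP XML. */
    public static Properties fromXmp(byte[] metadata) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // required for getElementsByTagNameNS
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(new ByteArrayInputStream(metadata));

        Properties props = new Properties();
        for (String name : new String[] {"title", "creator", "description"}) {
            NodeList nodes = doc.getElementsByTagNameNS(DC_NS, name);
            if (nodes.getLength() > 0) {
                // Text content flattens the nested rdf:Alt / rdf:Seq / rdf:li wrappers.
                String text = nodes.item(0).getTextContent().trim();
                if (!text.isEmpty()) {
                    props.setProperty(name, text);
                }
            }
        }
        return props;
    }

    public static void main(String[] args) throws Exception {
        String xmp =
            "<x:xmpmeta xmlns:x=\"adobe:ns:meta/\">"
            + "<rdf:RDF xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\">"
            + "<rdf:Description rdf:about=\"\" xmlns:dc=\"http://purl.org/dc/elements/1.1/\">"
            + "<dc:title><rdf:Alt><rdf:li xml:lang=\"x-default\">Annual Report</rdf:li></rdf:Alt></dc:title>"
            + "<dc:creator><rdf:Seq><rdf:li>Jane Doe</rdf:li></rdf:Seq></dc:creator>"
            + "</rdf:Description></rdf:RDF></x:xmpmeta>";
        Properties props = fromXmp(xmp.getBytes(StandardCharsets.UTF_8));
        System.out.println(props.getProperty("title"));
        System.out.println(props.getProperty("creator"));
    }
}
```

The sketch keeps only the first occurrence of each element and skips elements that are absent or empty, so the returned Properties never holds blank values.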

getPropertiesFromInfo

protected Properties getPropertiesFromInfo(HashMap info)
                                    throws IOException
Extracts properties from the PDF Info hash map.

Parameters:
info - the PDF Info hash map
Returns:
Extracted properties
Throws:
IOException - if extracting fails
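
As an illustration of the kind of mapping this method performs, here is a minimal sketch that copies entries of a PDF document information dictionary into a Properties object. The class InfoToPropertiesSketch and the method fromInfo are hypothetical, the map is assumed to hold String keys and values, and the key list is a common subset of the Info dictionary rather than what this class actually reads.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

public class InfoToPropertiesSketch {

    // Common keys of the PDF document information dictionary (not exhaustive).
    private static final String[] INFO_KEYS = {
        "Title", "Author", "Subject", "Keywords",
        "Creator", "Producer", "CreationDate", "ModDate"
    };

    /** Hypothetical helper: copies known, non-blank Info entries into Properties. */
    public static Properties fromInfo(Map<String, String> info) {
        Properties props = new Properties();
        for (String key : INFO_KEYS) {
            String value = info.get(key);
            if (value != null && !value.trim().isEmpty()) {
                props.setProperty(key, value.trim());
            }
        }
        return props;
    }

    public static void main(String[] args) {
        Map<String, String> info = new HashMap<>();
        info.put("Title", "Annual Report");
        info.put("Author", " Jane Doe ");
        info.put("CustomKey", "ignored"); // not in INFO_KEYS, so it is skipped
        Properties props = fromInfo(info);
        System.out.println(props.getProperty("Title"));
        System.out.println(props.getProperty("Author"));
    }
}
```

Filtering against a fixed key list keeps unexpected entries out of the result; a real reader may instead copy every entry or rename keys to its own property names.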


Copyright © 2010 eXo Platform SAS. All Rights Reserved.