You can create your own DocumentReader in two ways.
Old-Style Document Reader:
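Extend BaseDocumentReader:
import org.exoplatform.services.document.impl.BaseDocumentReader;

public class MyDocumentReader extends BaseDocumentReader
{
   // MIME types handled by this reader
   public String[] getMimeTypes()
   {
      return new String[]{"mymimetype"};
   }
   ...
}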
Register it as a component-plugin:
<component-plugin>
   <name>my.DocumentReader</name>
   <set-method>addDocumentReader</set-method>
   <type>com.mycompany.document.MyDocumentReader</type>
   <description>to read my own file format</description>
</component-plugin>
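The relationship between getMimeTypes() and the addDocumentReader set-method can be illustrated with a small standalone sketch. Everything in it is a hypothetical stand-in written for illustration: SketchDocumentReader, MySketchReader, and SketchReaderRegistry are not eXo classes, and the bracket-stripping "file format" is invented. It only shows the general idea that a registered reader is looked up by the MIME types it declares.

```java
import java.util.HashMap;
import java.util.Map;

// Demo entry point: registers one reader and resolves it by MIME type.
class DocumentReaderSketch {
    public static void main(String[] args) {
        SketchReaderRegistry registry = new SketchReaderRegistry();
        registry.addDocumentReader(new MySketchReader());

        SketchDocumentReader reader = registry.getReader("mymimetype");
        System.out.println(reader.getContentAsText("[hello]")); // prints: hello
    }
}

// Hypothetical stand-in for the reader contract; NOT the real BaseDocumentReader.
interface SketchDocumentReader {
    String[] getMimeTypes();

    String getContentAsText(String rawContent);
}

// Plays the role of MyDocumentReader from the text above.
class MySketchReader implements SketchDocumentReader {
    @Override
    public String[] getMimeTypes() {
        return new String[] {"mymimetype"};
    }

    @Override
    public String getContentAsText(String rawContent) {
        // Invented format for illustration: text wrapped in square brackets.
        return rawContent.replace("[", "").replace("]", "").trim();
    }
}

// Hypothetical registry; the method name mirrors the <set-method> value.
class SketchReaderRegistry {
    private final Map<String, SketchDocumentReader> readers = new HashMap<>();

    void addDocumentReader(SketchDocumentReader reader) {
        // Index the reader under every MIME type it declares.
        for (String mimeType : reader.getMimeTypes()) {
            readers.put(mimeType, reader);
        }
    }

    SketchDocumentReader getReader(String mimeType) {
        SketchDocumentReader reader = readers.get(mimeType);
        if (reader == null) {
            throw new IllegalArgumentException("No reader registered for " + mimeType);
        }
        return reader;
    }
}
```

In this sketch, a reader that returns several MIME types is simply indexed once per type, and asking for an unregistered type fails with an exception. How the real service handles either case is not stated in this section.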
Tika Parser:
Implement Parser:
import org.apache.tika.parser.Parser;

public class MyParser implements Parser
{
   ...
}
Register it in tika-config.xml:
<parser name="parse-mydocument" class="com.mycompany.document.MyParser">
   <mime>mymimetype</mime>
</parser>