public class WhitespaceTokenizer
extends org.apache.lucene.analysis.CharTokenizer
You must specify the required Version compatibility when creating a
WhitespaceTokenizer:
CharTokenizer uses an int-based API to normalize and
detect token characters. See CharTokenizer.isTokenChar(int) and
CharTokenizer.normalize(int) for details.

| Constructor and Description |
|---|
| WhitespaceTokenizer(org.apache.lucene.util.Version matchVersion, org.apache.lucene.util.AttributeSource.AttributeFactory factory, Reader in): Construct a new WhitespaceTokenizer using a given AttributeSource.AttributeFactory. |
| WhitespaceTokenizer(org.apache.lucene.util.Version matchVersion, org.apache.lucene.util.AttributeSource source, Reader in): Construct a new WhitespaceTokenizer using a given AttributeSource. |
| WhitespaceTokenizer(org.apache.lucene.util.Version matchVersion, Reader in): Construct a new WhitespaceTokenizer. |

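To illustrate why an int-based API matters for normalizing and detecting token characters, here is a minimal plain-Java sketch. It does not use Lucene; the class `CodePointSketch` and its methods are hypothetical names chosen for this illustration. It shows that a supplementary character occupies two `char` units but one code point, and that a per-code-point step (in the spirit of `normalize(int)`) can change its case while a per-`char` step cannot.

```java
final class CodePointSketch {

    // Number of units a char-based loop would inspect.
    static int charUnits(String s) {
        return s.length();
    }

    // Number of units an int-based (code point) loop would inspect.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    // Hypothetical normalize(int)-style step: lower-case each code point.
    static String lowerCaseByCodePoint(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().map(Character::toLowerCase).forEach(sb::appendCodePoint);
        return sb.toString();
    }

    public static void main(String[] args) {
        String deseret = "\uD801\uDC00"; // U+10400, a supplementary upper-case letter
        System.out.println(charUnits(deseret));   // two char units
        System.out.println(codePoints(deseret));  // one code point
        // The surrogate half on its own has no lower-case mapping.
        System.out.println(Character.toLowerCase('\uD801') == '\uD801');
        System.out.println(lowerCaseByCodePoint(deseret).equals("\uD801\uDC28"));
    }
}
```

The sketch only demonstrates the code point handling in `java.lang.Character`; how CharTokenizer applies it internally is described in its own documentation.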
| Modifier and Type | Method and Description |
|---|---|
| protected boolean | isTokenChar(int c): Collects only characters which do not satisfy Character.isWhitespace(int). |

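The rule stated for isTokenChar(int) can be sketched in plain Java without Lucene. The class `WhitespaceSplitSketch` below is a hypothetical stand-in, not the library's implementation: it assumes that tokens are maximal runs of code points for which `Character.isWhitespace(int)` is false, and it ignores everything else a real tokenizer does (attributes, offsets, normalization, token length limits).

```java
import java.util.ArrayList;
import java.util.List;

final class WhitespaceSplitSketch {

    // Mirrors the documented predicate: a code point belongs to a token
    // exactly when Character.isWhitespace(int) is false for it.
    static boolean isTokenChar(int c) {
        return !Character.isWhitespace(c);
    }

    // Collects maximal runs of token characters, iterating by code point.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (isTokenChar(cp)) {
                current.appendCodePoint(cp);
            } else if (current.length() > 0) {
                // Whitespace ends the token collected so far.
                tokens.add(current.toString());
                current.setLength(0);
            }
            i += Character.charCount(cp);
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("  quick\tbrown\n fox ")); // prints [quick, brown, fox]
    }
}
```

One consequence of relying on `Character.isWhitespace(int)`: the non-breaking space U+00A0 is not whitespace under that method, so in this sketch `"a\u00A0b"` stays a single token.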
Methods inherited from class org.apache.lucene.analysis.CharTokenizer: end, incrementToken, isTokenChar, normalize, normalize, reset

Methods inherited from class org.apache.lucene.util.AttributeSource: addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, restoreState, toString

public WhitespaceTokenizer(org.apache.lucene.util.Version matchVersion,
Reader in)
Construct a new WhitespaceTokenizer.
Parameters:
in - the input to split up into tokens

public WhitespaceTokenizer(org.apache.lucene.util.Version matchVersion,
org.apache.lucene.util.AttributeSource source,
Reader in)
Construct a new WhitespaceTokenizer using a given AttributeSource.
Parameters:
matchVersion - Lucene version to match
source - the attribute source to use for this Tokenizer
in - the input to split up into tokens

public WhitespaceTokenizer(org.apache.lucene.util.Version matchVersion,
org.apache.lucene.util.AttributeSource.AttributeFactory factory,
Reader in)
Construct a new WhitespaceTokenizer using a given AttributeSource.AttributeFactory.
Parameters:
matchVersion - Lucene version to match; see the version compatibility note above
factory - the attribute factory to use for this Tokenizer
in - the input to split up into tokens

protected boolean isTokenChar(int c)
Collects only characters which do not satisfy Character.isWhitespace(int).
isTokenChar in class org.apache.lucene.analysis.CharTokenizer

Copyright © 2017 eXo Platform SAS. All Rights Reserved.