org.crosswire.jsword.index.lucene.analysis
Class SimpleLuceneAnalyzer
java.lang.Object
org.apache.lucene.analysis.Analyzer
org.crosswire.jsword.index.lucene.analysis.AbstractBookAnalyzer
org.crosswire.jsword.index.lucene.analysis.SimpleLuceneAnalyzer
public class SimpleLuceneAnalyzer
- extends AbstractBookAnalyzer
A simple analyzer that provides the same function as org.apache.lucene.analysis.SimpleAnalyzer.
This is intended to be the default analyzer for natural language fields.
In addition, it normalizes diacritics (replaces accented characters with their unaccented equivalents) for ISO 8859-1 languages.
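The behaviour described above (letter-based tokenizing, lowercasing, then diacritic removal) can be illustrated with a small stand-alone sketch. It uses only the JDK and is not this class's implementation: the class name DiacriticSketch and its methods are hypothetical, and java.text.Normalizer stands in for whatever Lucene filter the analyzer actually applies. Unlike a dedicated ISO 8859-1 accent filter, this approach leaves characters without a canonical decomposition, such as ligatures, unchanged.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only; not part of JSword or Lucene.
public class DiacriticSketch {

    // Decompose accented characters (NFD), then drop the combining marks.
    static String stripDiacritics(String s) {
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }

    // Split on non-letters, lowercase each token, then strip diacritics.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetter(c)) {
                current.append(c);
            } else if (current.length() > 0) {
                tokens.add(normalizeToken(current.toString()));
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(normalizeToken(current.toString()));
        }
        return tokens;
    }

    private static String normalizeToken(String token) {
        return stripDiacritics(token.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        // Unicode escapes keep the source file ASCII-only.
        System.out.println(analyze("\u00C9lan \u00E9t\u00E9, CAF\u00C9!"));
        // prints [elan, ete, cafe]
    }
}
```

Tokenizing before decomposition matters in this sketch: combining marks are not letters, so decomposing first would split a word at each accent.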
Note: The next Lucene release (after 2.2.0) will offer a major performance improvement through the method:
public TokenStream reusableTokenStream(String fieldName, Reader reader)
This class should use that method.
Ref: https://issues.apache.org/jira/browse/LUCENE-969
- Author:
- Sijo Cherian [sijocherian at yahoo dot com]
- See Also:
for license details.
The copyright to this program is held by its authors.
Methods inherited from class org.apache.lucene.analysis.Analyzer
getPositionIncrementGap, getPreviousTokenStream, reusableTokenStream, setPreviousTokenStream
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
isoLatin1Langs
private static Pattern isoLatin1Langs
SimpleLuceneAnalyzer
public SimpleLuceneAnalyzer()
tokenStream
public org.apache.lucene.analysis.TokenStream tokenStream(String fieldName,
Reader reader)
- Specified by:
- tokenStream in class org.apache.lucene.analysis.Analyzer