Package org.crosswire.jsword.index.lucene.analysis

Implementation of various Lucene analyzers, providing language dependent customizations.


Class Summary
AbstractBookAnalyzer Base class for Analyzers.
AbstractBookTokenFilter An AbstractBookTokenFilter ties a Lucene TokenFilter to a Book.
AnalyzerFactory A factory creating the appropriate Analyzer for natural language analysis of text for Lucene Indexing and Query Parsing.
ArabicLuceneAnalyzer An Analyzer whose TokenStream is built from an ArabicLetterTokenizer filtered with LowerCaseFilter, ArabicNormalizationFilter, ArabicStemFilter (optional) and Arabic StopFilter (optional).
ChineseLuceneAnalyzer Uses org.apache.lucene.analysis.cn.ChineseAnalyzer. Analysis: ChineseTokenizer, ChineseFilter, StopFilter. Stemming is not implemented yet. Note: org.apache.lucene.analysis.cjk.CJKAnalyzer takes an overlapping two-character tokenization approach, which leads to a larger index size.
ConfigurableSnowballAnalyzer An Analyzer whose TokenStream is built from a LowerCaseTokenizer filtered with SnowballFilter (optional) and StopFilter (optional). Default behavior: stemming is done, stop words are not removed. A Snowball stemmer is configured according to the language of the Book.
CzechLuceneAnalyzer An Analyzer whose TokenStream is built from a LowerCaseTokenizer filtered with StopFilter (optional).
EnglishLuceneAnalyzer English Analyzer that works like Lucene's SimpleAnalyzer plus stemming.
GermanLuceneAnalyzer Based on Lucene's GermanAnalyzer.
GreekLuceneAnalyzer Uses org.apache.lucene.analysis.el.GreekAnalyzer to do lowercasing and stop word removal (off by default).
KeyAnalyzer A specialized analyzer that normalizes JSword keys.
KeyFilter A KeyFilter normalizes Key.
LuceneAnalyzer A specialized analyzer for Books that analyzes different fields differently.
MorphologyAnalyzer Robinson Morphological Codes are separated by whitespace.
PersianLuceneAnalyzer An Analyzer whose TokenStream is built from an ArabicLetterTokenizer filtered with LowerCaseFilter, ArabicNormalizationFilter, PersianNormalizationFilter and Persian StopFilter (optional).
SavedStreams SavedStreams is used to make reusable Lucene analyzers.
SimpleLuceneAnalyzer Simple Analyzer providing the same function as org.apache.lucene.analysis.SimpleAnalyzer. This is intended to be the default analyzer for natural language fields.
SmartChineseLuceneAnalyzer A simple wrapper for SmartChineseAnalyzer, which takes an overlapping two-character tokenization approach that leads to a larger index size, like org.apache.lucene.analysis.cjk.CJKAnalyzer.
StrongsNumberAnalyzer A specialized analyzer that normalizes Strong's Numbers.
StrongsNumberFilter A StrongsNumberFilter normalizes Strong's Numbers.
ThaiLuceneAnalyzer Tokenization using ThaiWordFilter.
XRefAnalyzer A specialized analyzer that normalizes Cross References.
XRefFilter An XRefFilter normalizes OSISrefs.
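
Several of the analyzers above are described as a LowerCaseTokenizer followed by optional filters. The following is a minimal plain-Java sketch of that kind of chain, written without any Lucene dependency. The class AnalyzerChainSketch, its methods, the stop word list and the toy stemmer are all hypothetical and exist only for illustration; the order of the stop and stem steps is an assumption, and the real classes in this package may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.UnaryOperator;

// Hypothetical illustration only; not part of JSword or Lucene.
class AnalyzerChainSketch {

    // Split on anything that is not a letter and lowercase each token,
    // which is the behavior a lowercasing letter tokenizer provides.
    static List<String> lowerCaseTokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (Character.isLetter(cp)) {
                current.appendCodePoint(Character.toLowerCase(cp));
            } else if (current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
            i += Character.charCount(cp);
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    // Both filters are optional: pass null to skip stop word removal or stemming.
    // Assumption: stop words are removed before stemming.
    static List<String> analyze(String text, Set<String> stopWords, UnaryOperator<String> stemmer) {
        List<String> out = new ArrayList<>();
        for (String token : lowerCaseTokenize(text)) {
            if (stopWords != null && stopWords.contains(token)) {
                continue;
            }
            out.add(stemmer != null ? stemmer.apply(token) : token);
        }
        return out;
    }

    public static void main(String[] args) {
        Set<String> stops = new HashSet<>(Arrays.asList("the", "and"));
        // Toy stemmer that strips a trailing "s"; a real Snowball stemmer is far more involved.
        UnaryOperator<String> toyStemmer =
                t -> t.endsWith("s") ? t.substring(0, t.length() - 1) : t;

        System.out.println(analyze("The heavens and the earth", null, null));        // [the, heavens, and, the, earth]
        System.out.println(analyze("The heavens and the earth", stops, null));       // [heavens, earth]
        System.out.println(analyze("The heavens and the earth", stops, toyStemmer)); // [heaven, earth]
    }
}
```

Passing null for a filter mirrors the "(optional)" markers in the class descriptions: the same chain serves languages with and without a stop word list or stemmer.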
 

Package org.crosswire.jsword.index.lucene.analysis Description

Implementation of various Lucene analyzers, providing language dependent customizations.
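
One language dependent choice mentioned in the class summary is how to tokenize text written without spaces between words. The sketch below contrasts one token per character with overlapping two-character tokens, the approach the summary attributes to CJKAnalyzer. It is plain Java with no Lucene dependency; the class BigramSketch and its methods are hypothetical names used only for illustration and do not reproduce any Lucene tokenizer exactly.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of JSword or Lucene.
class BigramSketch {

    // One token per character (code point).
    static List<String> unigrams(String text) {
        List<String> out = new ArrayList<>();
        for (int cp : text.codePoints().toArray()) {
            out.add(new String(Character.toChars(cp)));
        }
        return out;
    }

    // Every overlapping pair of adjacent characters; a lone character is kept as is.
    static List<String> bigrams(String text) {
        int[] cps = text.codePoints().toArray();
        List<String> out = new ArrayList<>();
        if (cps.length == 1) {
            out.add(text);
            return out;
        }
        for (int i = 0; i + 1 < cps.length; i++) {
            out.add(new String(cps, i, 2));
        }
        return out;
    }

    public static void main(String[] args) {
        String text = "太初有道";
        System.out.println(unigrams(text)); // [太, 初, 有, 道]
        System.out.println(bigrams(text));  // [太初, 初有, 有道]
    }
}
```

Each character other than the first and last appears in two bigram tokens, and the set of distinct character pairs in a corpus is much larger than the set of distinct characters, which is one way to see why the overlapping approach tends to produce a larger index.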


Copyright © 2003-2015