[sword-devel] search failing in Hebrew modules
ransom1982 at gmail.com
Thu Jul 30 19:49:51 MST 2009
> Couple of thoughts
> Assuming the search is a Lucene search.
Results are nearly the same whether it's indexed or not.
> Unicode can have multiple possible representations (byte sequences) for a
> single decorated character. Search will work only if the request and index
> use the same representation.
> The index has a single representation of the text. The analyzer assumes
> English as input and applies all kinds of transforms that may not be
> appropriate for Hebrew.
> When a search is performed the same analyzer is used to transform the search
> request. Generally this is sufficient to ensure that the search will work.
> If the search request is not transformed first into the same Unicode
> representation, then the search will fail as it will not form the stored
> byte sequence. Typically, copying displayed text into a search request
> will work. Typically, typed input will fail. It is just too difficult to
> type exactly the stored text.
> IIRC, SWORD will use the current filters (e.g. Remove accents) in building
> the index. Searches that don't apply the same filters to the request as used
> to build the index will fail.
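To make the point about multiple representations concrete, here is a minimal Python sketch. It uses only the standard unicodedata module and is not how SWORD or Lucene do it; the constant and function names are made up for illustration. It shows two ways a typed request can differ byte-for-byte from stored text (a presentation-form code point versus letter plus mark, and marks typed in a different order) and that normalizing both sides to the same form makes them compare equal.

```python
import unicodedata

# Bet with dagesh, written two ways that render identically.
PRECOMPOSED = "\uFB31"        # HEBREW LETTER BET WITH DAGESH (presentation form)
DECOMPOSED = "\u05D1\u05BC"   # HEBREW LETTER BET + HEBREW POINT DAGESH

# The same pointed letter with its two marks typed in different orders.
DAGESH_FIRST = "\u05D1\u05BC\u05B7"  # bet, dagesh, patah
PATAH_FIRST = "\u05D1\u05B7\u05BC"   # bet, patah, dagesh


def canonical(text: str) -> str:
    """Return text in Unicode Normalization Form C (illustrative helper)."""
    return unicodedata.normalize("NFC", text)


def same_after_normalizing(a: str, b: str) -> bool:
    """Compare two strings after putting both into the same normal form."""
    return canonical(a) == canonical(b)


if __name__ == "__main__":
    for a, b in [(PRECOMPOSED, DECOMPOSED), (DAGESH_FIRST, PATAH_FIRST)]:
        raw_equal = a.encode("utf-8") == b.encode("utf-8")
        print("raw bytes equal:", raw_equal,
              "| equal after NFC:", same_after_normalizing(a, b))
```

The same idea applies whichever form is chosen, as long as the index and the request both go through it.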
So is there a way to index a module with and without vowels? Or search
(non-indexed) that way? That seems to be a common request for Semitic
languages.
btw, diatheke strangely works exactly the opposite. Searches for words
without vowels work fine, whereas searches with vowels don't work at all.
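For what it's worth, here is a rough Python sketch of what "search with and without vowels" could look like. This is an assumption about one possible approach, not SWORD's or diatheke's actual behaviour, and the names strip_points, has_points and matches are hypothetical. The idea: if the request carries points, compare the normalized pointed forms; if it carries none, compare consonants only. Dropping every nonspacing mark (category Mn) removes cantillation accents along with the vowel points.

```python
import unicodedata

# "bereshit" with and without points, written as escapes to avoid ambiguity.
BERESHIT_POINTED = ("\u05D1\u05B0\u05BC\u05E8\u05B5\u05D0"
                    "\u05E9\u05C1\u05B4\u05D9\u05EA")
BERESHIT_PLAIN = "\u05D1\u05E8\u05D0\u05E9\u05D9\u05EA"

# Two different words that share the consonants dalet-bet-resh.
DAVAR = "\u05D3\u05B8\u05D1\u05B8\u05E8"
DIBBER = "\u05D3\u05B4\u05D1\u05B5\u05E8"


def strip_points(text: str) -> str:
    """Decompose, then drop all nonspacing marks (points and accents)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed
                   if unicodedata.category(ch) != "Mn")


def has_points(text: str) -> bool:
    """True if the text contains at least one nonspacing mark."""
    return strip_points(text) != unicodedata.normalize("NFD", text)


def matches(query: str, word: str) -> bool:
    """Pointed queries must match exactly; unpointed ones match consonants."""
    if has_points(query):
        return (unicodedata.normalize("NFC", query)
                == unicodedata.normalize("NFC", word))
    return strip_points(query) == strip_points(word)


if __name__ == "__main__":
    print(matches(BERESHIT_PLAIN, BERESHIT_POINTED))  # unpointed request
    print(matches(DAVAR, DIBBER))                     # pointed, different vowels
    print(matches(strip_points(DAVAR), DIBBER))       # consonants only
```

An index built this way would need to store both the pointed and the stripped form of each word, which is the cost of supporting both kinds of request.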