[sword-devel] French ligatures in Louis SÉGOND’s text
Troy A. Griffitts
scribe at crosswire.org
Mon Jul 16 15:57:34 MST 2007
Just a quick note. Our Lucene indexing code does call all our strip
filters. The solution and example I provided in my last email was using
Chris Little <chrislit at crosswire.org> wrote:
>DM Smith wrote:
>> Doesn't ICU have locale sensitive decomposition (or transliteration)?
>> If it does then why can't we use the language of the module to set
>> the locale then decompose. This is what we are planning to do for
>> JSword (it has been on the todo list for years).
>I don't see anything like this in ICU. I couldn't find anything in the
>API docs and there's nothing in the locale files themselves.
>I think our best option may be to tag words on a per module basis with
>alternative forms and then index the forms as alternates with Lucene, as
>your last post suggested. For non-Lucene searches we can normalize the
>text & search strings via the strip filters as Troy suggests.
>Someone else would have to provide the code side of things, but in terms
>of markup, I think we just want to do something along the lines of:
>And the strip filter (for non-Lucene searches) will just replace that
>sword-devel mailing list: sword-devel at crosswire.org
>Instructions to unsubscribe/change your settings at above page
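
For illustration, here is a minimal Python sketch of the strip-filter style normalization discussed in the quoted message for non-Lucene searches: both the module text and the search string are reduced to the same plain form before comparison. This is not SWORD's actual filter code. The names LIGATURES, strip_for_search and matches are hypothetical, and the ligature table is an assumption that covers only the French œ/æ case. The explicit table is there because Unicode normalization alone does not expand these ligatures.

```python
import unicodedata

# Hypothetical ligature table (an assumption for this sketch, not SWORD data).
# Unicode defines no decomposition for these characters, so NFD/NFKD
# leave them untouched and they have to be expanded explicitly.
LIGATURES = {"œ": "oe", "Œ": "OE", "æ": "ae", "Æ": "AE"}


def strip_for_search(text):
    """Expand ligatures, then drop combining accents (É -> E)."""
    expanded = "".join(LIGATURES.get(ch, ch) for ch in text)
    decomposed = unicodedata.normalize("NFD", expanded)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def matches(query, verse):
    """Case-insensitive substring match on the stripped forms."""
    return strip_for_search(query).casefold() in strip_for_search(verse).casefold()
```

With this, a search for "coeur" would find a verse containing "cœur", and "segond" would match "SÉGOND", because the same stripping is applied to both sides of the comparison.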