[sword-devel] Accented Greek Texts

Troy A. Griffitts scribe at crosswire.org
Tue Sep 18 14:53:27 MST 2007


W3C has decided on NFC (mostly).  Here is a good FAQ item:
http://unicode.org/faq/normalization.html#7

I think we're unanimous on this one.  Let's go with NFC.
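
For anyone unsure what picking NFC means in practice, here is a minimal
sketch in Python using the standard unicodedata module.  It is only an
illustration, not anything from osis2mod (which would use ICU); the helper
name to_nfc is made up for this example.

```python
import unicodedata

def to_nfc(text: str) -> str:
    """Return text in Unicode Normalization Form C (composed)."""
    return unicodedata.normalize("NFC", text)

# alpha followed by a combining acute accent: two code points (NFD style)
decomposed = "\u03b1\u0301"
composed = to_nfc(decomposed)

# NFC folds the pair into one precomposed code point, U+03AC
print(len(decomposed), len(composed))

# U+1F71 (alpha with oxia) is canonically equivalent to U+03AC,
# so NFC maps it to the same code point as well
print(to_nfc("\u1f71") == composed)
```

The shorter composed form is the "bit more compact" point made in the
quoted message below.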

Regarding Lucene indexing: currently we strip all accents, and also critical
transcription markings, before indexing.  When the user searches in these
texts, we apply the same filters to their search string.

Pertinent filters:

http://crosswire.org/svn/sword/trunk/src/modules/filters/papyriplain.cpp
http://crosswire.org/svn/sword/trunk/src/modules/filters/utf8greekaccents.cpp
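
The idea of filtering both the indexed text and the search string can be
sketched roughly as follows.  This is a Python illustration under stated
assumptions, not the logic of the C++ filters linked above: it removes
every combining mark (Unicode category Mn) after decomposing, which may
strip more or less than the real filters do, and it ignores the papyri
transcription markings entirely.  The names strip_accents, build_index and
search, and the toy dictionary index, are hypothetical stand-ins for the
filters and the Lucene index.

```python
import unicodedata

def strip_accents(text: str) -> str:
    """Remove combining marks (accents, breathings, iota subscript)."""
    # Decompose so each diacritic becomes its own combining code point
    decomposed = unicodedata.normalize("NFD", text)
    # Drop the nonspacing marks, keep the base letters
    stripped = "".join(
        ch for ch in decomposed if unicodedata.category(ch) != "Mn"
    )
    # Recompose so the result is in NFC like the rest of the text
    return unicodedata.normalize("NFC", stripped)

def build_index(verses: dict) -> dict:
    """Toy inverted index: filtered word -> set of verse keys."""
    index = {}
    for key, text in verses.items():
        for word in strip_accents(text).split():
            index.setdefault(word, set()).add(key)
    return index

def search(index: dict, query: str) -> set:
    """Apply the same filter to the query before looking it up."""
    return index.get(strip_accents(query), set())

verses = {"John 1:1": "ἐν ἀρχῇ ἦν ὁ λόγος"}
index = build_index(verses)

# Accented and unaccented queries find the same verse
print(search(index, "λόγος"))
print(search(index, "λογος"))
```

Because the same function runs on both sides, a user does not need to
type the accents exactly as they appear in the module for a search to hit.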



Eeli Kaikkonen <eekaikko at mail.student.oulu.fi> wrote: 
>On Tue, 18 Sep 2007, DM Smith wrote:
>> > If we allow variation, yes. But I would suggest we just pick a
>> > normalization (NFD or NFC) and stick with it for all modules.
>> >
>>
>> So would I. I vote for NFC as it is a bit more compact. And these
>> modules would work as is:)
>
>I agree.
>
>> I don't think that requiring ICU for osis2mod is onerous. After all it
>> is just a utility and not a front end.
>
>This one too.
>
>
>  Yours,
>        Eeli Kaikkonen (Mr.), Oulu, Finland
>        e-mail: eekaikko at mailx.studentx.oulux.fix (with no x)
>
>_______________________________________________
>sword-devel mailing list: sword-devel at crosswire.org
>http://www.crosswire.org/mailman/listinfo/sword-devel
>Instructions to unsubscribe/change your settings at above page
>
>




