[sword-devel] Re: Westcott-Hort

Costas I. Stergiou sword-devel@crosswire.org
Tue, 6 Apr 2004 09:56:25 +0300 (EET DST)


> I'm not sure I understand all of this, but practically, it's much less
> important to speedily process these characters when displaying, versus
> searching.  I would personally like decomposed characters stored for
> less processing during a scan of the text.  But you guys are the experts.

Hi Troy,
actually, the link that Chris sent yesterday proposes using precomposed
characters, for many reasons. First, precomposed chars take about 30% less
space. Also, most systems prefer them for display (the doc argues many
more reasons). I don't think searching is affected by either of these
forms. Anyway, a simple ICU filter can convert either way on the fly.
This is just the final touch to the normalization procedure...
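
To make the point about the two forms concrete, here is a minimal sketch.
It uses Python's standard unicodedata module as a stand-in for the ICU
filter (an assumption made only so the example is self-contained; it is
not what SWORD uses). The helper names to_nfc, to_nfd and contains are
hypothetical, and the sample word is just an illustration. It shows that
the precomposed (NFC) and decomposed (NFD) spellings differ in length but
convert into each other, and that a search works in either form as long
as the text and the search key are normalized the same way.

```python
import unicodedata


def to_nfc(text: str) -> str:
    """Return the precomposed (NFC) form of text."""
    return unicodedata.normalize("NFC", text)


def to_nfd(text: str) -> str:
    """Return the decomposed (NFD) form of text."""
    return unicodedata.normalize("NFD", text)


def contains(haystack: str, needle: str) -> bool:
    """Substring search that ignores which normalization form was stored."""
    return to_nfc(needle) in to_nfc(haystack)


# Sample word "anthropos"; its first letter is U+1F04 from the Greek
# Extended block (alpha with psili and oxia).
word_nfc = to_nfc("\u1f04\u03bd\u03b8\u03c1\u03c9\u03c0\u03bf\u03c2")
word_nfd = to_nfd(word_nfc)

# The decomposed form spells the first letter as base letter + two marks.
print(len(word_nfc), len(word_nfd))
print(len(word_nfc.encode("utf-8")), len(word_nfd.encode("utf-8")))

# A raw comparison fails across forms, a normalized one does not.
print(word_nfc == word_nfd)
print(contains(word_nfd, word_nfc))
```

The size difference on a single word is not meant to reproduce the 30%
figure from the doc; how much is saved depends on the text and on the
encoding.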

Costas


> 	-Troy.
>
>
>
> Chris Little wrote:
> > Costas I. Stergiou wrote:
> >
> >> Actually, the NFC standard is all about precomposed chars. All the
> >> extended
> >> greek chars are exactly this: the (pre-composed) greek letters with the
> >> diacriticals. I use icu4j for all my tests & conversions and when
> >> asking to take
> >> a text and convert it to NFC it does use the extended greek chars. So, my
> >> almost certain answer, is yes (extended greek is NFC)
> >
> >
> > Costas,
> >
> > It sounds like you know what you're doing.  My only concern was that the
> > Greek Extended area was categorized as compatibility or presentation,
> > in which case they might not be canonically equivalent to decomposed
> > codepoint sequences.  But if you're doing NFC normalization already,
> > then obviously they are.
> >
> > --Chris
> >
> > _______________________________________________
> > sword-devel mailing list
> > sword-devel@crosswire.org
> > http://www.crosswire.org/mailman/listinfo/sword-devel