[sword-devel] DevTools:ICU & Normalization?

David Haslam dfhmch at googlemail.com
Fri Oct 28 08:28:16 MST 2011

FYI.  As a result of my posts in their forum arising from this topic,
DataMystic have just released v8.9.8 of TextPipe.

The release notes include:

* Updated internal PCRE (Pattern Matching) engine to v8.13 and support for
Unicode 6.0.0.
* Updated Unicode internal libraries to support Unicode 4.1 for
Normalization etc.

I have confirmed that TextPipe now normalizes Burmese script to NFC with
identical results to BabelPad.
As an avid user of TextPipe Standard edition, for me this is a nice step
forward.

Our *BurJudson* module was made with the source text normalized according
to an earlier version of Unicode.

Unless one specifies otherwise (by means of the -N switch), osis2mod
performs normalization to NFC.

I would therefore recommend that precompiled SWORD utilities (especially
those for Windows) be built so that they follow the latest Unicode standard
for normalization.
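
As a rough way to see which Unicode version a given normalization library
follows, and whether a source text is already in NFC under that version,
something like the sketch below may help. It uses Python's standard
unicodedata module purely as a stand-in (it is not ICU and not part of
SWORD), and the helper name nfc_report is hypothetical.

```python
import unicodedata

def nfc_report(text: str) -> dict:
    """Summarize how `text` relates to NFC under this runtime's Unicode data."""
    nfc = unicodedata.normalize("NFC", text)
    return {
        # Unicode version of the character database this runtime ships with
        "unicode_version": unicodedata.unidata_version,
        # True when normalizing to NFC leaves the text unchanged
        "already_nfc": nfc == text,
        "code_points_before": len(text),
        "code_points_after": len(nfc),
    }

# "e" followed by COMBINING ACUTE ACCENT composes to a single code point in NFC
report = nfc_report("e\u0301")
print(report)
```

Running the same report with two differently built tools on the same source
text would show whether their results can differ.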

Likewise, front-end developers may have something to gain by pursuing this
topic further. ICU also has implications during module search: a search
string ought to be normalized in the same way as the module was, or it may
fail to match.
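
To illustrate the search point, here is a minimal sketch, again using
Python's unicodedata module as a stand-in for ICU. The function name
normalized_find is hypothetical and is not a SWORD API, and a Latin example
is used only because it is easy to read; the same principle would apply to
Burmese text.

```python
import unicodedata

def normalized_find(module_text: str, query: str, form: str = "NFC") -> int:
    """Find `query` in `module_text` after normalizing both to the same form.

    Returns the index of the first match, or -1 if there is none.
    """
    haystack = unicodedata.normalize(form, module_text)
    needle = unicodedata.normalize(form, query)
    return haystack.find(needle)

# Module text stored in NFC: the accented letter is one code point (U+00E9).
stored = unicodedata.normalize("NFC", "caf\u00e9")
# The user types a decomposed form: "e" followed by U+0301.
typed = "cafe\u0301"

raw_index = stored.find(typed)                 # compares code points as typed
norm_index = normalized_find(stored, typed)    # normalizes both sides first
print(raw_index, norm_index)                   # prints: -1 0
```

The raw comparison misses the word even though both strings look identical
on screen, which is why the search string and the module need to agree on
normalization.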


View this message in context: http://sword-dev.350566.n4.nabble.com/DevTools-ICU-Normalization-tp3898398p3948253.html
Sent from the SWORD Dev mailing list archive at Nabble.com.
