[sword-devel] osis2mod change
chrislit at crosswire.org
Sun Feb 24 14:46:13 MST 2008
DM Smith wrote:
> I have added a -n flag to osis2mod.
I'm going to add it to the other major importers (osis2gbs & imp2*) just
as soon as I get things into a fairly stable state.
> This flag, to be enabled, requires osis2mod to be compiled with ICU
> support enabled.
> -n stands for normalized to NFC, the agreed upon UTF-8 encoding
> When should this flag be used?
> 1) When the input is UTF-8
> 2) It is not known to be NFC
First, I see no reason NOT to perform normalization, provided that the
input is UTF-8. Even if the input is already in NFC, normalizing it
again leaves the text unchanged, so it won't hurt anything. It will take
extra time to compile the module, but I think it's better to be safe
than sorry in this case.
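
To illustrate why re-normalizing is harmless, here is a minimal sketch in
Python. It is not the osis2mod code (which is C++ and relies on ICU); it
uses Python's standard unicodedata module, and to_nfc is a hypothetical
helper name chosen only for this example.

```python
import unicodedata


def to_nfc(text: str) -> str:
    """Return text in Unicode Normalization Form C (hypothetical helper)."""
    return unicodedata.normalize("NFC", text)


# 'e' followed by U+0301 COMBINING ACUTE ACCENT (a decomposed sequence).
decomposed = "e\u0301"

once = to_nfc(decomposed)   # composes to the single code point U+00E9
twice = to_nfc(once)        # normalizing NFC text again changes nothing

print(once == "\u00e9")     # True
print(twice == once)        # True
```

The second call returns the same string as the first, which is the
property that makes it safe to run the flag on input that is already NFC.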
Second, your comment about needing UTF-8 input makes me think we should
go ahead and add encoding conversion to the importers as well, possibly
with automatic charset detection.
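
As a rough sketch of what that could look like (in Python, not the
importers' actual C++ code), the function below tries strict UTF-8 first
and only falls back to a legacy charset if decoding fails. This is a
much cruder check than real charset detection, which is heuristic. The
name to_utf8_nfc and the cp1252 fallback are assumptions made for this
example, not anything the importers do today.

```python
import unicodedata


def to_utf8_nfc(raw: bytes, fallback: str = "cp1252") -> bytes:
    """Convert raw input bytes to NFC-normalized UTF-8 (hypothetical helper).

    Strategy (an assumption for illustration): accept the input as UTF-8
    if it decodes cleanly, otherwise decode it with the fallback charset.
    """
    try:
        text = raw.decode("utf-8")  # strict: raises on invalid sequences
    except UnicodeDecodeError:
        # Bytes undefined in the fallback charset become U+FFFD.
        text = raw.decode(fallback, errors="replace")
    return unicodedata.normalize("NFC", text).encode("utf-8")


# Legacy single-byte input: 0xE9 is 'é' in cp1252 but invalid as UTF-8 here.
legacy = b"caf\xe9"
# UTF-8 input that is valid but decomposed ('e' + combining acute accent).
decomposed = "cafe\u0301".encode("utf-8")

print(to_utf8_nfc(legacy) == to_utf8_nfc(decomposed))  # True
```

Both inputs end up as the same UTF-8 bytes, so the module contents no
longer depend on how the source file happened to be encoded.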