[sword-devel] Normalising on the commandline
Peter von Kaehne
refdoc at gmx.net
Wed Jan 21 13:44:45 MST 2009
Peter von Kaehne wrote:
> Chris Little wrote:
>> uconv -f utf-8 -t utf-8 -x NFC -o output input
> Thanks a lot!
Unfortunately I learned in the process that my problems with search are
not caused by lack of normalisation, but by inconsistent encoding -
there are three different Arabic/Farsi Unicode letters for ی which look
and (largely) behave the same way. But they cause a mess during search,
because NFC does not merge them. It is as if the letter I had a
different code point for German than for English and yet another one
for French. So if you typed in a German word on an English keyboard,
the search would suddenly not find it.
Whoever implemented Unicode for Arabic script has a lot to answer for!!
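
For anyone hitting the same thing, here is a rough sketch of a
workaround: fold the look-alike letters to a single code point before
indexing and before searching. The post does not name the three code
points, so the set below (U+064A ARABIC LETTER YEH, U+0649 ARABIC LETTER
ALEF MAKSURA, U+06CC ARABIC LETTER FARSI YEH) is an assumption, and the
names fold_yeh and search are made up for illustration. This is not
something SWORD does.

```python
import unicodedata

# Assumed look-alike set; the original post does not list the code points.
# Every variant is folded to U+06CC ARABIC LETTER FARSI YEH.
YEH_VARIANTS = {
    "\u064A": "\u06CC",  # ARABIC LETTER YEH
    "\u0649": "\u06CC",  # ARABIC LETTER ALEF MAKSURA
}
_FOLD = str.maketrans(YEH_VARIANTS)


def fold_yeh(text):
    """NFC-normalise, then map the yeh look-alikes to one code point."""
    return unicodedata.normalize("NFC", text).translate(_FOLD)


def search(needle, haystack):
    """Substring search that ignores which yeh variant was typed."""
    return fold_yeh(needle) in fold_yeh(haystack)
```

Note that the folding is lossy: in Arabic proper, Alef Maksura and Yeh
are distinct letters, so this is probably only reasonable for a search
index over Farsi text, not for the stored text itself.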