[sword-devel] imp2ld and alphabetization

Frank fchimes at tiscali.co.uk
Sun Oct 28 09:21:45 MST 2007


Daniel Owens wrote:
> I am working on creating dictionary modules based on the Free Vietnamese 
> Dictionary Project. The Vietnamese-English dictionary is working, but 
> some words are not in alphabetical order, and I am trying to find out 
> how to maintain the original alphabetization.
>
> I noticed this when all of the words beginning with a vowel having 
> diacritics/tones or beginning with a "đ" were sorted to the end of the 
> dictionary. The DAT file maintains the original order, which is more 
> accurate. It must be that the IDX file generated by imp2ld creates its 
> own index and alphabetizes according to its own scheme. The entries of 
> each word are tagged as ThML. Here is a slightly random entry:
>
> $$$ác bá
> <entry key="ác bá" type="main" id="n20"><b>ác bá</b><br />[noun]<br />- 
> Cruel landlord, village tyrant<br /></entry>
>
> Is there a way to keep imp2ld from changing the order of the index? I am 
> happy to send someone the IMP file if that helps. I pasted the CONF file 
> at the bottom of this message.
>
> Daniel
You don't say which OS you use, so I'll have to be a bit general.  To get 
the collation sequence right, you'd have to run the imp2ld program under 
a Vietnamese locale.  If you run it under a European or American locale, 
it will use that locale's sort order, which will probably place 
non-ASCII letters after the ASCII ones.
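
To illustrate the difference, here is a rough Python sketch (not imp2ld 
itself, and I have not checked how imp2ld sorts internally).  It compares 
a plain code-point sort with a sort under a Vietnamese collation locale.  
The locale name "vi_VN.UTF-8" and the sample words are assumptions; the 
locale may not be installed on your system, in which case the second 
function simply returns None.

```python
import locale
import unicodedata

# Sample keys (hypothetical), normalized to precomposed (NFC) form so the
# code-point comparison is predictable.
words = [unicodedata.normalize("NFC", w)
         for w in ["za", "đa", "ba", "ác bá", "an"]]


def codepoint_sort(items):
    """Sort by Unicode code point, ignoring any locale."""
    return sorted(items)


def locale_sort(items, locale_name="vi_VN.UTF-8"):
    """Sort using the collation rules of locale_name.

    Returns None if the locale is not available on this system.
    """
    previous = locale.setlocale(locale.LC_COLLATE)  # query current setting
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
    except locale.Error:
        return None
    try:
        return sorted(items, key=locale.strxfrm)
    finally:
        # Restore whatever collation locale was active before.
        locale.setlocale(locale.LC_COLLATE, previous)


if __name__ == "__main__":
    print("code point order:", codepoint_sort(words))
    print("Vietnamese locale:", locale_sort(words))
```

In the code-point sort, "ác bá" and "đa" land after "za", which matches 
the behaviour Daniel describes.  What the locale-aware sort produces 
depends on the collation data shipped with your system.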

I hope that helps...  :@)

-- 
Blessings

Frank



