[sword-devel] imp2ld and alphabetization

Chris Little chrislit at crosswire.org
Sun Oct 28 11:24:55 MST 2007


Daniel,

The keys in an LD module are ordered by Unicode codepoint. The keys are 
kept in this order to permit binary searching. There is currently no way 
to perform localized collation.
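
To illustrate, here is a minimal Python sketch of the effect, not the 
actual imp2ld code. The key list, the "dictionary order" and the lookup() 
helper are made up for the example; only "ác bá" comes from your entry. 
Python compares strings codepoint by codepoint, so sorted() gives the 
same kind of ordering:

```python
import bisect

# Hypothetical keys, written with escapes so the codepoints are unambiguous:
# "\u00e1c b\u00e1" is "ác bá" (precomposed), "\u0111i" is "đi".
AC_BA = "\u00e1c b\u00e1"
DI = "\u0111i"

# Roughly the order a printed dictionary might use (an assumption).
dictionary_order = ["an", AC_BA, "ba", DI, "xa"]

# Python str comparison is by codepoint, so this is codepoint order.
by_codepoint = sorted(dictionary_order)
print(by_codepoint)  # ASCII-initial keys first, then U+00E1, then U+0111


def lookup(keys, key):
    """Binary search; only reliable if keys are sorted by codepoint."""
    i = bisect.bisect_left(keys, key)
    return i if i < len(keys) and keys[i] == key else -1


print(lookup(by_codepoint, DI))        # found at index 4
print(lookup(dictionary_order, "ba"))  # -1: the search misses an existing key
```

The last line shows why the index cannot simply keep the source order: 
a binary search over keys that are not in comparison order can miss 
entries that are present.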

The platform and locale shouldn't play a role in this. If they do, it's 
a bug.

--Chris

Daniel Owens wrote:
> I am working on creating dictionary modules based on the Free Vietnamese 
> Dictionary Project. The Vietnamese-English dictionary is working, but 
> some words are not in alphabetical order, and I am trying to find out 
> how to maintain the original alphabetization.
> 
> I noticed this when all of the words beginning with a vowel having 
> diacritics/tones or beginning with a "đ" were sorted to the end of the 
> dictionary. The DAT file maintains the original order, which is more 
> accurate. It must be that imp2ld builds its own index in the IDX file 
> and alphabetizes according to its own scheme. The entries of 
> each word are tagged as ThML. Here is a slightly random entry:
> 
> $$$ác bá
> <entry key="ác bá" type="main" id="n20"><b>ác bá</b><br />[noun]<br />- 
> Cruel landlord, village tyrant<br /></entry>
> 
> Is there a way to keep imp2ld from changing the order of the index? I am 
> happy to send someone the IMP file if that helps. I pasted the CONF file 
> at the bottom of this message.
> 
> Daniel
> 
> CONF File:
> 
> [VietAnh]
> DataPath=./modules/lexdict/rawld4/vietanh/vietanh
> ModDrv=RawLD4
> Encoding=UTF-8
> SourceType=THML
> SwordVersionDate=2007-10-27
> Version=1.0
> Lang=vi
> Description=FVDP Vietnamese-English Dictionary
> About=- This is the Vietnamese-English dictionary database of the Free 
> Vietnamese Dictionary Project. It contains more than 23.400 entries with 
> definitions and illustrative examples.\par\par- This database was 
> compiled by Ho Ngoc Duc and other members of the Free Vietnamese 
> Dictionary Project 
> (http://www.informatik.uni-leipzig.de/~duc/Dict/)\par\par- Copyright (C) 
> 1997-2003 The Free Vietnamese Dictionary Project\par\par- This program 
> is free software; you can redistribute it and/or modify it under the 
> terms of the GNU General Public License as published by the Free 
> Software Foundation; either version 2 of the License, or (at your 
> option) any later version. This program is distributed in the hope that 
> it will be useful, but WITHOUT ANY WARRANTY; without even the implied 
> warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 
> GNU General Public License for more details.
> TextSource=http://www.informatik.uni-leipzig.de/~duc/Dict/
> 


