[sword-devel] imp2ld and alphabetization

Chris Little chrislit at crosswire.org
Mon Oct 29 10:32:06 MST 2007

DM Smith wrote:
> On Oct 29, 2007, at 12:49 AM, Chris Little wrote:
>> It's possible to have multiple keys share a single entry, so pointed
>> and unpointed keys can point to the same entry. We've done this
>> experimentally with dictionaries in the past to permit lookup by a
>> Strong's number or the lemma it represents.
> That works but then all current front-ends would show two entries.

I hadn't considered that a problem, but it certainly could be, if we had 
a large quantity of similar-looking keys intermixed. I suppose we could 
either tag some keys to not display in the index or we could add a 
module attribute to suppress display of all link entries.
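
To make the two ideas concrete, here is a minimal sketch in Python. It is not the SWORD API: the Lexicon class, the hide_links flag and the sample entry text are all hypothetical names made up for illustration. Link keys resolve to the entry of their target, and a module-level switch decides whether they appear in the index.

```python
class Lexicon:
    """Toy dictionary module: primary keys own entries, link keys point at them."""

    def __init__(self, hide_links=False):
        self._entries = {}            # primary key -> entry text
        self._links = {}              # link key -> primary key
        self.hide_links = hide_links  # hypothetical module attribute

    def add_entry(self, key, text):
        self._entries[key] = text

    def add_link(self, key, target):
        # A link is only valid if the entry it points at exists.
        if target not in self._entries:
            raise KeyError(target)
        self._links[key] = target

    def lookup(self, key):
        # Resolve a link key to its target; primary keys map to themselves.
        return self._entries[self._links.get(key, key)]

    def index(self):
        # Keys a front-end would display; link keys are optional.
        keys = list(self._entries)
        if not self.hide_links:
            keys += list(self._links)
        return sorted(keys)


lex = Lexicon(hide_links=True)
lex.add_entry("G3056", "word, saying")   # illustrative entry text
lex.add_link("λόγος", "G3056")           # lemma shares the Strong's entry
```

With hide_links set, both keys still resolve to the same entry, but only the primary key is listed in the index.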

>>> A user may expect to find a word by stem not just by prefix.
>> I'm not sure whether this is a sort order issue or a lookup/search
>> issue. Presumably a user would know the word they want and type it in
>> with its prefix, even if it is sorted to group with other words sharing
>> the same stem.
> Maybe I am not using the right terminology. Let's say that "run" is  
> in the dictionary but "ran" is not because this dictionary only has  
> the base words and no grammatical variations. Now the user right  
> clicks on "ran" and chooses lookup and is brought to the nearest word  
> to "ran", perhaps "rabid". This is a simple case. It has been quite a  
> while since I studied other languages, but I seem to remember that  
> German changes the prefix of words when going to the past tense. And  
> in Greek, I seem to remember diacritic changes and suffix changes.

That's what I understood you to mean. I think our first goal should be 
to maintain the provided keys in their provided order. A German 
dictionary won't necessarily list the past participle forms (the ones 
that begin with ge-) unless they are irregular, and then their entry 
will basically just say "pa. ptc. of ______en".
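
The "ran" case can be sketched as follows, in Python. This is only an illustration under assumptions: the key list is invented, and "nearest" is taken to mean the first key that sorts at or after the typed word, which may not be what any given front-end does. The point is that the provided order can be kept for display while a separate sorted copy serves lookup.

```python
import bisect

# Hypothetical keys, in the order the source provides them.
provided_keys = ["run", "rabid", "rank", "rain"]

# Separate sorted copy used only for lookup; the provided order is untouched.
lookup_index = sorted(provided_keys)


def nearest_key(word):
    """Return the first key sorting at or after word, or the last key."""
    i = bisect.bisect_left(lookup_index, word)
    i = min(i, len(lookup_index) - 1)
    return lookup_index[i]


print(nearest_key("ran"))   # rank
print(nearest_key("run"))   # run
```

The lookup for "ran" lands on a neighbouring key that merely shares a prefix, not on "run", which is why the user is expected to supply the citation form.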

I'm not sure of your experiences with learning languages, but in mine, 
one of the first things you learn is how to look up words in a 
dictionary. That means learning to figure out the infinitive of a verb 
(as with German(-ic)) or the first person singular present indicative 
(as with Latin), the nominative of a noun (as in most languages with 
case), the radical (as in Chinese), the root (as in Semitic), the stem 
(as in Germanic & probably Greek). Presenting the citation form is all 
we necessarily need to do.

In practice, all we should do is present what the source gives us, even 
if it's in a strange order.

>> I'm willing to write these users off. We could transliterate back to
>> Greek, but I don't think it's worth the effort or processor cycles. I
>> don't believe that people who don't know how to read Greek use Greek
>> lexicons other than as a novelty.
> I was thinking of a different user altogether. For example, I use
> Windows, Linux and Macs almost daily, and I do not want to learn each
> OS's input system; I just want to find words by typing (like Beta
> Greek). It is not a matter of reading but of entry.

I don't think it's a problem to be solved on the module side (or in the 
module drivers).

We have some InputMethod classes, which could be used at least for the 
major cases where people might know a language but not know how to type 
it (Greek & Hebrew). It would also be possible to run key entry through 
an ICU transliterator to get another script.
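
As a rough sketch of what converting key entry on the front-end side could look like, here is a hand-rolled Python stand-in. It uses neither our InputMethod classes nor ICU; the function name and table are hypothetical. It maps unaccented Beta Code letters to Greek and picks final sigma at the end of a word. Breathings, accents and capitals are not handled.

```python
# Beta Code letter -> Greek letter (lowercase, no diacritics).
BETA_TO_GREEK = {
    "a": "α", "b": "β", "g": "γ", "d": "δ", "e": "ε", "z": "ζ",
    "h": "η", "q": "θ", "i": "ι", "k": "κ", "l": "λ", "m": "μ",
    "n": "ν", "c": "ξ", "o": "ο", "p": "π", "r": "ρ", "t": "τ",
    "u": "υ", "f": "φ", "x": "χ", "y": "ψ", "w": "ω",
}


def beta_to_greek(text):
    """Convert plain Beta Code letters to Greek; other characters pass through."""
    lowered = text.lower()
    out = []
    for i, ch in enumerate(lowered):
        if ch == "s":
            # Medial sigma if another letter follows, final sigma otherwise.
            following = lowered[i + 1:i + 2]
            out.append("σ" if following.isalpha() else "ς")
        else:
            out.append(BETA_TO_GREEK.get(ch, ch))
    return "".join(out)


print(beta_to_greek("logos"))   # λογος
```

The converted string would then be handed to the normal key lookup, so the module and its driver never see the Latin input.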

