[sword-devel] TEI formatting, duplicated key (BDB Glosses)

DM Smith dmsmith at crosswire.org
Mon Apr 30 07:00:02 MST 2012

On 04/30/2012 09:37 AM, Daniel Owens wrote:
> On 04/30/2012 06:54 AM, Chris Little wrote:
>> On 4/30/2012 4:39 AM, David Troidl wrote:
>>> Hi Chris,
>>> I'm certainly no expert on your TEI dictionaries, but wouldn't it make
>>> sense to have the first key be one that would sort properly, and 
>>> present
>>> the dictionary in true alphabetical order? I'm thinking of Middle
>>> Liddell, as well as the Hebrew. This key wouldn't even necessarily have
>>> to be shown to the user. The second key, the title, could then maintain
>>> the proper accents for display, without hindering sorting, searching or
>>> navigation.
>> I confess, I don't understand what you're proposing this as an 
>> alternative to.
>> In the example Karl cites, there's just one actual key per entry. It 
>> is an uppercased version of the entryFree's n attribute. This is the 
>> key that is sorted.
>> The un-uppercased version from the n attribute is being rendered as 
>> part of the entry text via the TEI filters. This is the part I'm 
>> proposing we retain, but render somewhere else, e.g. right-justified 
>> at the bottom of the entry.
>> We also render all the text of the entry, which in these cases 
>> includes the text from a title element.
>> I don't know what 'true alphabetical order' means, but if you mean 
>> localized sort order, it's not possible with the current 
>> implementation of this module type.
>> --Chris
> I think David's concern is something that needs to be dealt with. A 
> number of possibilities could be pursued, some of them together:
>     1. The current implementation is to sort by Unicode code points. 
> This works particularly well with numeric keys. A quick solution for 
> languages for which such sorting is not alphabetical would be to 
> follow David's suggestion of using keys that the user does not even 
> see. This has the advantage of providing a workable solution right 
> away, but there are some problems with this. First, we could create a 
> new "strongs" standard because the current implementation does not 
> actually hide keys. That could be solved by making the keys so obscure 
> that no one would remember them. Second, any future, more robust 
> solution would require reworking all modules keyed to it. I have toyed 
> with this solution, and it might be the pragmatic way forward, but it 
> is not ideal.
>     2. A localized sort order, which I think is what David means 
> by true alphabetical order, would be a better long-term solution.
>     3. In addition, using genbooks for lexica would work for lexica 
> that are sorted by root, with subentries nested in a hierarchy, just 
> like in the Hesychius module and BDB. I have been working with Troy on 
> this. Unfortunately, front-ends do not recognize the Feature=HebrewDef 
> option in the conf file and allow genbooks as lexica. I can send 
> anyone an example lexicon if you are interested in working on this. In 
> that case, instead of @n as the key, */x-entry/@osisID would be the key.
> Any thoughts?

I think there is a problem with the sorting of entries in dictionaries 
where the keys are not ASCII. I don't remember the details, but I 
recall it having been discussed on this list.
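
As an illustration only (I am not claiming this is the exact problem 
that was discussed), here is how plain code point ordering treats a 
non-ASCII key. The keys are made up and Python is used just because it 
compares strings by code point:

```python
# Hypothetical uppercased keys, not taken from any real module.
# "\u00c9" is LATIN CAPITAL LETTER E WITH ACUTE (precomposed).
keys = ["ZETA", "\u00c9TA", "ALPHA", "ETA"]

# Python compares strings code point by code point, so this mimics
# a sort by Unicode code points.
by_code_point = sorted(keys)

# The accented key lands after ZETA because U+00C9 > U+005A ("Z"),
# not next to ETA where a reader would expect it.
print(by_code_point)
```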

For JSword, we'll be building a Lucene search index for the key, the 
term and the whole entry. A user lookup will be normalized and the 
search will return the key with which lookup will proceed internally as 
it does today. ICU provides the ability to create a localized sort key 
(not at all suitable for display) that can be used to sort dictionary 
entries for the end user's locale. I'm thinking that for TEI dictionaries 
the representation of the key should not be shown at all.
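
A rough sketch of that flow, under loud assumptions: this is not 
JSword code, a plain dict stands in for the Lucene index, and 
stripping accents stands in for an ICU sort key (a real collation key 
is locale-aware; this one is not). The entries, the opaque keys, and 
the function names normalize, lookup and sort_key are all made up.

```python
import unicodedata


def normalize(text):
    """Fold case and drop combining marks so lookups ignore accents."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return stripped.casefold()


# Hypothetical entries: opaque internal key -> display title.
# "\u00e9" is LATIN SMALL LETTER E WITH ACUTE.
entries = {"00001": "\u00e9ta", "00002": "zeta", "00003": "alpha"}

# Stand-in for the search index: normalized title -> internal key.
index = {normalize(title): key for key, title in entries.items()}


def lookup(user_input):
    """Normalize the user's input and return the internal key, or None."""
    return index.get(normalize(user_input))


def sort_key(title):
    """Stand-in for a localized sort key; used for ordering, never shown."""
    return (normalize(title), title)


# Titles ordered for display; the internal keys are not shown at all.
ordered = sorted(entries.values(), key=sort_key)
```

With this, lookup("ETA") and lookup("\u00c9ta") both resolve to the 
same internal key, and the accented title sorts between "alpha" and 
"zeta" rather than after them.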

From what I can remember, this will solve all the issues.

In Him,
