[sword-devel] TEI formatting, duplicated key (BDB Glosses)
dmsmith at crosswire.org
Mon Apr 30 07:00:02 MST 2012
On 04/30/2012 09:37 AM, Daniel Owens wrote:
> On 04/30/2012 06:54 AM, Chris Little wrote:
>> On 4/30/2012 4:39 AM, David Troidl wrote:
>>> Hi Chris,
>>> I'm certainly no expert on your TEI dictionaries, but wouldn't it make
>>> sense to have the first key be one that would sort properly, and
>>> the dictionary in true alphabetical order? I'm thinking of Middle
>>> Liddell, as well as the Hebrew. This key wouldn't even necessarily have
>>> to be shown to the user. The second key, the title, could then maintain
>>> the proper accents for display, without hindering sorting, searching or
>> I confess, I don't understand what you're proposing this as an
>> alternative to.
>> In the example Karl cites, there's just one actual key per entry. It
>> is an uppercased version of the entryFree's n attribute. This is the
>> key that is sorted.
>> The un-uppercased version from the n attribute is being rendered as
>> part of the entry text via the TEI filters. This is the part I'm
>> proposing we retain, but render somewhere else, e.g. right-justified
>> at the bottom of the entry.
>> We also render all the text of the entry, which in these cases
>> includes the text from a title element.
>> I don't know what 'true alphabetical order' means, but if you mean
>> localized sort order, it's not possible with the current
>> implementation of this module type.
> I think David's concern is something that needs to be dealt with. A
> number of possibilities could be pursued, some of them together:
> 1. The current implementation is to sort by Unicode code points.
> This works particularly well with numeric keys. A quick solution for
> languages for which such sorting is not alphabetical would be to
> follow David's suggestion of using keys that the user does not even
> see. This has the advantage of providing a workable solution right
> away, but there are some problems with this. First, we could create a
> new "strongs" standard because the current implementation does not
> actually hide keys. That could be solved by making the keys so obscure
> that no one would remember them. Second, any future, more robust
> solution would require reworking all modules keyed to it. I have toyed
> with this solution, and it might be the pragmatic way forward, but it
> is not ideal.
> 2. A localized sort order, which I think is what David means
> by true alphabetical order, would be a better long-term solution.
> 3. In addition, using genbooks for lexica would work for lexica
> that are sorted by root, with subentries nested in a hierarchy, just
> like in the Hesychius module and BDB. I have been working with Troy on
> this. Unfortunately, front-ends do not yet recognize the Feature=HebrewDef
> option in the conf file or allow genbooks to serve as lexica. I can send
> anyone an example lexicon if you are interested in working on this. In
> that case, instead of @n as the key, */x-entry/@osisID would be the key.
> Any thoughts?
I think there is a problem with the sorting of entries in dictionaries
where the keys are not ASCII. I don't remember the details, but I seem
to remember it having been discussed here.
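To illustrate the kind of problem I mean, here is a minimal sketch in Python (not SWORD or JSword code; the keys are made up for illustration). It sorts uppercased keys by code point, which is what Python's default string comparison does, and an accented key ends up after "Z":

```python
# Hypothetical uppercased keys, for illustration only.
# "\u00c9" is the precomposed letter E with acute accent.
keys = ["ZEBRA", "\u00c9CLAIR", "APPLE"]

# Python compares strings by code point, so this mimics a
# code-point-ordered index.
by_code_point = sorted(keys)

# The accented key lands last because U+00C9 is greater than U+005A ("Z").
print(by_code_point)
print([hex(ord(key[0])) for key in by_code_point])
```

A reader would expect the accented entry between the other two, which is the mismatch being discussed.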
For JSword, we'll be building a Lucene search index for the key, the
term and the whole entry. A user lookup will be normalized, and the
search will return the key, with which lookup will then proceed internally
as it does today. ICU provides the ability to create a localized sort key
(not at all suitable for display) that can be used to sort dictionary
entries for the end user's locale. I'm thinking that for TEI dictionaries
the representation of the key should not be shown at all.
From what I can remember, this will solve all the issues.
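A rough sketch of the idea, again in Python and only under stated assumptions: the `fold` function below is a crude stand-in for a real ICU collation key (it just strips combining marks and case-folds, which is not locale-aware), the dictionary stands in for the Lucene index, and all names and keys are hypothetical.

```python
import unicodedata


def fold(text):
    """Crude stand-in for a collation key: decompose, drop marks, casefold."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def sort_key(key):
    # Sort on the folded form; fall back to the raw key to break ties.
    return (fold(key), key)


# Hypothetical stored module keys, for illustration only.
module_keys = ["ZEBRA", "\u00c9CLAIR", "APPLE"]

# Order shown to the user; the sort key itself is never displayed.
display_order = sorted(module_keys, key=sort_key)

# Stand-in for the search index: normalized form -> stored key.
lookup_index = {fold(key): key for key in module_keys}


def find_key(user_input):
    """Normalize the user's input and return the stored key, or None."""
    return lookup_index.get(fold(user_input))


print(display_order)
print(find_key("eclair"))
```

The stored key is untouched, so the internal lookup can proceed as it does today; only the ordering and the matching of user input change.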