[sword-devel] Locale differences
greg.hellings at gmail.com
Tue Sep 11 21:24:09 MST 2012
Update to note: Apparently BibleTime has a setting in its
configuration that allows the user to specifically select which
language Bible book names will be displayed in. This is used across
all parsing and display.
On Tue, Sep 11, 2012 at 9:08 PM, DM Smith <dmsmith at crosswire.org> wrote:
> On Sep 11, 2012, at 8:27 PM, Greg Hellings <greg.hellings at gmail.com> wrote:
>> On Mon, Sep 10, 2012 at 8:01 PM, DM Smith <dmsmith at crosswire.org> wrote:
>>> This is a hard question. And a good one.
>>> For the record (not saying it is right or that it is best or even good) here is how JSword does it:
>>> It does not use the Locale of the module.
>>> It uses the OSIS book names first. (Assumes that the majority of Book Names come from OSIS modules)
>>> Then it uses the user's Locale. (Assumes that the Locale is meaningful for the user.)
>>> Finally it uses English book names. (Fall back for ThML and GBF and for consistency.)
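As a rough illustration of the lookup order described above (and not JSword's actual code), here is a minimal Python sketch. The tables, their contents, and the name `lookup_book` are hypothetical; only the order (OSIS names, then the user's locale, then English) comes from the description.

```python
# Hypothetical, deliberately tiny book-name tables (name -> OSIS id).
OSIS_NAMES = {"Gen": "Gen", "Rev": "Rev"}
LOCALE_NAMES = {
    "de": {"1. Mose": "Gen"},
    "tke": {"Wambeela": "Gen"},
}
ENGLISH_NAMES = {"Genesis": "Gen", "Revelation": "Rev"}


def lookup_book(name, user_locale):
    """Return the OSIS id for name, or None if no table knows it.

    Tables are tried in the order described above: OSIS abbreviations,
    then the user's locale, then English as the last resort.
    The module's own language is deliberately not consulted.
    """
    tables = [
        OSIS_NAMES,
        LOCALE_NAMES.get(user_locale, {}),
        ENGLISH_NAMES,
    ]
    for table in tables:
        if name in table:
            return table[name]
    return None
```

With these toy tables, a user whose locale is "de" can enter "Gen", "1. Mose" or "Genesis", but "Wambeela" is only found when the user's locale is "tke".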
>> When you say "OSIS book names first" what are you referring to? Are
>> you referring to the specific abbreviations that are the OSIS spec? If
>> so, then it looks like you might be following the same pattern as
>> Xiphos, which allows the user's Locale to be used.
> I mean the specific abbreviations that OSIS defines for each book. More and more references in modules use OSIS book names, including ThML. So using this first is a reasonable optimization.
> I also didn't mention that JSword normalizes the input book name and compares it to normalized names in the various book name dictionaries. I forget everything the normalization does, but I do remember it strips spaces and converts to a consistent case.
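A minimal Python sketch of that kind of normalization follows. It covers only the two steps mentioned (dropping spaces and folding case); the choice of lower case, the function name `normalize`, and the sample table are assumptions, and whatever else JSword does is not shown.

```python
def normalize(name):
    """Drop all whitespace and fold case so spacing and capitalisation are ignored."""
    return "".join(name.split()).casefold()


# Hypothetical table: keys are normalized once, when the table is built.
RAW_NAMES = {"Genesis": "Gen", "1. Mose": "Gen"}
NORMALIZED_NAMES = {normalize(key): value for key, value in RAW_NAMES.items()}


def lookup(name):
    """Normalize the user's input the same way before comparing."""
    return NORMALIZED_NAMES.get(normalize(name))
```

Normalizing both sides the same way means "1.Mose", "1. mose" and " GENESIS " all resolve to the same entry.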
>>> You didn't mention what happens if you have an inexact match. I'd have to look in the JSword code to see what happens if it doesn't find the input. The fall back is to look for the input as a proper prefix or as an alternate name or abbreviation. But I forget whether it looks for an exact match first across the three dictionaries and then for proper prefixes in the latter two and then for the alternate name in the latter two. Or whether it exhausts one level before the next. JSword doesn't order like SWORD does. It prioritizes first by NT, then by OT, and within the testament by book order.
>>> The results of inexact matches depend on the ordering of the matching algorithm.
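To make the ordering point concrete, here is a small Python sketch showing how the same prefix can resolve to different books depending on the order in which candidates are tried. The book subset and function name are hypothetical, and this is neither JSword's nor SWORD's real matching code.

```python
# Hypothetical subset of English book names with their testament,
# listed in canonical order.
BOOKS = [
    ("Judges", "OT"),
    ("John", "NT"),
    ("Jude", "NT"),
]


def first_prefix_match(text, books):
    """Return the first book whose name starts with text (case-insensitive).

    An exact name also counts here; a real matcher would try exact
    matches in a separate, earlier pass.
    """
    wanted = text.casefold()
    for name, _testament in books:
        if name.casefold().startswith(wanted):
            return name
    return None


canonical_order = BOOKS
# NT before OT, keeping book order within each testament (sorted() is stable).
nt_first_order = sorted(BOOKS, key=lambda book: book[1] != "NT")
```

With these lists, the input "Jud" resolves to Judges under canonical order but to Jude when the New Testament is searched first, while "Jo" resolves to John either way.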
>> I'd have to test more. I know that, if the mismatch is drastic enough
>> the verse gets "punted" to Revelation 1:1. BibleTime therefore takes
>> you to Revelation 1:1, while Xiphos seems to intercept an error code
>> on the VerseKey somewhere and just leaves you right where you started.
>> Based on that behavior, both of them likely behave the same as they
>> are relying on SWORD's parsing algorithm - just with different default
>> locales set.
> I've never liked the error picking a verse that is far away from the user's intent. I think JSword still has places that use Gen 1:1.
SWORD apparently has the Error()/setError() set of methods. BibleTime
simply doesn't use them, either because the original developers didn't
know about them or because they're newer than BibleTime's
implementation of its selector. Xiphos, based on its behavior, honors
the value from those methods.
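The difference between the two front ends can be sketched in Python as follows. This is a toy model, not the SWORD API: it does not call Error()/setError(), and the parser, the `ParseResult` type and both navigate functions are hypothetical stand-ins for the behaviour described above.

```python
from dataclasses import dataclass

FALLBACK = "Rev 1:1"  # where an unparseable reference gets "punted"
KNOWN_BOOKS = {"Genesis", "Revelation"}  # hypothetical stand-in for real parsing


@dataclass
class ParseResult:
    reference: str
    error: bool


def parse_reference(text):
    """Toy parser: flags an error and falls back when the book is unknown."""
    book = text.rsplit(" ", 1)[0]
    if book in KNOWN_BOOKS:
        return ParseResult(reference=text, error=False)
    return ParseResult(reference=FALLBACK, error=True)


def navigate_ignoring_error(current, text):
    """Go wherever the parser points, even after a failed parse."""
    return parse_reference(text).reference


def navigate_checking_error(current, text):
    """Stay at the current position when the parser flags an error."""
    result = parse_reference(text)
    return current if result.error else result.reference
```

In this model, entering "Wambeela 2:1" while at Genesis 1:1 lands on Rev 1:1 with the first function and leaves the position unchanged with the second.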
>>> We've thought about using the module's language. And we've thought about doing lookups over all dictionaries of Book Names. But what order?
>>> Just not clear on what actually makes sense.
>> I like the idea mentioned of setting it as an option. It could even be
>> made a per-module configuration option at the application's discretion
>> and the default value for it could be specified. For my own purposes I
>> want to use my locale (English) almost all of the time, unless I'm
>> testing a module like the current scenario. Since my locale is
>> English, I am spoiled as that is included in the search by default,
>> but if it weren't I would still want to use that the majority of the
>> time. I use it when I pull up Greek modules (which is my main foreign
>> language module set).
> So do you have to input it with the diacritics? (Maybe Greek doesn't have them in SWORD, but French would.)
I've never used Greek to input the reference. I've always used English
to input, which is always understood by the C++ engine and is the
primary output understood by me.
>> The behavior I expected was that the default would be the module with
>> my locale as the fallback. Neither Xiphos nor BibleTime behaves that
>> way, except when considering the English locale (and then, only
>> BibleTime does). I have no basis for that expectation other than my
>> off-the-cuff reaction to, "Huh, wonder why that didn't parse the
>> module locale".
>> For single-use applications like Diatheke or such, I would assume it
>> should default to using the module's locale. So I guess I'm saying I
>> would expect the engine to specify that as the default. I'm not sure
>> if it does or not.
>>> On a side note, the locale we use:
>>> The locale of the chosen translation. (It might make sense to also have a list of known languages that the user can select and order. But we don't do that.)
>>> If that is not set, then we use the locale that the user chooses via their OS and is exposed in Java as Locale.getDefault();
>>> We don't use the LANG environment variable unless that is the OS mechanism.
>> An interesting, but understandable, default I think.
> What do you use for the manufactured display of a reference?
> For example, in verse lists which locale do you use? I would imagine it would be that of the translation of the UI.
BibleTime uses its independent setting for Bible/Book names. Xiphos
uses the language of the LANG environment variable (which is not
necessarily the language of the UI - that is set independently from
within the application).
>>> In Him,
>>> On Sep 10, 2012, at 3:44 PM, Greg Hellings <greg.hellings at gmail.com> wrote:
>>>> Just wanted to note here some differences between Xiphos and BibleTime
>>>> locale handling.
>>>> I'm working with a new, minority language translation. The language is
>>>> Takwane with the language code abbreviation "tke". I have successfully
>>>> created a module which has the conf file entry "Lang=tke" and began to
>>>> note some oddities about locale handling. For ease of reading further,
>>>> "Wambeela" is the Takwane name for Genesis and "1. Mose" is how the
>>>> book name appears in our German locale.
>>>> In Xiphos, when I start the application with my default locale of
>>>> LANG=en_US.UTF-8 and open the Takwane module, the application properly
>>>> understands only the English names of books and ignores the Takwane.
>>>> That is, I can type in "Genesis 2:1" and be properly navigated to that
>>>> position but entering "Wambeela 2:1" causes the application to ignore
>>>> my input. To test, I started the application with LANG=de, and I could
>>>> type EITHER "Genesis 2:1" or "1. Mose 2:1" and I would navigate to the
>>>> appropriate passage. If I started the application with LANG=tke I
>>>> could enter either "Genesis" or "Wambeela". Thus, Xiphos ignores the
>>>> Lang setting on the module and only understands the LANG environment
>>>> variable.
>>>> In BibleTime I started the application with my en_US.UTF-8 locale and
>>>> opened the Takwane module. Here, the module understood both "Genesis"
>>>> and "Wambeela". Setting LANG=de and restarting the application causes
>>>> it to still understand "Genesis" and "Wambeela" but it can't grasp "1.
>>>> Mose" and instead punts me to "Rev 1:1" for a parsing error.
>>>> Diatheke appears to ignore the LANG variable, but cannot parse the
>>>> module's address without using the "-l tke" switch.
>>>> So it appears that the engine will always comprehend English book
>>>> names and that BibleTime is somehow honoring the module's Lang setting
>>>> but ignoring its own UI setting while Xiphos is honoring the
>>>> UI/environment setting but ignoring the module's Lang setting.
>>>> I just wanted to put that out here, so there is a record of it and so
>>>> developers for either app can think about the UX they want. In the
>>>> case of Takwane, since neither application has a Takwane locale it is
>>>> likely the users will try for Portuguese in the application's UI but
>>>> will still want to type their native Takwane book names. This makes
>>>> Xiphos' UX undesirable as it only understands English and whatever
>>>> locale the UI is in. But presumably a user might want to open a module
>>>> in a different language and still be able to use their native locale
>>>> (like us English speakers are probably used to doing since the engine
>>>> appears to understand English all the time). This makes BibleTime's UX
>>>> bad because it seems to ignore the UI's locale.
>>>> I'm unsure of a path to take when recommending an application to the
>>>> translators for testing because of this. Both situations could be
>>>> awkward, unless they eventually decide it is worth the effort to
>>>> translate the UI itself into Takwane.
>>>> sword-devel mailing list: sword-devel at crosswire.org
>>>> Instructions to unsubscribe/change your settings at above page