[sword-devel] Locale differences

Greg Hellings greg.hellings at gmail.com
Tue Sep 11 17:27:49 MST 2012

On Mon, Sep 10, 2012 at 8:01 PM, DM Smith <dmsmith at crosswire.org> wrote:
> This is a hard question. And a good one.
> For the record (not saying it is right or that it is best or even good) here is how JSword does it:
> It does not use the Locale of the module.
> It uses the OSIS book names first. (Assumes that the majority of Book Names come from OSIS modules)
> Then it uses the user's Locale. (Assumes that the Locale is meaningful for the user.)
> Finally it uses English book names. (Fall back for ThML and GBF and for consistency.)
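
For reference, the lookup order described above can be sketched roughly as follows. This is only an illustration of the three-tier fallback, not JSword's actual code: the tables, their contents, and the function name resolve_book are all hypothetical.

```python
# Hypothetical name tables; a real implementation would load far more data.
OSIS_NAMES = {"gen": "Gen", "exod": "Exod", "rev": "Rev"}
LOCALE_NAMES = {
    "de": {"1. mose": "Gen", "2. mose": "Exod", "offenbarung": "Rev"},
    "tke": {"wambeela": "Gen"},
}
ENGLISH_NAMES = {"genesis": "Gen", "exodus": "Exod", "revelation": "Rev"}


def resolve_book(name, user_locale):
    """Return the OSIS id for name, or None if no table has an exact match."""
    key = name.strip().lower()
    # Order taken from the description: OSIS names, user's locale, English.
    tables = [OSIS_NAMES, LOCALE_NAMES.get(user_locale, {}), ENGLISH_NAMES]
    for table in tables:
        if key in table:
            return table[key]
    return None
```

Note that the module's own language never appears in this chain, so under these assumptions "Wambeela" only resolves when the user's locale is tke.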

When you say "OSIS book names first" what are you referring to? Are
you referring to the specific abbreviations that are the OSIS spec? If
so, then it looks like you might be following the same pattern as
Xiphos, which allows the user's Locale to be used.

> You didn't mention what happens if you have an inexact match. I'd have to look in the JSword code to see what happens if it doesn't find the input. The fallback is to look for the input as a proper prefix or as an alternate name or abbreviation. But I forget whether it looks for an exact match first across the three dictionaries, then for proper prefixes in the latter two, and then for the alternate name in the latter two, or whether it exhausts one level before the next. JSword doesn't order like SWORD does. It prioritizes first by NT, then by OT, and within the testament by book order.
> The results of inexact matches depend on the ordering of the matching algorithm.
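
To illustrate why the ordering matters, here is a small sketch of the two strategies mentioned above. The tables and function names are hypothetical and heavily simplified; neither function is claimed to be what JSword or SWORD actually does.

```python
# Hypothetical dictionaries in priority order. "phil" is ambiguous on purpose.
TABLES = [
    {"philemon": "Phlm"},  # pretend higher-priority table
    {"phil": "Phil"},      # pretend lower-priority table (an abbreviation)
]


def prefix_match(key, table):
    """Return the id of the first name that key is a proper prefix of."""
    for name, book in table.items():
        if name.startswith(key) and name != key:
            return book
    return None


def exact_first(key, tables):
    """Try exact matches in every table, then prefixes in every table."""
    for table in tables:
        if key in table:
            return table[key]
    for table in tables:
        book = prefix_match(key, table)
        if book is not None:
            return book
    return None


def table_by_table(key, tables):
    """Exhaust one table (exact, then prefix) before moving to the next."""
    for table in tables:
        if key in table:
            return table[key]
        book = prefix_match(key, table)
        if book is not None:
            return book
    return None
```

With these made-up tables, exact_first("phil", TABLES) picks Phil (the exact abbreviation in the second table), while table_by_table("phil", TABLES) picks Phlm (the prefix hit in the first table). Same input, same data, different answer.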

I'd have to test more. I know that if the mismatch is drastic enough,
the verse gets "punted" to Revelation 1:1. BibleTime therefore takes
you to Revelation 1:1, while Xiphos seems to intercept an error code
on the VerseKey somewhere and just leaves you right where you started.
Based on that behavior, both of them likely behave the same as they
are relying on SWORD's parsing algorithm - just with different default
locales set.
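
A rough sketch of the difference I think I'm seeing between the two frontends. Everything here is an assumption for illustration: the ParseResult type, the error flag, and the function names are hypothetical and are not SWORD's VerseKey API.

```python
from dataclasses import dataclass

# Position the parser lands on when it cannot understand the input,
# matching the "punted to Revelation 1:1" behavior observed above.
FALLBACK = ("Rev", 1, 1)


@dataclass
class ParseResult:
    position: tuple
    error: bool


def parse_reference(text, known_books):
    """Parse 'Book C:V'; on failure return FALLBACK with the error flag set."""
    book_part, _, verse_part = text.strip().rpartition(" ")
    book = known_books.get(book_part.lower())
    chapter, _, verse = verse_part.partition(":")
    if book is None or not (chapter.isdigit() and verse.isdigit()):
        return ParseResult(FALLBACK, error=True)
    return ParseResult((book, int(chapter), int(verse)), error=False)


def navigate_follow_result(current, result):
    """Frontend that always goes wherever the parser points."""
    return result.position


def navigate_check_error(current, result):
    """Frontend that stays put when the parser flags an error."""
    return current if result.error else result.position
```

Under these assumptions both frontends share one parser and differ only in whether they look at the error flag, which would explain one jumping to Revelation 1:1 and the other not moving at all.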

> We've thought about using the module's language. And we've thought about doing lookups over all dictionaries of Book Names. But what order?
> Just not clear on what actually makes sense.

I like the idea mentioned of setting it as an option. It could even be
made a per-module configuration option at the application's discretion
and the default value for it could be specified. For my own purposes I
want to use my locale (English) almost all of the time, unless I'm
testing a module like the current scenario. Since my locale is
English, I am spoiled as that is included in the search by default,
but if it weren't I would still want to use that the majority of the
time. I use it when I pull up Greek modules (which is my main foreign
language module set).

The behavior I expected was that the default would be the module with
my locale as the fallback. Neither Xiphos nor BibleTime behaves that
way, except when considering the English locale (and then, only
BibleTime does). I have no basis for that expectation other than my
off-the-cuff reaction to, "Huh, wonder why that didn't parse the
module locale".

For single-use applications like Diatheke or such, I would assume it
should default to using the module's locale. So I guess I'm saying I
would expect the engine to specify that as the default. I'm not sure
if it does or not.
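
To make the option idea concrete, here is a minimal sketch of how a configurable search order could look. The policy names, the MODULE_POLICY table, and both functions are hypothetical; nothing like this is claimed to exist in the engine or in either frontend today. English is always appended because the engine appears to understand English names regardless.

```python
DEFAULT_POLICY = "module-first"
# Hypothetical per-module overrides an application might store.
MODULE_POLICY = {"SomeGreekModule": "user-first"}


def policy_for(module_name):
    """Return the module's configured policy, or the application default."""
    return MODULE_POLICY.get(module_name, DEFAULT_POLICY)


def locale_search_order(module_lang, user_locale, policy=DEFAULT_POLICY):
    """Return the locales whose book names should be tried, in order."""
    if policy == "module-first":
        order = [module_lang, user_locale, "en"]
    elif policy == "user-first":
        order = [user_locale, module_lang, "en"]
    elif policy == "user-only":
        order = [user_locale, "en"]
    else:
        raise ValueError(f"unknown policy: {policy}")
    # Drop missing entries and duplicates, keeping the first occurrence.
    seen = []
    for loc in order:
        if loc and loc not in seen:
            seen.append(loc)
    return seen
```

For the Takwane case with a Portuguese UI, locale_search_order("tke", "pt") would try Takwane names first, then Portuguese, then English. With "user-only" the same call tries only Portuguese and English, which is roughly what Xiphos seems to do now.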

> On a side note, the locale we use:
> The locale of the chosen translation. (It might make sense to also have a list of known languages that the user can select and order. But we don't do that.)
> If that is not set, then we use the locale that the user chooses via their OS and is exposed in Java as Locale.getDefault();
> We don't use the LANG environment variable unless that is the OS mechanism.

An interesting, but understandable, default I think.


> In Him,
>         DM
> On Sep 10, 2012, at 3:44 PM, Greg Hellings <greg.hellings at gmail.com> wrote:
>> Just wanted to note here some differences between Xiphos and BibleTime
>> locale handling.
>> Setup:
>> I'm working with a new, minority language translation. The language is
>> Takwane with the language code abbreviation "tke". I have successfully
>> created a module which has the conf file entry "Lang=tke" and began to
>> note some oddities about locale handling. For ease of reading further,
>> "Wambeela" is the Takwane name for Genesis and "1. Mose" is how the
>> book name appears in our German locale.
>> Xiphos:
>> In Xiphos, when I start the application with my default locale of
>> LANG=en_US.UTF-8 and open the Takwane module, the application properly
>> understands only the English names of books and ignores the Takwane.
>> That is, I can type in "Genesis 2:1" and be properly navigated to that
>> position but entering "Wambeela 2:1" causes the application to ignore
>> my input. To test, I started the application with LANG=de, and I could
>> type EITHER "Genesis 2:1" or "1. Mose 2:1" and I would navigate to the
>> appropriate passage. If I started the application with LANG=tke I
>> could enter either "Genesis" or "Wambeela". Thus, Xiphos ignores the
>> Lang setting on the module and only understands the LANG environment
>> variable.
>> BibleTime:
>> In BibleTime I started the application with my en_US.UTF-8 locale and
>> opened the Takwane module. Here, the module understood both "Genesis"
>> and "Wambeela". Setting LANG=de and restarting the application causes
>> it to still understand "Genesis" and "Wambeela" but it can't grasp "1.
>> Mose" and instead punts me to "Rev 1:1" for a parsing error.
>> Diatheke:
>> Appears to ignore the LANG variable, but cannot parse the module's
>> address without using the "-l tke" switch.
>> So it appears that the engine will always comprehend English book
>> names and that BibleTime is somehow honoring the module's Lang setting
>> but ignoring its own UI setting while Xiphos is honoring the
>> UI/environment setting but ignoring the module's Lang setting.
>> I just wanted to put that out here, so there is a record of it and so
>> developers for either app can think about the UX they want. In the
>> case of Takwane, since neither application has a Takwane locale it is
>> likely the users will try for Portuguese in the application's UI but
>> will still want to type their native Takwane book names. This makes
>> Xiphos' UX undesirable as it only understands English and whatever
>> locale the UI is in. But presumably a user might want to open a module
>> in a different language and still be able to use their native locale
>> (like us English speakers are probably used to doing since the engine
>> appears to understand English all the time). This makes BibleTime's UX
>> bad because it seems to ignore the UI's locale.
>> I'm unsure of a path to take when recommending an application to the
>> translators for testing because of this. Both situations could be
>> awkward, unless they eventually decide it is worth the effort to
>> translate the UI itself into Takwane.
>> --Greg
>> _______________________________________________
>> sword-devel mailing list: sword-devel at crosswire.org
>> http://www.crosswire.org/mailman/listinfo/sword-devel
>> Instructions to unsubscribe/change your settings at above page
