[sword-devel] support for locale codes with region/script subtags
chrislit at crosswire.org
Sun Feb 10 03:56:20 MST 2013
Just a quick heads up:
In general, locale codes (the Lang= field of .confs) can have subtags
that indicate region, script, etc. Ideally these should be dealt with in
some fashion by front ends since they identify important distinctions
(in the eyes of the module maker or publisher at least).
When unknown subtags are encountered, it's probably best to recursively
fall back to the tag minus its right-most subtag. For example, if
'en-Latn-US' is unknown, fall back to 'en-Latn'. If that is unknown,
fall back to 'en'. (Hopefully nearly all language subtags are known.)
We should handle this in the library, but currently don't. :(
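For anyone who wants to do this in a front end in the meantime, here is
a minimal sketch of the fallback in Python. The names fallback_chain,
resolve_language and known_tags are made up for illustration and are not
part of the SWORD API. It assumes subtags are separated by '-' and that
tags are compared exactly as written (no case folding).

```python
def fallback_chain(tag):
    """Yield the tag, then the tag minus its right-most subtag, and so on."""
    subtags = tag.split("-")
    # Drop one right-most subtag per step until only the language subtag is left.
    for end in range(len(subtags), 0, -1):
        yield "-".join(subtags[:end])


def resolve_language(tag, known_tags):
    """Return the most specific known form of tag, or None if nothing matches.

    known_tags is a hypothetical set of tags the front end can display.
    """
    for candidate in fallback_chain(tag):
        if candidate in known_tags:
            return candidate
    return None


if __name__ == "__main__":
    known = {"en", "ur"}
    print(list(fallback_chain("en-Latn-US")))
    print(resolve_language("ur-Deva", known))
```

This is close in spirit to the "lookup" matching scheme described in
RFC 4647, though it is only a sketch and skips that scheme's details.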
As a specific case in point:
We now have two Urdu modules. They contain the same translation but
differ in script (one is in Arabic script, the other in Devanagari). Their
language codes (as of the 1.2.1 release just made, which corrected the
code for the Devanagari version) are: ur (Urdu in Arabic script--the
usual script for Urdu) and ur-Deva (Urdu in Devanagari script).
Possible behaviors are to categorize the ur-Deva module as belonging to
an unknown language (bad), to fall back and categorize it as simply Urdu
(better, but certainly confusing if the language name is written in
Arabic script and the module is itself written in Devanagari), or to
categorize it separately as Urdu written in Devanagari (best).
For implementers who localize the language name, Urdu written in Arabic
script is written "اردو". Urdu written in Devanagari is written "उर्दू".
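To make the three behaviors concrete, here is a rough Python sketch of a
language-name lookup that prefers an exact match (the "best" case), falls
back to a shorter tag (the "better" case), and only then reports an
unknown language. LANGUAGE_NAMES and language_label are hypothetical
names for illustration. A real front end would take its names from its
own locale data.

```python
# Hypothetical table of localized language names, keyed by language tag.
LANGUAGE_NAMES = {
    "ur": "اردو",        # Urdu in Arabic script (the usual script)
    "ur-Deva": "उर्दू",   # Urdu in Devanagari script
}


def language_label(tag, names=LANGUAGE_NAMES):
    """Return the display name for the most specific known form of tag."""
    subtags = tag.split("-")
    while subtags:
        candidate = "-".join(subtags)
        if candidate in names:
            return names[candidate]
        # Unknown so far: drop the right-most subtag and try again.
        subtags.pop()
    return "Unknown"


if __name__ == "__main__":
    for tag in ("ur", "ur-Deva", "ur-Arab-PK", "zz"):
        print(tag, "->", language_label(tag))
```

With both entries in the table, ur-Deva gets its own Devanagari label.
If the ur-Deva entry were missing, the same module would be listed under
the Arabic-script name, which is the confusing case described above.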