[sword-devel] support for locale codes with region/script subtags
chris at burrell.me.uk
Sun Feb 10 13:26:02 MST 2013
The standard is defined in BCP 47, which only supports a '-' as the subtag separator,
as documented by Java here:
Java seems to support both a dash and an underscore.
DM, we should ideally be using the Java functionality which supports both,
rather than implementing our own decoding scheme. Not sure what we do/don't
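
A minimal sketch of leaning on the JDK for this, assuming Java 7 or later (where Locale.forLanguageTag exists). The class and method names are made up for illustration and are not JSword code. The sketch does not rely on the parser accepting an underscore by itself; it rewrites '_' to '-' first, which is my assumption about the safest way to accept both spellings:

```java
import java.util.Locale;

// Hypothetical helper, not part of JSword.
class LangTagParser {
    // Assumption: any '_' in the conf value is meant as a subtag separator,
    // so "ur_Deva" and "ur-Deva" should be read as the same tag.
    static Locale parse(String confLang) {
        String normalized = confLang.trim().replace('_', '-');
        // The JDK does the actual BCP 47 parsing.
        return Locale.forLanguageTag(normalized);
    }

    public static void main(String[] args) {
        Locale urDeva = parse("ur_Deva");
        System.out.println(urDeva.getLanguage());   // language subtag
        System.out.println(urDeva.getScript());     // script subtag
        System.out.println(urDeva.toLanguageTag()); // canonical form with '-'
    }
}
```

With this, the language and script subtags come back from getLanguage() and getScript(), so no hand-written decoding is needed beyond the one-character replacement.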
On 10 February 2013 20:09, DM Smith <dmsmith at crosswire.org> wrote:
> We've had this in JSword (not sure it works) for a while now, for the next
> release. We used the codes as you've given here. But in the conf file you
> have ur_Deva. We're not expecting an _ but a -. We can change the code.
> Please advise.
> In Him,
> On Feb 10, 2013, at 5:56 AM, Chris Little <chrislit at crosswire.org> wrote:
> > Just a quick heads up:
> > In general, locale codes (the Lang= field of .confs) can have subtags
> > that indicate region, script, etc. Ideally these should be dealt with in
> > some fashion by front ends since they identify important distinctions (in
> > the eyes of the module maker or publisher at least).
> > When unknown subtags are encountered, it's probably best to recursively
> > fall back to the tag minus its right-most subtag. For example, if
> > 'en-Latn-US' is unknown, fall back to 'en-Latn'. If that is unknown, fall
> > back to 'en'. (Hopefully nearly all language subtags are known.)
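
The fallback described above can be sketched roughly as follows. This is only an illustration: the class name, the method name and the idea of a plain set of known tags are assumptions, not anything from the SWORD or JSword code, and the tags are assumed to already use '-' as the separator.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper illustrating the fallback; not library code.
class LangFallback {
    // Returns the most specific known tag, or null if even the bare
    // language subtag is unknown.
    static String resolve(String tag, Set<String> known) {
        if (known.contains(tag)) {
            return tag;
        }
        int cut = tag.lastIndexOf('-');
        if (cut < 0) {
            return null; // nothing left to strip
        }
        // Drop the right-most subtag and try again.
        return resolve(tag.substring(0, cut), known);
    }

    public static void main(String[] args) {
        Set<String> known = new HashSet<String>(Arrays.asList("en", "ur"));
        // 'en-Latn-US' and 'en-Latn' are unknown here, so this ends at 'en'.
        System.out.println(resolve("en-Latn-US", known));
    }
}
```

Returning null for a completely unknown tag leaves it to the caller to decide how to present an unknown language.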
> > We should handle this in the library, but currently don't. :(
> > As a specific case in point:
> > We now have two Urdu translations. They're the same translation and
> > differ in their script (one is Arabic, the other Devanagari). Their
> > language codes (as of the 1.2.1 release just made, which corrected the code
> > for the Devanagari version) are: ur (Urdu in Arabic script--the usual
> > script for Urdu) and ur-Deva (Urdu in Devanagari script).
> > Possible behaviors are to categorize the ur-Deva module as belonging to
> > an unknown language (bad), to fall back and categorize it as simply Urdu
> > (better, but certainly confusing if the language name is written in Arabic
> > and the module is itself written in Devanagari), or to categorize it
> > separately as Urdu written in Devanagari (best).
> > For implementers who localize the language name, Urdu written in Arabic
> > is written "اردو". Urdu written in Devanagari is written "उर्दू".
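
For the Urdu case specifically, a front end might keep a small table of localized names keyed by the full tag and only fall back when the full tag is missing. The sketch below is hypothetical: the class, the table and the "Unknown" label are assumptions for illustration, and only the two Urdu names come from the message above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical name table; not taken from any front end.
class LangNames {
    static final Map<String, String> NAMES = new HashMap<String, String>();
    static {
        NAMES.put("ur", "اردو");       // Urdu in Arabic script
        NAMES.put("ur-Deva", "उर्दू");  // Urdu in Devanagari script
    }

    // Prefers an exact entry (best), then shorter tags (better),
    // and only then reports the language as unknown (bad).
    static String nameFor(String tag) {
        String current = tag;
        while (true) {
            String name = NAMES.get(current);
            if (name != null) {
                return name;
            }
            int cut = current.lastIndexOf('-');
            if (cut < 0) {
                return "Unknown"; // assumed placeholder label
            }
            current = current.substring(0, cut);
        }
    }

    public static void main(String[] args) {
        System.out.println(nameFor("ur-Deva"));
        System.out.println(nameFor("ur"));
    }
}
```

Because "ur-Deva" has its own entry, the Devanagari module is listed under the Devanagari name and does not get folded into the Arabic-script one.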
> > --Chris
> > _______________________________________________
> > sword-devel mailing list: sword-devel at crosswire.org
> > http://www.crosswire.org/mailman/listinfo/sword-devel
> > Instructions to unsubscribe/change your settings at above page