[sword-devel] LANG values in sword?

Martin Gruner sword-devel@crosswire.org
Sun, 7 Dec 2003 16:02:47 +0100


This will be new in ICU 2.8, not sure if it would help us:

http://oss.software.ibm.com/icu/download/2.8/index.html

-Locale IDs-

The format for locale IDs is extended to allow the use of optional script 
codes ("sr-Latn-YU") and service-specific keywords 
("de@collation=phonebook"). Where possible, these should be used instead of 
the less-specific locale variants. (There is a proposed update for RFC 3066 
to allow for script codes and other improvements.)
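
To make the two example IDs above concrete, here is a rough sketch of how
such an ID could be split into its parts. This is not ICU's API; the
function name parse_locale_id and the returned dictionary layout are made
up for illustration, and the ";" separator between multiple keywords is an
assumption on my part.

```python
def parse_locale_id(locale_id):
    """Split an ID like 'sr-Latn-YU' or 'de@collation=phonebook' into parts.

    Hypothetical helper for illustration only; it is not part of ICU or Sword.
    """
    # Everything after '@' is treated as service-specific keywords.
    base, _, keyword_part = locale_id.partition("@")

    keywords = {}
    for item in keyword_part.split(";"):  # ';' separator is an assumption
        key, sep, value = item.partition("=")
        if sep and key.strip():
            keywords[key.strip()] = value.strip()

    # Accept both '-' and '_' between subtags.
    subtags = [s for s in base.replace("_", "-").split("-") if s]
    result = {"language": None, "script": None, "region": None,
              "keywords": keywords}
    if not subtags:
        return result

    result["language"] = subtags[0].lower()
    for subtag in subtags[1:]:
        if len(subtag) == 4 and subtag.isalpha() and result["script"] is None:
            # Four letters: script code, e.g. 'Latn'.
            result["script"] = subtag.title()
        elif len(subtag) == 2 and subtag.isalpha() and result["region"] is None:
            # Two letters: country code, e.g. 'YU'.
            result["region"] = subtag.upper()
    return result
```

With this, "sr-Latn-YU" comes apart as language "sr", script "Latn",
region "YU", and "de@collation=phonebook" as language "de" with the keyword
collation set to phonebook.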


mg

On Sunday, 7 December 2003 00:24, Chris Little wrote:
> Hugo van der Kooij wrote:
> > Hi,
> >
> > I know I reported that sword is not handling longer versions of the LANG
> > environment variable.
> >
> > Could someone point me to the correct URL where the usage of the LANG
> > variable is defined as only two characters?
>
> The system for assigning lang values used by Sword files was essentially
> designed by me and is more or less what we adopted for OSIS.  (There are
> a few differences that will be fixed, but they only affect minority
> languages that none of you can speak or read.)  I need to do a write-up
> for assigning them, but basically the system is this:
>
> Any language should be represented by a single unique code.
> Its format should match that described by IETF RFC 3066. (So ISO 639-1 &
> ISO 639-2 codes and IANA registered codes are all valid.  Plus you can
> use SIL Ethnologue codes or LINGUIST List codes if you preface them with
> "x-SIL-" and "x-LINGUIST-" respectively.  Also, country codes are
> permissible, when they are applicable.)
> Since these code systems have considerable overlap, you should choose
> the shortest code that describes the language with the greatest
> specificity.  (Hence, Ancient Greek would be "grc", not "el", which is
> Modern Greek.  And there might be instances where a group of languages
> are covered by an ISO 639-2 code, in which case a more specific SIL code
> would probably be better.)
> Country codes are almost never necessary.  The only instances where they
> are relevant are between English spoken in the US, UK, etc. and between
> Chinese written in Taiwan and mainland China.
>
> > From my reading on
> >
> > http://www.opengroup.org/onlinepubs/007904975/basedefs/xbd_chap08.html
> >
> > I can only conclude that nl_NL.UTF-8 is a valid variable and should be
> > handled by sword in such a way that it would point me to the Dutch names
> > as would nl_NL or just nl.
>
> It's a valid variable according to some other standard, but not IETF RFC
> 3066.  The format described in the page you cite is specific to POSIX
> locales.  Our language codes are used in all books and on non-POSIX
> systems.
>
> I think we're in agreement that Sword should convert POSIX locales to
> IETF format and then match the most similar available locale, which is
> why I put this feature request into our database the first time you
> brought it up.  But I don't know of anyone who has had time to work on
> it since then.  If anyone has the time, patches are always welcome.
>
> --Chris
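
For anyone who wants to pick up the feature request Chris describes
(convert a POSIX locale to IETF format, then match the most similar
available locale), the idea could be sketched roughly as follows. This is
not Sword code; posix_to_ietf and best_match are hypothetical names, and
the list of available locales is invented for the example.

```python
def posix_to_ietf(lang_value):
    """Turn a POSIX LANG value such as 'nl_NL.UTF-8' into 'nl-NL'.

    Hypothetical helper; returns None for the 'C'/'POSIX' locale or an
    empty value, which carry no language information.
    """
    # Drop the codeset ('.UTF-8') and any modifier ('@euro').
    base = lang_value.split(".", 1)[0].split("@", 1)[0]
    if base in ("", "C", "POSIX"):
        return None
    language, _, territory = base.partition("_")
    tag = language.lower()
    if territory:
        tag += "-" + territory.upper()
    return tag


def best_match(tag, available):
    """Return the closest entry in 'available' for an IETF-style tag.

    Tries the full tag first, then progressively shorter forms
    ('nl-NL', then 'nl'). Returns None if nothing matches.
    """
    lookup = {name.lower(): name for name in available}
    parts = tag.split("-")
    while parts:
        candidate = "-".join(parts).lower()
        if candidate in lookup:
            return lookup[candidate]
        parts.pop()  # drop the most specific subtag and retry
    return None


# Example: the locale list here is made up for illustration.
available_locales = ["en", "de", "nl"]
tag = posix_to_ietf("nl_NL.UTF-8")
if tag is not None:
    print(best_match(tag, available_locales))
```

Falling back from the longer tag to the shorter one matches Hugo's
expectation that nl_NL.UTF-8, nl_NL and nl all end up at the Dutch names.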