[sword-devel] 3-letter language character codes
dmsmith at crosswire.org
Mon Nov 9 11:32:09 MST 2009
On 11/09/2009 11:51 AM, Karl Kleinpaste wrote:
> DM Smith<dmsmith at crosswire.org> writes:
>> ISO-639-3 is a changing set of codes.
>> These all changed on 2009-01-16.
> What is the point of "standardized" abbreviations if the "standard" is
> not fixed? "ckw" is replaced with "cak", "tzz" with "tzo"? For whose
> benefit is that, other than as a make-work issue for people like us?
I don't know all the history, and what I know may be a bit faulty.
There are about 7500 languages. The beginnings of the ISO-639 were in
the Ethnologue, started in 1950. ISO-639-1 was adopted in 1988.
ISO-639-2 was adopted in 1998 and covered about 400 languages. ISO-639-3
was given to SIL in 2002 and its first edition was published in
2007. So only a few years ago, the list was quite small. At that time,
some of our modules had Ethnologue codes of the form x-aaa or x-yyy-aaa.
At this point ISO-639-3 covers all the languages that have 2- and
3-letter codes. It is actively maintained and updates happen at least
once a year.
Much of the effort to define languages revolves around literacy and
Bible translation. It is widely held that the return of Christ is
predicated on the gospel being preached to every tongue and there is an
effort to get the Bible into every spoken language. Many languages have
no alphabet. My daughter and her husband spent the summer finalizing the
alphabets for 3 closely related languages. At this point they, and the
team that they were on, believe that these are 3 distinct languages and
not merely dialects of each other. As such, they would have three
different codes and language names. If these were later found to be
merely dialects of one language, the 3 alphabets might be merged into
one and the 3 codes and their names would be replaced with one code and
one name.
If you look at the reasons for retirement, many of them were 'M', that
is, merging several codes into one code.
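For code that has to keep working across such retirements, a small
lookup from retired code to replacement is one way to cope. The
following is only a sketch: the class and method names are made up for
illustration, and the table holds just the two pairs Karl mentioned. A
real table would have to be built from the published retirement list.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of SWORD or JSword.
class RetiredLanguageCodes {
    // Retired ISO-639-3 code -> code that replaced it.
    // Only the two pairs named in this thread; not a complete list.
    private static final Map<String, String> REPLACED_BY = new HashMap<>();
    static {
        REPLACED_BY.put("ckw", "cak");
        REPLACED_BY.put("tzz", "tzo");
    }

    // True if the code appears in the retirement table.
    static boolean isRetired(String code) {
        return REPLACED_BY.containsKey(code);
    }

    // Follow replacements until a code that is not retired is reached.
    // The loop is bounded by the table size so a bad table cannot spin forever.
    static String resolve(String code) {
        String current = code;
        for (int i = 0; i < REPLACED_BY.size() && REPLACED_BY.containsKey(current); i++) {
            current = REPLACED_BY.get(current);
        }
        return current;
    }

    public static void main(String[] args) {
        System.out.println(resolve("ckw")); // retired, prints its replacement
        System.out.println(resolve("eng")); // not in the table, printed unchanged
    }
}
```

Codes that are not in the table pass through unchanged, so the lookup
can be applied to every module's language code without special cases.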
On a similar note, the two-letter codes are not stable either. Hebrew
used to have the code 'iw'; now it has the code 'he'. Likewise,
Indonesian used to have the code 'in', but now it is 'id'. Now with
the latest CLDR, 'in' is an alias for 'id'.
These two have bitten me, as Java silently transforms the current code
to the obsolete one. Hebrew, 'iw', bit me a few years back. Indonesian,
'in', bit me last week when Tonny supplied an Indonesian translation for
JSword. We had to name the resource files with the obsolete code to get
it to work.
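For anyone hitting the same thing, here is a minimal sketch of one way
to handle both spellings. It is not what JSword does; the class name,
method names and the "Msg" base name are invented for illustration. It
maps between the current and obsolete two-letter codes and lists both
resource file names so a lookup can try either one. Yiddish, 'ji' and
'yi', is included because Java treats it the same way.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of JSword or the JDK.
class LegacyLanguageCodes {
    // Map an obsolete two-letter code to its current form.
    static String toCurrent(String code) {
        String lower = code.toLowerCase(Locale.ROOT);
        switch (lower) {
            case "iw": return "he"; // Hebrew
            case "in": return "id"; // Indonesian
            case "ji": return "yi"; // Yiddish
            default:   return lower;
        }
    }

    // Map a current two-letter code to the obsolete form Java may use.
    static String toObsolete(String code) {
        String lower = code.toLowerCase(Locale.ROOT);
        switch (lower) {
            case "he": return "iw";
            case "id": return "in";
            case "yi": return "ji";
            default:   return lower;
        }
    }

    // Resource file names to try: the current code first, then the
    // obsolete one if the language has one.
    static List<String> resourceNames(String baseName, String code) {
        String current = toCurrent(code);
        String obsolete = toObsolete(current);
        List<String> names = new ArrayList<>();
        names.add(baseName + "_" + current + ".properties");
        if (!obsolete.equals(current)) {
            names.add(baseName + "_" + obsolete + ".properties");
        }
        return names;
    }

    public static void main(String[] args) {
        // Prints [Msg_id.properties, Msg_in.properties]
        System.out.println(resourceNames("Msg", "id"));
    }
}
```

Which of the two names a given JVM actually asks for depends on the
Java version, so trying both is the cautious choice.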