[sword-devel] 3-letter language character codes

DM Smith dmsmith at crosswire.org
Mon Nov 9 11:32:09 MST 2009

On 11/09/2009 11:51 AM, Karl Kleinpaste wrote:
> DM Smith<dmsmith at crosswire.org>  writes:
>> ISO-639-3 is a changing set of codes.
> ...
>> These all changed on 2009-01-16.
> What is the point of "standardized" abbreviations if the "standard" is
> not fixed?  "ckw" is replaced with "cak", "tzz" with "tzo"?  For whose
> benefit is that, other than as a make-work issue for people like us?
I don't know all the history, and what I know may be a bit faulty.

There are about 7500 languages. The beginnings of ISO-639 were in 
the Ethnologue, started in 1950. ISO-639-1 was adopted in 1988. 
ISO-639-2 was adopted in 1998 and covered about 400 languages. ISO-639-3 
was given to SIL in 2002 and the first adoption of it was published in 
2007. So only a few years ago, the list was quite small. At that time, 
some of our modules had Ethnologue codes of the form x-aaa or x-yyy-aaa.

At this point ISO-639-3 encompasses all 2 and 3 letter codes. It is 
actively maintained and updates happen at least once a year.

Much of the effort to define languages revolves around literacy and 
Bible translation. It is widely held that the return of Christ is 
predicated on the gospel being preached to every tongue and there is an 
effort to get the Bible into every spoken language. Many languages have 
no alphabet. My daughter and her husband spent the summer finalizing the 
alphabets for 3 closely related languages. At this point they, and the 
team that they were on, believe that these are 3 distinct languages and 
not merely dialects of each other. As such, they would have three 
different codes and language names. If these were later found to 
differ only as dialects, the 3 alphabets might be merged into one 
and the 3 codes and their names would be replaced with a single code 
and name.

If you look at the reasons for retirement, many of them were 'M', that 
is, merging several codes into one code.
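
For code that has to cope with such retirements, a small lookup from 
retired code to its replacement is one way to go. This is only a sketch: 
the class and method names are made up for illustration, and the table 
holds just the two replacements mentioned in this thread. A real table 
would have to be built from the published ISO-639-3 retirement list.

```java
import java.util.Map;

// Hypothetical helper, not part of SWORD or JSword.
final class RetiredCodes {
    // Retired ISO-639-3 code -> current code. Only the two examples
    // quoted in this thread; everything else passes through unchanged.
    private static final Map<String, String> REPLACEMENTS = Map.of(
            "ckw", "cak",
            "tzz", "tzo");

    // Returns the current code for a possibly retired one.
    static String current(String code) {
        return REPLACEMENTS.getOrDefault(code, code);
    }

    public static void main(String[] args) {
        System.out.println(current("ckw"));
        System.out.println(current("tzz"));
        System.out.println(current("cak"));
    }
}
```

Codes that are not in the table are returned as given, so the lookup is 
safe to apply to every module's language code.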

On a similar note, the two letter codes are not stable either. Hebrew 
used to have the code 'iw'; now it has the code 'he'. Likewise, 
Indonesian used to have the code 'in', but now it is 'id'. With 
the latest CLDR, 'in' is an alias for 'id'.

These two have bitten me, as Java silently transforms the current code 
to the obsolete one. 'iw', Hebrew, bit me a few years back. Indonesian, 
'in', bit me last week, when Tonny supplied an Indonesian translation 
for JSword. We had to name the resource files with the obsolete code to 
get it to work.
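
To see what a given JVM does, one can ask java.util.Locale which 
language code it reports and build the resource file name from that 
rather than from the code that was typed in. This is a rough sketch: the 
class name, the method name and the "Messages" base name are made up for 
illustration and are not JSword's actual names. What gets printed 
depends on the Java version and its settings, so no output is shown.

```java
import java.util.Locale;

// Hypothetical helper, not part of JSword.
final class LocaleNames {
    // Builds a properties file name using the language code that this
    // JVM reports for the given tag, which may be an obsolete code
    // such as "iw" or "in" instead of "he" or "id".
    static String bundleFileName(String baseName, String languageTag) {
        String reported = Locale.forLanguageTag(languageTag).getLanguage();
        return baseName + "_" + reported + ".properties";
    }

    public static void main(String[] args) {
        // Compare the requested code with the one the JVM uses.
        for (String tag : new String[] {"he", "id", "en"}) {
            System.out.println(tag + " -> " + bundleFileName("Messages", tag));
        }
    }
}
```

Naming the files after whatever the JVM reports keeps the lookup and the 
file names in agreement, whichever code the runtime prefers.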

In Him,
