[sword-devel] language/locale codes

DM Smith dmsmith at crosswire.org
Thu Nov 12 04:00:25 MST 2009

On Nov 11, 2009, at 11:05 PM, Chris Little wrote:

> Other things to consider:
> 1) If a module in a new language is released, should that module be tagged as beta or tagged as requiring the next release of Sword until the next release of Sword comes?
> E.g., let's suppose Sword 1.6.1 is the current version and includes a unified facility for getting language names from language codes, but ships with only those codes that are in use at the time 1.6.1 is released. (Also suppose there is no facility for updating the code lists.) If we release a new Bible in Mongolian, is it more desirable to hold off release of the module (put it in beta) until 1.6.2 is released or to release the module but have it be listed by its code (or as "Undetermined")?

The problem really revolves around whether we have a default list that is pruned to what is currently available (100 or so entries) or is complete (7500+ entries). If it is pruned, then it needs frequent releases. If it is complete, then only "minority" languages might be affected. That is, languages that the majority of SWORD users will not know, and whose speakers who have a computer and use a SWORD app are mostly missionaries.

The drawback to having a complete list is the performance hit it entails.

I think there should be a hierarchical approach to looking up language names. The master list should be complete. The locale specific lists should be pruned.
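To make the hierarchy concrete, here is a minimal sketch of such a lookup. The table contents and the names MASTER_NAMES, LOCALE_NAMES and language_name are assumptions for illustration only, not part of the SWORD API; a real master list would be loaded from a resource file rather than hard-coded.

```python
# Hypothetical data: a complete master list (English) and a pruned,
# locale-specific list. Real lists would be read from resource files.
MASTER_NAMES = {
    "cy": "Welsh",
    "et": "Estonian",
    "mn": "Mongolian",
    "hbo": "Ancient Hebrew",
}
LOCALE_NAMES = {
    "de": {"cy": "Walisisch", "et": "Estnisch"},  # pruned: no entry for "mn"
}


def language_name(code, locale):
    """Return a display name for `code`: locale list first, then master list."""
    localized = LOCALE_NAMES.get(locale, {})
    if code in localized:
        return localized[code]
    # Fall back to the complete master list; as a last resort show the raw code.
    return MASTER_NAMES.get(code, code)
```

With this arrangement a new Mongolian module would show up as "Mongolian" under a German locale even before the German list is updated, instead of as a bare code.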

If it is a pruned list then the problem is how the list and new modules are released.
I think there are several release schedules involved here:
1) The release of the SWORD engine.
2) The release of frontends. 
3) The availability of a frontend for a particular version of an OS (i.e. Linux distributions).
4) Users upgrading a frontend.

I think that module availability should not be tied/synchronized to these directly. The MinimumVersion field in the conf *should* be sufficient.
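As a rough sketch of what relying on MinimumVersion would mean for a frontend: compare the engine's version with the value from the module's conf and only offer the module if the engine is new enough. The function names below are hypothetical, and the sketch assumes purely numeric dotted version strings.

```python
def parse_version(text):
    """Turn a dotted version string such as '1.6.1' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def module_is_usable(engine_version, minimum_version):
    """True if the engine is at least the version the module asks for."""
    # Tuples compare element by element, so (1, 10, 0) > (1, 6, 1),
    # which a plain string comparison would get wrong.
    return parse_version(engine_version) >= parse_version(minimum_version)
```

For example, module_is_usable("1.6.1", "1.6.2") is False, so a frontend built on 1.6.1 could hide or flag that module rather than have the release of the module wait on the engine.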

I think that resources (e.g. v11n, locales, language lists) should be versioned and released independently of the engine, as long as their structure does not change, and that the SWORD engine should read them.

> 2) Should ancient, historic, & extinct languages ever be listed in their own language? It makes complete sense to identify cy (Welsh) as Cymraeg or et (Estonian) as Eesti because the users of content in those languages are most likely familiar with those names.
> Users of ancient languages like (ancient) Greek & Hebrew, Latin, Coptic, Gothic, Sumerian, Akkadian, & Hittite (those that I can think of that we have content in or soon shall) may not know the native names of those languages.
> In the last year, I've studied Hittite, Luwian, Tocharian, Oscan, and Umbrian, and I'd say I'm at a fairly average level in each of them, among those who have studied them. I know their names in English and German--because those are typical languages of works discussing these languages. But I don't have the least clue what any of these languages were called by their speakers. For languages like Gothic, Oscan, and Umbrian, I wouldn't be surprised to hear of competent users of these languages who cannot even read them in their native script, since most work is done with transliterated transcriptions.
> So does it make sense to never use the native localized versions of languages marked as ancient, historic, or extinct (by ISO 639-3)? The only issue will be he (Hebrew), used for Biblical Hebrew, but we should really update WLC and anything else referencing Biblical Hebrew to use the ISO 639-3 code hbo (Ancient Hebrew) instead.

I've struggled with this one from a user's perspective. Peter has from a translator's perspective (I feel his pain).

I would suggest that the master list be in English and perhaps also in French (only because French is available). I think there should also be a separate, pruned list of native language names (localized-mul.txt).

Then the frontend would know, for a given code:
1) Default -- That there is a default version in English (and/or French) 
2) Native --  If there is a native version.
3) Locale -- If there is a localized version.

The application could show it in any number of ways: (let --> indicate fallback)
a) Show native language lists 
	 Native --> Locale --> Default
b) Show localized language lists defaulting to Native 
	 Locale --> Native --> Default
c) Show localized and native 
	 Locale (Native) --> Locale --> Default (Native) --> Default
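The three fallback chains above can be sketched as follows. The tables DEFAULT, NATIVE and LOCALE and all function names are assumptions made up for this illustration; the locale table stands in for a hypothetical German locale.

```python
# Hypothetical name tables keyed by language code.
DEFAULT = {"cy": "Welsh", "et": "Estonian", "hit": "Hittite"}  # master list
NATIVE = {"cy": "Cymraeg", "et": "Eesti"}                      # pruned
LOCALE = {"cy": "Walisisch"}                                   # pruned, German


def first_found(code, *sources):
    """Return the name from the first table that has `code`, else the code."""
    for source in sources:
        if code in source:
            return source[code]
    return code


def show_native(code):
    """(a) Native --> Locale --> Default"""
    return first_found(code, NATIVE, LOCALE, DEFAULT)


def show_localized(code):
    """(b) Locale --> Native --> Default"""
    return first_found(code, LOCALE, NATIVE, DEFAULT)


def show_both(code):
    """(c) Locale (Native) --> Locale --> Default (Native) --> Default"""
    base = first_found(code, LOCALE, DEFAULT)
    native = NATIVE.get(code)
    # Append the native name in parentheses only when one is known.
    return f"{base} ({native})" if native else base
```

Here show_both("cy") gives "Walisisch (Cymraeg)", show_both("et") gives "Estonian (Eesti)", and show_both("hit") gives plain "Hittite" because no native name is listed for it.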

An issue in showing native for ancient, historic and extinct languages is the need for a font that displays all native language names well. It'd be really ugly for boxes to show up.

In Him,
