[sword-devel] Adding abbreviated names to the module conf file (was Re: isalnum(3) for i18n)
chrislit at crosswire.org
Wed Dec 17 22:15:04 MST 2008
DM Smith wrote:
> I would like to lobby for a separator between the language code and the
> field name. I don't much care whether it is a prefix or suffix. While I
> understand that you are suggesting that we don't have a deAbbr or
> xxAbbr, I could see that it might be added some time in the future and
> with 3-letter codes and with differences in script (e.g.
> Traditional/Simplified Chinese), a separator makes it much easier to
> code for today.
Okay. While I find the arguments in favor of localizing all string
fields in .confs entirely unconvincing, it doesn't drastically harm
anything to permit module makers to add more data to their .conf files.
So I think we can go with the suggestion of suffixing localized string
values in .confs with _ plus a locale (which would generally mean a BCP
47 value, but we may have to alter that based on the constraints on
attribute names in .confs). If we add Abbr, Author, Translator, &
Publisher fields, the complete list of localizable attributes would be:
Abbr, Author, Translator, Publisher,
About, Description, History_x.x,
Copyright, CopyrightHolder, CopyrightNotes, CopyrightContactName,
CopyrightContactNotes, CopyrightContactAddress, ShortPromo,
In the absence of a localized form, the un-suffixed version will always
be the default. In general, the title-like attributes will follow
the actual title of the book, generally in the language of the text.
Names of people/organizations will generally be language-neutral. The
rest will generally be language-neutral or in English.
That said, nothing will *work* unless someone writes the code to take
advantage of it.
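
For what it's worth, the lookup rule above (try the `_` + locale suffixed attribute, otherwise fall back to the un-suffixed one) could be sketched roughly as follows. This assumes the .conf has already been parsed into a plain dict; `localized_value` is a hypothetical helper, not an existing SWORD function, and the sample entries are made up for illustration.

```python
def localized_value(conf, key, locale=None):
    """Return the localized attribute if present, else the un-suffixed default.

    conf   -- dict of attribute name -> string value (assumed already parsed)
    key    -- un-suffixed attribute name, e.g. "Description"
    locale -- locale suffix such as "de"; None means "no preference"
    """
    if locale is not None:
        suffixed = f"{key}_{locale}"
        if suffixed in conf:
            return conf[suffixed]
    # No localized form available: the un-suffixed version is the default.
    return conf.get(key)


# Made-up sample data, for illustration only.
conf = {
    "Description": "Example module",
    "Description_de": "Beispielmodul",
}

print(localized_value(conf, "Description", "de"))
print(localized_value(conf, "Description", "fr"))
```

A frontend that wanted something smarter (say, trying a bare language code when a longer tag is not found) would need to layer that on top; nothing in the proposal above requires it.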
> I like that the default should be in the language of the module. I'm
> assuming as well in the encoding of the module (e.g. UTF-8 for UTF-8
Yes, I mentioned the latter point in an aside. We already have the
standard of interpreting fields in a module with Encoding=UTF-8 as
UTF-8--otherwise they should be interpreted as cp1252. No need to change
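
As a rough illustration of the existing encoding rule (fields are UTF-8 when the module says Encoding=UTF-8, otherwise cp1252), a sketch might look like this. `decode_conf_value` is a hypothetical name, and it assumes the raw bytes of the field and the value of the Encoding attribute are already in hand.

```python
def decode_conf_value(raw, encoding_field=None):
    """Decode the raw bytes of a .conf field according to the module's Encoding.

    raw            -- bytes as read from the .conf file
    encoding_field -- value of the module's Encoding attribute, or None if absent
    """
    # Encoding=UTF-8 means UTF-8; anything else is interpreted as cp1252.
    codec = "utf-8" if encoding_field == "UTF-8" else "cp1252"
    return raw.decode(codec)


print(decode_conf_value("Gr\u00fcnewald".encode("utf-8"), "UTF-8"))
print(decode_conf_value(b"Gr\xfcnewald"))
```

Note that Python's cp1252 codec raises an error on the few byte values that cp1252 leaves undefined, so real code would need to decide how to handle those.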
> WRT the length of Abbr, I'd like to see it be much shorter than 16 or
> that 16 be the upper limit w/ a much smaller number being the
> recommended maximum, say 6?, with the knowledge that anything longer
> than 6 (or whatever is the recommended max) may be truncated by some
> frontends (e.g. MacSword and BibleDesktop have dropdowns for a parallel
> view which have a severe limit of 4. I imagine that small devices, such
> as phones and PDAs would also have a real estate problem.)
So, for reference, the width of 16 characters would be:
Six would be:
I suspected there would be disagreement with my suggested number, but I
had assumed that it would seem too low. So... some of my reasoning:
Many Bibles will include a year, which eats up 4 characters in itself.
Bibles with standard abbreviations aren't a big issue (WEB, NIV, NASB,
NRSV, etc.) but many others incorporate a translator/place/organization
name--which can be longish (Elberfelder, Webster, Grünewald, Rotherham,
Delitzsch, Tischendorf, Cornilescu, etc.)
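
To make the arithmetic concrete, here is a small sketch that checks which of the names above would still fit under a given limit once a 4-character year is appended. It assumes the year is appended directly with no separator and that length is counted in code points; `names_that_fit` is just an illustrative helper.

```python
# Translator/place/organization names taken from the list above.
NAMES = [
    "Elberfelder", "Webster", "Gr\u00fcnewald", "Rotherham",
    "Delitzsch", "Tischendorf", "Cornilescu",
]

YEAR_WIDTH = 4  # a four-digit year, appended without a separator (assumption)


def names_that_fit(limit, names=NAMES, year_width=YEAR_WIDTH):
    """Return the names whose length plus a year stays within the limit."""
    return [n for n in names if len(n) + year_width <= limit]


for limit in (16, 12, 6):
    print(limit, names_that_fit(limit))
```

With a limit of 16 every name in the list fits, with 12 only Webster does, and with 6 none do.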
So, we could make the limit lower, but I worry that we would limit the
meaningfulness of these strings. Maybe we could cut it down to 12?:
I18n isn't much of a concern here. Western European languages have the
highest sign-to-phoneme ratio that I can think of. And non-alphabetic
scripts will generally be far more economical in terms of
codepoints--though this will often be lost due to physically wider