[sword-devel] AmTract Encoding

Thu Oct 25 16:05:10 MST 2007

Eeli Kaikkonen wrote:
> I protest against this very strongly. All I get with Sword library
> encoding system is problems after problems.

That's rather an overstatement--or else simply unappreciative of the
work we've done to simplify encoding matters. We formerly supported
miscellaneous national and font-specific encodings, but many years ago
we decided to support only two encodings and converted the relatively
few modules using other encodings to UTF-8.

Everything else was left in its existing encoding, which was ISO-8859-1
in most cases and Codepage 1252 in some others. Since (in terms of
printed characters) Codepage 1252 is a superset of ISO-8859-1, which is
itself a superset of ISO/IEC 8859-1, handling everything as Codepage
1252 works fine.
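
As a quick check of that claim, here is a minimal sketch using Python's
built-in codecs (not anything from the Sword library; the sample string
is made up for illustration). It assumes "printed characters" means
ASCII 0x20-0x7E plus the upper half 0xA0-0xFF:

```python
# Byte values that carry printed characters in ISO-8859-1:
# ASCII 0x20-0x7E plus the upper half 0xA0-0xFF.
printed = bytes(list(range(0x20, 0x7F)) + list(range(0xA0, 0x100)))

# Decode the same bytes under both encodings and compare.
as_latin1 = printed.decode("iso-8859-1")
as_cp1252 = printed.decode("cp1252")
same_for_printed = as_latin1 == as_cp1252

# Hypothetical module text, stored as ISO-8859-1 but read as CP1252.
sample = "Grüß Gott, señor".encode("iso-8859-1")
sample_matches = sample.decode("cp1252") == sample.decode("iso-8859-1")
```

If both flags come out true, text that was stored as ISO-8859-1 reads
back unchanged when treated as Codepage 1252.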

> Why use "latin1" which is
> not latin1?

I think you're being a little too pedantic here, so I'm going to indulge 
your pedantry with some of my own.

"Latin-1" is not "latin1".
"Latin-1" refers to ISO/IEC 8859-1.
"latin1" refers to ISO-8859-1.

ISO-8859-1 and Windows CP1252 are BOTH extensions of ISO/IEC 8859-1. 
Thus, if ISO-8859-1 is Latin-1, so is Windows CP1252--regardless of what 
you may have thought before, read elsewhere, or formed on the basis of a 
belief that all things from Microsoft are inherently bad.

(ISO-8859-1 adds control codes to ISO/IEC 8859-1. Windows CP1252 adds
control codes and a few printed characters to ISO/IEC 8859-1.)
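
The difference is confined to bytes 0x80-0x9F. A rough sketch, again
with Python's built-in codecs; note that Python's cp1252 codec treats
five of those bytes as unassigned, which is a property of that codec
and an assumption as far as any other implementation goes:

```python
import unicodedata

# The only byte range where the two encodings disagree.
c1_range = range(0x80, 0xA0)

# ISO-8859-1: every byte in this range decodes to a control code
# (Unicode general category "Cc").
latin1_all_controls = all(
    unicodedata.category(bytes([b]).decode("iso-8859-1")) == "Cc"
    for b in c1_range
)

# CP1252: most of these bytes decode to printed characters instead;
# Python's codec raises an error for the ones it leaves unassigned.
cp1252_printed = {}
cp1252_unassigned = []
for b in c1_range:
    try:
        cp1252_printed[b] = bytes([b]).decode("cp1252")
    except UnicodeDecodeError:
        cp1252_unassigned.append(b)
```

Here cp1252_printed maps each byte to its printed character (0x80 is
the euro sign, for instance), and cp1252_unassigned collects the bytes
that Python's codec refuses to decode.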

> Why not use real latin1? Why use latin1 at all? It is easy
> enough nowadays to change all modules to use utf8.  Real Latin1 is well
> supported in programming libraries, I understand if it is used, but I
> don't (again) understand some Microsoft extension which are not
> universal.

Your suggestions are good, but would have been more useful about 10
years ago. Right now, we support CP1252 for backwards compatibility, so
suggestions about whether or how we should use it are completely
unnecessary.

--Chris