[bt-devel] RE: UTF-8 and new module classes

Martin Gruner bt-devel@crosswire.org
Thu, 24 May 2001 11:24:51 +0200


> Eventually, I would like to get any modules with characters that conflict
> with UTF-8 (any characters in the range 0x80 to 0xFF) into UTF-8 so we can
> do away with the Encoding value also and just accept everything as being
> UTF-8.

I didn't get this. Please explain.

> I should also retract my previous statement that we can get rid of the Font
> value because it's just a better idea to have numerous smaller fonts with
> the correct range for a module than to have a single huge font able to
> display all Unicode glyphs.

I favor moving from the font= tag to an encoding= tag. That way we would not 
need huge fonts, but users would still be free to choose their own font. 
For example, encoding=iso8859-7 would mark a module as Greek text. A frontend 
could then either display the text directly with a single-byte ISO 8859-7 
font or map it into Unicode for other purposes.
IMO using standards is always a good way to go.
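
To make the "map it into Unicode" option concrete, here is a rough sketch in
Python (not Sword code). The sample bytes are made up for illustration, and
the codec is the one that ships with Python's standard library:

```python
# Sketch: map ISO 8859-7 (single-byte Greek) text to Unicode and on to UTF-8.
# The sample bytes are invented for this example, not taken from a module.
raw = bytes([0xC1, 0xE2, 0xE3])  # capital alpha, small beta, small gamma

text = raw.decode("iso8859_7")   # each byte becomes one Unicode code point
utf8 = text.encode("utf-8")      # each Greek letter takes two bytes in UTF-8

print([hex(ord(ch)) for ch in text])
print(len(raw), len(utf8))
```

The same bytes can instead be handed unchanged to a single-byte ISO 8859-7
font, which is the point of tagging the encoding rather than the font.
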
We could implement some mapping filters in SWORD that map from font-specific 
ASCII encodings to the correct language-specific encodings (like a 
bstgreek2iso8859-7 filter) to also support frontends favoring the font= 
solution.
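
A filter of that kind could look roughly like the Python sketch below. Note
the assumptions: the table is a made-up stand-in, NOT the real BSTGreek font
layout, and the function name is hypothetical. The real filter would live in
the library and use the actual layout of the font.

```python
# Hypothetical font-specific layout: Latin keys standing in for Greek letters.
# This is NOT the real BSTGreek table; the actual layout must be looked up.
FONT_TO_UNICODE = {
    "a": "\u03b1",  # alpha
    "b": "\u03b2",  # beta
    "g": "\u03b3",  # gamma
    "l": "\u03bb",  # lambda
    "o": "\u03bf",  # omicron
    "s": "\u03c3",  # sigma
}


def font_to_iso8859_7(text: str) -> bytes:
    """Remap font-specific characters, then encode as ISO 8859-7 bytes.

    Characters missing from the table (spaces, punctuation) pass through
    unchanged.
    """
    remapped = "".join(FONT_TO_UNICODE.get(ch, ch) for ch in text)
    return remapped.encode("iso8859_7")


encoded = font_to_iso8859_7("logos")
print(encoded.hex())
```

A plain per-character table like this cannot choose final sigma at the end of
a word, so a real filter would need a little more context than one character
at a time.
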

Some good links I'd recommend:
http://czyborra.com/
http://czyborra.com/charsets/iso8859.html
http://czyborra.com/charsets/cyrillic.html

Martin