[bt-devel] RE: UTF-8 and new module classes

Chris Little bt-devel@crosswire.org
Thu, 24 May 2001 14:48:01 -0700


There shouldn't be a need for such a thing, as Joachim's work shows.  The
reason Troy did the UnicodeRTF filter was that RTF has no global encoding
mechanism for UTF-8 the way HTML renderers do.  You have to actually
construct an RTF control word like "\u20229?" for the character with decimal
value 20229 (U+4F05), which requires parsing the UTF-8 bytes manually and
determining the code point of each variable-length character.
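
To illustrate the kind of conversion described above, here is a rough sketch
in Python.  It is not the UnicodeRTF filter's actual code; the function names
decode_utf8 and to_rtf are made up for this example.  It assumes well-formed
UTF-8 input, writes the RTF escape as \uN? with N as a signed 16-bit decimal
number, and splits characters above U+FFFF into a surrogate pair.

```python
def decode_utf8(data: bytes):
    """Yield code points from UTF-8 bytes by hand (assumes well-formed input)."""
    i = 0
    while i < len(data):
        lead = data[i]
        if lead < 0x80:                 # 0xxxxxxx: single-byte (ASCII)
            cp, extra = lead, 0
        elif lead >> 5 == 0b110:        # 110xxxxx: two-byte sequence
            cp, extra = lead & 0x1F, 1
        elif lead >> 4 == 0b1110:       # 1110xxxx: three-byte sequence
            cp, extra = lead & 0x0F, 2
        elif lead >> 3 == 0b11110:      # 11110xxx: four-byte sequence
            cp, extra = lead & 0x07, 3
        else:
            raise ValueError("unexpected byte 0x%02X at offset %d" % (lead, i))
        # Each continuation byte (10xxxxxx) contributes six more bits.
        for cont in data[i + 1:i + 1 + extra]:
            cp = (cp << 6) | (cont & 0x3F)
        i += 1 + extra
        yield cp


def to_rtf(data: bytes) -> str:
    """Turn UTF-8 bytes into RTF text using \\uN? escapes (hypothetical helper)."""
    out = []
    for cp in decode_utf8(data):
        if cp < 0x80:
            ch = chr(cp)
            # Backslash and braces are special in RTF and need escaping.
            out.append("\\" + ch if ch in "\\{}" else ch)
            continue
        if cp < 0x10000:
            units = [cp]
        else:
            # Above U+FFFF: emit a UTF-16 surrogate pair, one \u per unit.
            v = cp - 0x10000
            units = [0xD800 + (v >> 10), 0xDC00 + (v & 0x3FF)]
        for u in units:
            if u > 32767:               # \u takes a signed 16-bit value
                u -= 65536
            out.append("\\u%d?" % u)    # "?" is the fallback for old readers
    return "".join(out)
```

For example, to_rtf(b"\xe4\xbc\x85") decodes the three bytes to 20229 and
returns the string \u20229? as in the tag mentioned above.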

I'd like to note that in the ChiGU-UTF8 module, there are spaces between all
characters.  This was done to allow line breaking in HTML renderers, but in
the next version of the module (expect lots of UTF-8 modules this weekend)
the spaces will be removed, since they aren't properly part of the text.

--Chris

> -----Original Message-----
> From: owner-bt-devel@crosswire.org
> [mailto:owner-bt-devel@crosswire.org]On Behalf Of Martin Gruner
> Sent: Thursday, May 24, 2001 10:52 AM
> To: bt-devel@crosswire.org
> Subject: Re: [bt-devel] RE: UTF-8 and new module classes
>
>
> Joachim,
>
> did it work without a UnicodeHTML filter?
>
> Martin
>