[sword-devel] character encoding conversion

Chris Little sword-devel@crosswire.org
Tue, 12 Jun 2001 10:44:40 -0700

Yeah, I had the same thought of using a hash table, but decided against
it because I had erroneously thought it would be larger in memory than a
giant switch.  (Don't ask me why, it was late.)  I'll try
re-implementing as an STL map since that's what I'm familiar with.
Other ideas are still welcome.
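
To make the STL map idea concrete, here is a minimal sketch of a lookup from a UCS-2 code unit to a Shift-JIS double-byte code. The names CodeMap, makeSampleUcs2ToSjis and ucs2ToSjis are hypothetical, and the three entries are only a sample; a real table would carry the full mapping.

```cpp
#include <map>

// Hypothetical type: UCS-2 code unit -> Shift-JIS double-byte code.
typedef std::map<unsigned short, unsigned short> CodeMap;

// Build a tiny illustrative table (three hiragana only).
// A real converter would fill this from the complete mapping data.
CodeMap makeSampleUcs2ToSjis() {
    CodeMap m;
    m[0x3042] = 0x82A0;  // HIRAGANA LETTER A
    m[0x3044] = 0x82A2;  // HIRAGANA LETTER I
    m[0x3046] = 0x82A4;  // HIRAGANA LETTER U
    return m;
}

// Look up one code unit; returns `fallback` when the table has no entry.
unsigned short ucs2ToSjis(const CodeMap& table,
                          unsigned short ucs2,
                          unsigned short fallback) {
    CodeMap::const_iterator it = table.find(ucs2);
    return it == table.end() ? fallback : it->second;
}
```

Calling ucs2ToSjis(table, 0x3042, 0x3F) on the sample table gives 0x82A0, and an unmapped input such as 0x0041 gives back the fallback. Note that std::map is a balanced tree rather than a hash table, so lookups are O(log n); a sorted static array with binary search would give the same lookup cost with less memory overhead.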

I looked at the various Unicode libraries available, and those I saw
were either inadequate or too large to include for our minimal needs.
IBM's ICU looked very nice, but it's large and I don't really want to
worry about adding IBM Public License materials to the project.  If we
write it ourselves, we can license under our own terms.  We can also be
assured that our code will do exactly what WE need it to do, rather than
perhaps a more general or less efficient function.  I rewrote our Roman
numeral functions for the same reasons (license & specificity to our

Besides that, we're not going to maintain the tables ourselves.  We'll
use the tables from Unicode, Inc., which they state are very stable.
Once the basic mechanism is set up, doing classes for all the
conversions they support will be a piece of cake.
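
As a rough sketch of what such a basic mechanism might look like, the class below loads a mapping table at construction and answers lookups in both directions. It assumes the table arrives as text lines of the form "0x82A0<TAB>0x3042<TAB># comment" (legacy code first, Unicode second, '#' starting a comment); that layout, the class name TableConverter, and its methods are assumptions for illustration, not an existing SWORD interface.

```cpp
#include <cstdlib>
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Hypothetical table-driven converter: one instance per legacy encoding.
class TableConverter {
public:
    // Assumed input: "0xLLLL<TAB>0xUUUU<TAB># comment" per line.
    // Blank lines, comment lines, and entries with no Unicode value are skipped.
    explicit TableConverter(std::istream& in) {
        std::string line;
        while (std::getline(in, line)) {
            if (line.empty() || line[0] == '#') continue;
            std::istringstream fields(line);
            std::string legacy, unicode;
            if (!(fields >> legacy >> unicode)) continue;
            if (unicode[0] == '#') continue;  // second column missing
            // strtoul with base 16 accepts the optional "0x" prefix.
            unsigned long l = std::strtoul(legacy.c_str(), 0, 16);
            unsigned long u = std::strtoul(unicode.c_str(), 0, 16);
            toUnicode_[l] = u;
            fromUnicode_[u] = l;
        }
    }

    // Each lookup returns false and leaves `out` untouched when unmapped.
    bool toUnicode(unsigned long legacy, unsigned long& out) const {
        return find(toUnicode_, legacy, out);
    }
    bool fromUnicode(unsigned long unicode, unsigned long& out) const {
        return find(fromUnicode_, unicode, out);
    }

private:
    typedef std::map<unsigned long, unsigned long> Table;

    static bool find(const Table& t, unsigned long key, unsigned long& out) {
        Table::const_iterator it = t.find(key);
        if (it == t.end()) return false;
        out = it->second;
        return true;
    }

    Table toUnicode_;
    Table fromUnicode_;
};
```

With this shape, supporting another encoding would mostly mean constructing another TableConverter from a different table, for example from a std::ifstream opened on that encoding's mapping file.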


> -----Original Message-----
> From: owner-sword-devel@crosswire.org [mailto:owner-sword-
> devel@crosswire.org] On Behalf Of David Burry
> Sent: Tuesday, June 12, 2001 9:36 AM
> To: sword-devel@crosswire.org; SWORD Devel List
> Subject: Re: [sword-devel] character encoding conversion
>
> Most higher level languages have some sort of hash or associative
> built in, perhaps there are a few libraries somewhere for C to do this
> even
> more efficiently since all keys and values are the same length (two
> from UCS16 to SJIS?  I assume a simple calculation and 14k array will
> from SJIS to UCS16...  In addition, aren't there already lots of
> conversion libraries out there we could link against?  There are
> dozens of conversions to/from Unicode I don't know if we should be
> maintaining all the tables ourselves...
> Dave