[sword-devel] StripText() result not converted to UTF-8?

Joachim Ansorg nospam+sword-devel at joachim-ansorg.de
Sun Feb 18 09:40:12 MST 2007

Replying to myself.

I was wrong about some of my assumptions.

JFB is a ThML module. It contains the entity Æ.

StripText() calls the filter ThMLPlain which converts the Æ into 0xC9, 
which is the corresponding cp1252 character code.
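To illustrate what I expected to happen to that byte, here is a minimal sketch of a single-byte to UTF-8 conversion. It is not SWORD code: latin1ToUtf8 is a hypothetical helper name, and it assumes the input is ISO-8859-1, so bytes 0x80-0x9F are treated as C1 control codes and not as the cp1252 punctuation in that range.

```cpp
#include <string>

// Hypothetical helper, not part of the SWORD API.
// Assumes `in` is ISO-8859-1: every byte equals its Unicode code point.
std::string latin1ToUtf8(const std::string &in) {
    std::string out;
    out.reserve(in.size() * 2);  // worst case: every byte becomes two
    for (unsigned char c : in) {
        if (c < 0x80) {
            // ASCII is identical in UTF-8.
            out += static_cast<char>(c);
        } else {
            // U+0080..U+00FF encode as two bytes: 110xxxxx 10xxxxxx.
            out += static_cast<char>(0xC0 | (c >> 6));
            out += static_cast<char>(0x80 | (c & 0x3F));
        }
    }
    return out;
}
```

With this sketch the single byte 0xC9 would come out as the two bytes 0xC3 0x89, which is what a UTF-8 frontend would need to receive.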

I thought that StripText() would remove all markup and return text in the 
encoding given to EncodingFilterMgr.

My question:
Is that right or wrong?

Some help would be wonderful,

> Hi,
> I'm just debugging a bug in BibleTime.
> Our SWMgr is created to output utf8.
> The module JFB contains the entity Æ.
> When I call StripText() the entity is converted to the corresponding
> character in the cp1252 charset, i.e. char with the value 0xC9.
> I thought that the latin2utf8 filter would convert this plain text to utf8
> because I told SWMgr to do this for me.
> Is there a way to set the output encoding for StripText() to be different
> than the module's encoding?
> Thanks a lot,
> Joachim

<>< Re: deemed
