[sword-devel] StripText() result not converted to UTF-8?

Troy A. Griffitts scribe at crosswire.org
Sun Feb 18 12:20:37 MST 2007

	I believe the filter is wrong.  It should return the UTF-8 value.  This 
is a bug.  Anyone want to look through the Unicode code chart and recode 
all these values?
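
	For anyone picking this up, here is a rough sketch of what recoding one of 
these values involves.  It is not the ThMLPlain code and not part of the SWORD 
API; the function names are made up for illustration.  It assumes the filter 
currently emits a single ISO-8859-1 byte such as 0xC9.  In that case the byte 
value equals the Unicode code point, and anything at 0x80 or above becomes a 
two-byte UTF-8 sequence.  cp1252 differs from ISO-8859-1 in the range 0x80 to 
0x9F, so those values would need a lookup table instead.

```cpp
#include <string>

// Hypothetical helper (not a SWORD function): encode one ISO-8859-1 byte
// as UTF-8. Assumes the byte value is the Unicode code point, which holds
// for ISO-8859-1 but not for cp1252 bytes in the range 0x80..0x9F.
std::string latin1ByteToUtf8(unsigned char b) {
    std::string out;
    if (b < 0x80) {
        // ASCII is unchanged in UTF-8.
        out += static_cast<char>(b);
    } else {
        // U+0080..U+00FF: two bytes, 110xxxxx 10xxxxxx.
        out += static_cast<char>(0xC0 | (b >> 6));
        out += static_cast<char>(0x80 | (b & 0x3F));
    }
    return out;
}

// Hypothetical helper: convert a whole ISO-8859-1 string byte by byte.
std::string latin1ToUtf8(const std::string &in) {
    std::string out;
    for (unsigned char b : in) {
        out += latin1ByteToUtf8(b);
    }
    return out;
}
```

	With this, latin1ByteToUtf8(0xC9) gives the two bytes 0xC3 0x89.  In the 
filter itself it is probably simpler to write the UTF-8 bytes directly into 
the entity table rather than convert at run time.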


	Sorry for the bug, Joachim.


Joachim Ansorg wrote:
> Hi,
> replying to myself.
> I was wrong in some of my assumptions.
> JFB is ThML. It contains the entity Æ.
> StripText() calls the filter ThMLPlain which converts the Æ into 0xC9, 
> which is the corresponding cp1252 character code.
> I thought that StripText() would remove all markup and return text in the 
> encoding given to EncodingFilterMgr.
> My question:
> Is that right or wrong?
> Some help would be wonderful,
> Joachim
>> Hi,
>> I'm just debugging a bug in BibleTime.
>> Our SWMgr is created to output utf8.
>> The module JFB contains the entity Æ.
>> When I call StripText() the entity is converted to the corresponding
>> character in the cp1252 charset, i.e. char with the value 0xC9.
>> I thought that the latin2utf8 filter would convert this plain text to utf8
>> because I told SWMgr to do this for me.
>> Is there a way to set the output encoding for StripText() to be different
>> than the module's encoding?
>> Thanks a lot,
>> Joachim
