[sword-devel] Detecting Problem Characters

Michael Hart just_mike_y at yahoo.com
Fri Sep 23 10:20:53 MST 2011

I've got a couple of modules in the making, and in both of them I'm 
working on quote marks that either aren't displaying at all or are 
displaying as block "mystery" characters.  I'm spending time trying to 
separate apostrophes from single quotes in both modules, in the hope 
that I can preserve or achieve the ability to use OSIS <q> tags....
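
One rough way to start separating the two is sketched below. It is only a heuristic and rests on an assumption that is not from any SWORD tool: a single-quote-like character with a letter on both sides (don't, Lord's) is treated as an apostrophe, and everything else is flagged for manual review. The function name classify_single_quotes is hypothetical.

```python
import re

# Characters treated as "single-quote-like": ASCII ', U+2018 and U+2019.
QUOTE_CHARS = "'\u2018\u2019"


def classify_single_quotes(text):
    """Return (offset, char, label) for each single-quote-like character.

    Heuristic (an assumption, not a rule from OSIS): a mark with a letter
    on both sides is labelled "apostrophe"; anything else is "review".
    """
    results = []
    for match in re.finditer("[" + QUOTE_CHARS + "]", text):
        i = match.start()
        before = text[i - 1] if i > 0 else ""
        after = text[i + 1] if i + 1 < len(text) else ""
        if before.isalpha() and after.isalpha():
            label = "apostrophe"
        else:
            label = "review"
        results.append((i, match.group(), label))
    return results


if __name__ == "__main__":
    for offset, char, label in classify_single_quotes("He said, 'Don't go.'"):
        print(offset, repr(char), label)
```

Leading and trailing apostrophes ('tis, Jesus') land in the "review" pile, so this narrows the manual work rather than replacing it.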


In both modules, at some point I lost control of a few characters, and 
now MS Excel and OpenOffice Calc can't see all of the end-of-line 
characters. That is, when I try to open the VPL file, it almost but not 
quite works.  Some verses are grouped together in either spreadsheet, 
while jEdit sees them as properly separated.
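
One possible cause, which is a guess and not something confirmed here, is a mix of line-ending styles (CRLF, lone LF, lone CR) in the same file. A minimal sketch for checking this counts each style in the raw bytes. The function name count_line_endings and the file name module.vpl are hypothetical.

```python
import re
import sys
from collections import Counter

# Map each raw byte sequence to a readable label.
LABELS = {b"\r\n": "CRLF", b"\n": "LF", b"\r": "CR"}


def count_line_endings(data):
    """Count CRLF, lone LF and lone CR sequences in raw file bytes."""
    counts = Counter()
    # CRLF is listed first so it is matched as one unit, not as CR then LF.
    for match in re.finditer(rb"\r\n|\r|\n", data):
        counts[LABELS[match.group()]] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical usage: python count_eol.py module.vpl
    with open(sys.argv[1], "rb") as handle:
        print(count_line_endings(handle.read()))
```

If more than one style shows a non-zero count, normalizing the file to a single style before importing it would be the next thing to try.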

Some time ago I saw a comment in a post describing a way, or a program, 
that summarizes all the 'non-ascii' or 'out of this encoding' characters 
that appear in a file.  I've spent time searching for this post but 
cannot locate it, or any information about this step, on the module 
creation wiki.
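
A summary of that kind can be sketched in a few lines of Python. This is not the program from the post I am looking for, just an illustration that assumes the file is meant to be UTF-8. The function name summarize_non_ascii and the file name module.vpl are hypothetical.

```python
import sys
import unicodedata
from collections import Counter


def summarize_non_ascii(text):
    """Return (codepoint, Unicode name, count) rows, most frequent first."""
    counts = Counter(ch for ch in text if ord(ch) > 127)
    rows = []
    for ch, n in counts.most_common():
        name = unicodedata.name(ch, "<unnamed>")
        rows.append(("U+%04X" % ord(ch), name, n))
    return rows


if __name__ == "__main__":
    # Hypothetical usage: python summarize.py module.vpl
    # errors="replace" turns bytes that are not valid UTF-8 into U+FFFD,
    # so they appear in the summary as REPLACEMENT CHARACTER.
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        for codepoint, name, n in summarize_non_ascii(handle.read()):
            print(codepoint, name, n)
```

A large count for U+FFFD would suggest the file is not actually UTF-8, or was damaged by an earlier conversion.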

Can someone enlighten me (again) as to the best method to find offending 
characters and deal with them?

Thanks in advance,


PS.  Modules in progress are based on these documents:

1. Holy New Covenant (public domain on publication in 2004).

(The "palm doc" file actually opens as an MS Word 97 or 2003 file.)  It is 
my intention to get this into sword to evaluate its readability and 
usability.  From my cursory review it is a fairly faithful treatment of 
scripture. The Galilee Translation Team mentioned appears to be 
affiliated with The Church of Christ in some way.

2. The Riverside New Testament (published 1923 and copyright renewed 
(1948?) according to Google, but even if still copyrighted it should be 
distributable within the next decade, if I have my facts straight).


Came to me as a 'zefania' xml file.  Note that this file is now (after I 
started working on this last year) already available in OSIS format at:


so this is really more of an exercise in 'what am I doing wrong' for me.

