[sword-devel] usfm2osis.pl

David Haslam dfhmch at googlemail.com
Tue Jul 10 02:46:58 MST 2012

Thanks for all the clarifications. I have no disagreements with any of the
points raised since yesterday.

And yes - we encounter all sorts of unexpected characters when examining
received SFM files.

The set I was analyzing yesterday was replete with U+202F Narrow no-break
space characters.

One of these was misused as the delimiter within a verse range tag, where
there should have been a hyphen-minus (U+002D).
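
A check of this kind can be sketched in a few lines of Python. This is only a minimal illustration, not part of usfm2osis.pl or any other tool mentioned here: the function names, the set of "suspect" Unicode categories and the verse-range pattern are all my own assumptions. Note that some scripts use format characters such as ZWJ/ZWNJ legitimately, so hits need human review.

```python
import re
import sys
import unicodedata

# Assumption: these general categories are worth flagging in SFM text.
# Zs = space separators (U+0020 is exempted below), Cf = format characters,
# Cc = control characters, Co = private use, Cn = unassigned.
SUSPECT_CATEGORIES = {"Zs", "Cf", "Cc", "Co", "Cn"}
ALLOWED = {" ", "\t", "\r"}

# Assumption: a verse range looks like "\v 1-2"; anything other than a
# hyphen-minus, space or tab between two verse numbers is reported.
ODD_RANGE = re.compile(r"\\v (\d+)([^0-9\- \t])(\d+)")


def find_suspect_chars(text):
    """Return (line, column, code point, name) for each suspect character."""
    hits = []
    # Split on "\n" only, so exotic line separators are reported, not eaten.
    for line_no, line in enumerate(text.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            if ch in ALLOWED:
                continue
            if unicodedata.category(ch) in SUSPECT_CATEGORIES:
                name = unicodedata.name(ch, "<unnamed>")
                hits.append((line_no, col, f"U+{ord(ch):04X}", name))
    return hits


def find_odd_verse_ranges(text):
    """Return (line, first verse, delimiter code point, last verse)."""
    hits = []
    for line_no, line in enumerate(text.split("\n"), start=1):
        for m in ODD_RANGE.finditer(line):
            delim = f"U+{ord(m.group(2)):04X}"
            hits.append((line_no, m.group(1), delim, m.group(3)))
    return hits


if __name__ == "__main__":
    # Usage: pass one or more SFM file paths (assumed to be UTF-8).
    for path in sys.argv[1:]:
        with open(path, encoding="utf-8") as fh:
            content = fh.read()
        for hit in find_suspect_chars(content):
            print(path, "char", *hit)
        for hit in find_odd_verse_ranges(content):
            print(path, "range", *hit)
```

Run against a folder of received SFM files, this lists each suspicious character with its position, so the files can be corrected before conversion.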

usfm2osis.pl may or may not catch all these kinds of errors.

This is also why it's generally a very good idea to use Dirk's
GoBibleCreatorUSFMPreprocessor utility (http://gbcpreprocessor.codeplex.com/)
to do a few checks first, even though in theory it might be seen as
off-topic for SWORD.

Sometimes this will catch items that, once fixed, will make the subsequent
processing steps less of a hassle for making a module.

NB. If the preprocessor crashes while looking for "versification issues",
then use trial and error to isolate which SFM file[s] cause the crash.
Report discoveries like this to the issues list on CodePlex.
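
The trial and error can be made systematic by halving the file set each time. The sketch below is only an illustration under stated assumptions: `isolate_culprit` and the `crashes` callback are hypothetical names of mine, and the callback is whatever you use to run the check on a subset of files and report whether it crashed. How the preprocessor is actually invoked is not assumed here. The sketch also assumes the crash is reproducible and caused by a single file; if neither half crashes on its own, it gives up and returns the remaining candidates.

```python
def isolate_culprit(files, crashes):
    """Narrow a list of files down to the one that triggers a crash.

    files   -- list of file names (any order)
    crashes -- hypothetical callback: crashes(subset) -> True if the
               check crashes when run on just that subset
    Returns [] if the full set does not crash, a one-element list if a
    single culprit was isolated, or a longer list if the crash could
    not be pinned to one half (e.g. it needs several files together).
    """
    candidates = list(files)
    if not crashes(candidates):
        return []
    while len(candidates) > 1:
        mid = len(candidates) // 2
        first, second = candidates[:mid], candidates[mid:]
        if crashes(first):
            candidates = first
        elif crashes(second):
            candidates = second
        else:
            # Neither half crashes alone: stop and report what is left.
            break
    return candidates
```

With 66 files this needs at most a dozen or so runs rather than one per file.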

