[sword-devel] Improvements to osis2mod to handle XML comments and <header> correctly

DM Smith dmsmith at crosswire.org
Sat Mar 24 16:11:34 MST 2012

I'm sorry that the request fell off my todo list. Thank you for your patch.

A change to swbuf meant the patch no longer applied cleanly, but that was simple to fix.

I've split the patch into two parts so that each commit does one thing.

Revision 2692 allows <div> elements inside <header> elements to be ignored. Previously osis2mod assumed they were part of the module.
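
To illustrate the idea only (this is a minimal sketch, not the code committed in r2692), the behaviour can be modelled as a flag that is set while the parser is inside <header> and that suppresses any <div> seen there. The function name firstContentDiv and the representation of the document as a list of tag names are assumptions made for this example; osis2mod itself works on a token stream in handleToken().

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not part of osis2mod: given tag names in document
// order (end tags are prefixed with '/'), return the index of the first
// <div> that lies outside <header>...</header>, or -1 if there is none.
int firstContentDiv(const std::vector<std::string>& tags) {
    bool inHeader = false;  // true between <header> and </header>
    for (std::size_t i = 0; i < tags.size(); ++i) {
        if (tags[i] == "header") {
            inHeader = true;
        } else if (tags[i] == "/header") {
            inHeader = false;
        } else if (tags[i] == "div" && !inHeader) {
            return static_cast<int>(i);  // first <div> that belongs to the module
        }
    }
    return -1;
}
```

With this rule, a <div> produced from a <p> inside <revisionDesc> is skipped, and the first <div> after </header> becomes the starting point.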

Revision 2693 allows comments to be present in the input OSIS document without carrying them into the module. It also outputs a warning if a tag is not well formed; specifically, it checks that the character(s) after a < are a letter, a / followed by a letter, or a ? (the processing instruction indicator).
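
As a rough sketch of the two behaviours described above (again, not the code committed in r2693), the tag check and the comment skipping could look like the following. The names looksLikeTagStart and stripComments are hypothetical, and the sketch works on a whole string, whereas osis2mod reads its input incrementally. Dropping the rest of the input on an unterminated comment is an assumption of this example.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical check: is the '<' at index lt followed by a letter,
// by '/' plus a letter, or by '?' (processing instruction)?
bool looksLikeTagStart(const std::string& text, std::size_t lt) {
    if (lt + 1 >= text.size() || text[lt] != '<') return false;
    unsigned char next = static_cast<unsigned char>(text[lt + 1]);
    if (next == '?') return true;                 // e.g. <?xml ...?>
    if (std::isalpha(next)) return true;          // e.g. <div>
    if (next == '/') {                            // e.g. </div>
        return lt + 2 < text.size() &&
               std::isalpha(static_cast<unsigned char>(text[lt + 2]));
    }
    return false;                                 // caller would emit a warning
}

// Hypothetical filter: copy text to the output, skipping <!-- ... --> spans.
std::string stripComments(const std::string& text) {
    std::string out;
    std::size_t pos = 0;
    while (pos < text.size()) {
        std::size_t open = text.find("<!--", pos);
        if (open == std::string::npos) {
            out.append(text, pos, std::string::npos);  // no more comments
            break;
        }
        out.append(text, pos, open - pos);             // text before the comment
        std::size_t close = text.find("-->", open + 4);
        if (close == std::string::npos) break;         // unterminated: drop the rest
        pos = close + 3;                               // resume after "-->"
    }
    return out;
}
```

Note that a comment opener such as <!-- would itself fail looksLikeTagStart, so in this sketch comments have to be skipped before the tag check is applied.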

I'm not sure that it properly handles processing instructions (i.e. strips them out), but it didn't handle them properly before either.

There are a few new ERRORS and WARNINGS that are enabled with the -d 512 flag. I've documented these on the wiki.

In Him,
	DM Smith

On Mar 21, 2012, at 11:43 AM, John Zaitseff wrote:

> Dear SWORD developers,
> I wrote to this list about two years ago (4th February, 2010, to be precise) 
> with a couple of suggestions and a patch for the SWORD library.
> Unfortunately, the patch I suggested was not applied (an oversight, I'm 
> sure), and I've been way too busy with other things to chase it up... until 
> now.
> I wrote:
>> Firstly, thanks for developing the SWORD library!  I have been using
>> this library, in conjunction with the BibleTime front-end, for many
>> years.
>> I have recently started to develop some OSIS documents of my own.
>> In doing so, I found that the XML parser in osis2mod is somewhat
>> fragile---something that you are, no doubt, aware of.
>> In particular, osis2mod does not handle XML comments at all, nor
>> does it correctly parse the <header> element.  Being able to handle
>> XML comments is, I think, quite important---I like to document the
>> SVN revision ID, for example, in an XML comment.
>> Furthermore, the osis2mod XML parser looks for the first <div> in
>> the document, no matter where that occurs.  In particular, if the
>> OSIS document includes a <revisionDesc> tag in the header, it will
>> have <p> tags as well---which will be translated by transformBSP()
>> into <div> tags---and get used as the starting point for the
>> document!
>> For this reason, I have generated a quick patch that will solve
>> these particular problems.  Could you please apply it to the SVN
>> head for utilities/osis2mod.cpp.  Comments are handled similar to
>> spaces: they are skipped.  And handleToken() now looks for the first
>> <div> after the </revision> end tag.
> DM Smith replied with:
>> Sorry for the late reply. This patch looks good and we'll commit it
>> shortly.
> I am attaching the patch to this e-mail, as I find that the problem still 
> exists in the library.  Could you please apply it?  Thanks!
> Yours in Christ,
> John Zaitseff
> -- 
> John Zaitseff                    ,--_|\    The ZAP Group
> Phone:  +61 2 9643 7737         /      \   Sydney, Australia
> E-mail: J.Zaitseff at zap.org.au   \_,--._*   http://www.zap.org.au/
>                                      v
> <libsword-svn-JNZ-2.diff>
