[sword-devel] Improvements to osis2mod to handle XML comments and <header> correctly
J.Zaitseff at zap.org.au
Thu Feb 4 20:34:20 MST 2010
Dear SWORD developers,
Firstly, thanks for developing the SWORD library! I have been using
this library, in conjunction with the BibleTime front-end, for many
I have recently started to develop some OSIS documents of my own.
In doing so, I found that the XML parser in osis2mod is somewhat
fragile---something that you are, no doubt, aware of.
In particular, osis2mod does not handle XML comments at all, nor
does it correctly parse the <header> element. Being able to handle
XML comments is, I think, quite important. For example, I like to
document the SVN revision ID in an XML comment. I
also like to be able to comment out sections of the XML file when
testing the osis2mod parser.
Furthermore, the osis2mod XML parser looks for the first <div> in
the document, no matter where that occurs. In particular, if the
OSIS document includes a <revisionDesc> tag in the header, it will
have <p> tags as well---which will be translated by transformBSP()
into <div> tags---and get used as the starting point for the
For this reason, I have generated a quick patch that will solve
these particular problems. Could you please apply it to the SVN
head for utilities/osis2mod.cpp? Comments are handled similarly to
spaces: they are skipped. And handleToken() now looks for the first
<div> after the </revision> end tag.
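As a rough illustration of the two behaviours described above, here is a
minimal standalone sketch. It is not the attached patch and does not use
the real osis2mod internals: skipComments() and firstDivAfter() are
hypothetical helper names, and the end tag to search after is passed in
as a parameter rather than assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of osis2mod): returns a copy of `xml`
// with every <!-- ... --> comment removed, i.e. comments are skipped
// much like whitespace. An unterminated comment swallows the rest of
// the input.
std::string skipComments(const std::string &xml) {
    std::string out;
    std::size_t pos = 0;
    while (pos < xml.size()) {
        std::size_t open = xml.find("<!--", pos);
        if (open == std::string::npos) {
            out.append(xml, pos, std::string::npos);  // no more comments
            break;
        }
        out.append(xml, pos, open - pos);             // text before the comment
        std::size_t close = xml.find("-->", open + 4);
        if (close == std::string::npos)
            break;                                    // unterminated comment
        pos = close + 3;                              // resume after "-->"
    }
    return out;
}

// Hypothetical helper: offset of the first <div ...> start tag that
// appears after the first occurrence of `endTag`, or npos if there is
// no such tag. Any <div> before `endTag` (for example, one produced
// from a <p> inside the header) is ignored.
std::size_t firstDivAfter(const std::string &xml, const std::string &endTag) {
    assert(!endTag.empty());
    std::size_t end = xml.find(endTag);
    if (end == std::string::npos)
        return std::string::npos;
    std::size_t pos = end + endTag.size();
    while ((pos = xml.find("<div", pos)) != std::string::npos) {
        std::size_t next = pos + 4;
        // Accept only a real "div" tag name, not e.g. "<divider>".
        if (next < xml.size() &&
            (xml[next] == '>' || xml[next] == '/' || xml[next] == ' ' ||
             xml[next] == '\t' || xml[next] == '\n' || xml[next] == '\r'))
            return pos;
        pos = next;
    }
    return std::string::npos;
}
```

Calling skipComments() on the document first and then firstDivAfter()
on the result means that a <div> inside a comment, or inside the header,
is never picked as the starting point. A real parser would do this while
tokenising rather than on the whole string at once.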
In general, I think that (perhaps eventually) the proper way to
parse XML is to use a library like libxml---which is designed
specifically for this purpose.
John Zaitseff ,--_|\ The ZAP Group
Phone: +61 2 9643 7737 / \ Sydney, Australia
E-mail: J.Zaitseff at zap.org.au \_,--._* http://www.zap.org.au/
-------------- next part --------------
A non-text attachment was scrubbed...
Size: 4904 bytes
Desc: not available