[sword-devel] Release-critical TODO items (updated mod2osis patch)

Jonathan Marsden jmarsden at fastmail.fm
Mon Apr 27 16:48:53 MST 2009

DM Smith wrote:

> I am. You can get the input text from www.crosswire.org/~dmsmith/kjv2006.

Aha!  Thanks, I'll try it tonight.  BTW, wouldn't putting this URL 
somewhere in the kjv.conf file be both useful and appropriate?

> Note, some of the transformations by osis2mod create a module that does 
> not contain valid OSIS. It is OSIS that SWORD requires. It pertains to 
> the preverse title markup.
> mod2osis has to undo those transformations at least in part.

Ah, and if it doesn't undo the "stuff colophons into the previous verse" 
thing, then of course the resulting output is technically not valid 
OSIS.  Which I suspect is exactly the issue I am seeing.

> That's why Greg has the comparison being a result of:
> Run mod2osis to get base text
> Run osis2mod to get a module with the base text
> Run mod2osis on that module to see it creates the same base text.

While that is an entirely reasonable and useful test, it is testing for 
"round-trip" capability, but it is *not* testing whether the output of 
mod2osis is actually OSIS.  Greg's test and mine are therefore 
complementary.

Overall, I'm not sure it is appropriate to push strongly for the use of 
an officially defined standard (such as OSIS) while simultaneously 
releasing software tools that generate "something rather like OSIS" and 
say nothing about this "something rather like" issue in their 
documentation :)  My recommendation would be that CrossWire should 
either support the OSIS standard, or else clearly document its 
deviation from it.

Longer term, this need for strange transformations looks to me like a 
problem that stems from an inadequate or incomplete underlying book 
representation in SWORD itself?  That may be something for SWORD 2.x, 
not 1.6 :)

>> Is this just an outdated wiki page leading me astray, and if so, where
>> can I find osisCore.2.5.xsd ?

> The latest is 2.1.1. There is no such thing as 2.5.

Hmmm, then we need to find out why mod2osis says that is what its output 
validates against.  I think it is just hardcoded into the mod2osis.cpp 
source.  If there is no 2.5, then that is a (trivial to fix) mod2osis bug.
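
As a rough illustration only (this is not the actual mod2osis.cpp code, 
which I have not quoted here), the fix could amount to keeping the 
schema version in one named constant rather than scattered through 
string literals.  The names kOsisSchemaVersion and schemaFileName are 
hypothetical, and 2.1.1 is simply the version DM mentions above:

```cpp
#include <string>

// Hypothetical constant: the OSIS schema version the output claims to
// validate against. "2.1.1" is the latest version according to DM.
const std::string kOsisSchemaVersion = "2.1.1";

// Hypothetical helper: builds the schema file name for a given version,
// e.g. "osisCore.2.1.1.xsd".
std::string schemaFileName(const std::string &version) {
    return "osisCore." + version + ".xsd";
}
```

With the version in one place, correcting a wrong value such as 2.5 
becomes a one-line change.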

> A colophon is something that comes at the end of a book.

Indeed :)

> The input has:
> ...
> <verse>verse text</verse>
> <div type="colophon">colophon text</div>
> </chapter>

> The module only stores verses so the colophon is appended to the last 
> verse.

So this non-standard-ness of the colophon being inside the last verse is 
SWORD-created -- osis2mod-created, to be specific!  OK.  Light dawns... 

In that case, mod2osis needs to know about that SWORD-specific 
transformation done by osis2mod (and any others!), and "undo" it (or 
them), I would think.  Based on my very simple (and perhaps simplistic?) 
test, at the moment mod2osis does not seem to be doing that reverse 
transformation of colophons successfully.  And (IMO) that is a bug, 
because it means mod2osis does not generate OSIS standard output, just 
"something rather like" OSIS.

I don't yet know how easy this one will be to fix.  Probably just 
another state variable, set when a colophon is encountered.  At that 
point mod2osis outputs </verse>\n and then the colophon.  The code that 
outputs the end-of-verse trailer then tests the "colophon" state 
variable: if it is true, it resets the state instead of emitting 
</verse>.  Mildly ugly, but probably not hard to code and test?  I'll 
give it a try if I have a chance tonight.
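
To make the idea concrete, here is a minimal standalone sketch of that 
state-variable approach.  It is not taken from mod2osis.cpp: the 
function emitVerse and its parameters are hypothetical, and it assumes 
the colophon is stored as a single <div type="colophon">...</div> with 
no nested <div> elements at the end of the verse entry.

```cpp
#include <string>

// Hypothetical helper, not part of mod2osis: wraps one stored verse entry
// in <verse> tags, moving an embedded colophon out past the closing tag.
std::string emitVerse(const std::string &osisID, const std::string &entry) {
    const std::string open = "<div type=\"colophon\">";
    const std::string close = "</div>";
    bool inColophon = false;  // the "colophon" state variable

    std::string out = "<verse osisID=\"" + osisID + "\">";

    const std::string::size_type start = entry.find(open);
    if (start == std::string::npos) {
        // No colophon: the whole entry is verse text.
        out += entry;
    } else {
        // Assumes no nested <div> inside the colophon.
        std::string::size_type end = entry.find(close, start);
        end = (end == std::string::npos) ? entry.size() : end + close.size();

        out += entry.substr(0, start);            // verse text proper
        out += "</verse>\n";                      // close the verse early
        out += entry.substr(start, end - start);  // colophon, now outside
        out += entry.substr(end);                 // anything left over
        inColophon = true;
    }

    // End-of-verse trailer: skip </verse> if it was already emitted.
    if (inColophon) {
        inColophon = false;
    } else {
        out += "</verse>";
    }
    out += "\n";
    return out;
}
```

For an entry with a colophon, the output has </verse> before the 
<div type="colophon"> element, matching the structure of DM's input 
example; an entry without one is wrapped as usual.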

