[sword-devel] Calvin's commentaries, and ThML to OSIS conversion
L.Plant.98 at cantab.net
Wed Jul 11 08:45:08 MST 2007
I've been trying to create a Sword module containing all of Calvin's
commentaries, using the ThML sources from CCEL. I've made good
progress, and some of my work should be reusable for other projects.
I've read that Sword is trying to move away from ThML to OSIS, so the
module will be an OSIS module. I've been careful to avoid any manual
editing, so that everything can be generated automatically from the
ThML (in case of any updates to the sources).
The main steps are:
1) Make some corrections to the ThML (use of scripCom tag in
particular) - DONE (implemented using Python script)
2) Combine all the ThML files into a single ThML source - DONE (Python)
3) Convert to OSIS. I've done this using XSLT, and I'm intending to
release my thml2osis.xslt as a separate project. It is about 90% done
(at least in terms of translating Calvin's Commentaries), and has tests
and so on. It should be a useful and portable utility for converting
other CCEL sources. (The test suite is currently executed using unix
tools, which would be a problem for Windows developers.)
4) Import as a Sword module. The problem here is that osis2mod is
basically for importing Bibles only -- it expects you to use <div
type="book">, <div type="chapter"> (or <chapter>) and <verse>. These
are not really natural or semantic ways to mark up a commentary. A more
obvious and natural way to do it is like this:
<div type="section" annotateType="commentary"
I do actually have a Python script which converts this markup to that
expected by osis2mod, but it uses DOM, and memory usage for the 45
MB input OSIS file is prohibitive. Anyway, I think creating a version
of osis2mod for commentaries is the better way to handle this (I did
find an old message in sword-devel saying that an importer would be
written if OSIS commentaries were provided).
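For what it's worth, the memory problem with DOM could probably be worked around by streaming the file instead. The following is only a rough sketch, not my actual conversion script: it uses the standard library's xml.etree.ElementTree.iterparse to visit each commentary section in turn and discard it afterwards. The function names are made up for illustration, it matches elements by local name so it does not depend on the OSIS namespace, and it only extracts the text of each section rather than rewriting the markup for osis2mod.

```python
import io
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip any '{namespace}' prefix from an ElementTree tag name."""
    return tag.rsplit("}", 1)[-1]


def is_commentary_section(elem):
    """True for <div type="section" annotateType="commentary" ...>."""
    return (local_name(elem.tag) == "div"
            and elem.get("type") == "section"
            and elem.get("annotateType") == "commentary")


def iter_commentary_sections(source):
    """Yield (attributes, text) for each outermost commentary section.

    Only one section is held in memory at a time; elements outside
    sections are cleared as soon as they have been parsed.
    """
    depth = 0  # how many commentary sections we are currently inside
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            if is_commentary_section(elem):
                depth += 1
        elif is_commentary_section(elem):
            depth -= 1
            if depth == 0:
                attrs = dict(elem.attrib)
                text = "".join(elem.itertext())
                elem.clear()  # drop the subtree we have just processed
                yield attrs, text
        elif depth == 0:
            elem.clear()  # not part of any section: discard at once


if __name__ == "__main__":
    # Tiny made-up document standing in for the real OSIS file.
    sample = io.BytesIO(
        b'<osis><div type="section" annotateType="commentary">'
        b'<p>Some <hi>comment</hi> text.</p></div></osis>'
    )
    for attrs, text in iter_commentary_sections(sample):
        print(attrs, text)
```

Cleared elements remain attached to their parents as empty shells, so memory use still grows slowly with the number of sections; a real script might also remove them from the parent.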
I would write the osis2mod modifications myself, but I've looked at
osis2mod and the main function that needs modifying, handleToken(), is
a bit of a beast -- about 400 lines, about 20 local variables etc. I'm
not confident enough with Sword to be able to refactor it properly, and
I don't want to do large amounts of copy and paste.
So, is someone willing to help out with this final step?
Also, is there a place where I should release this stuff? I think Sword
needs a 'sword contrib' project, or at least a section on the wiki that
details how to get these different things. I get the impression that
the main Sword developers have various scripts to help them, and a
central repository for these kinds of tools would be very helpful. A
Bazaar repository would probably be ideal -- I could put up a
publicly readable one for my stuff.
Sometimes I wonder if men and women really suit each other. Perhaps
they should live next door and just visit now and then. (Katherine
Luke Plant || http://lukeplant.me.uk/