[sword-svn] r125 - trunk/modules/calvinscommentaries

lukeplant at www.crosswire.org lukeplant at www.crosswire.org
Thu Nov 29 16:41:08 MST 2007


Author: lukeplant
Date: 2007-11-29 16:41:08 -0700 (Thu, 29 Nov 2007)
New Revision: 125

Added:
   trunk/modules/calvinscommentaries/bundle_and_install.sh
Modified:
   trunk/modules/calvinscommentaries/README
   trunk/modules/calvinscommentaries/combine_calcom.py
Log:
Added a build script, and completed steps to be able to make
Calvin's Commentaries module


Modified: trunk/modules/calvinscommentaries/README
===================================================================
--- trunk/modules/calvinscommentaries/README	2007-11-29 23:32:00 UTC (rev 124)
+++ trunk/modules/calvinscommentaries/README	2007-11-29 23:41:08 UTC (rev 125)
@@ -17,29 +17,47 @@
 
 Make the module
 ---------------
+First edit 'bundle_and_install', setting directories as chosen.  If you
+don't want the module installed at the end, comment out the last line which
+unzips it into place.
 
-$ ./combine_calcom.py calcom_sources/calcom??.xml
-(output stored in calvinscommentaries.thml)
-$ xsltproc --novalid path/to/thml2osis.xslt calvinscommentaries.thml > calvinscommentaries.osis
+TODO:
+- get osis2mod to handle commentaries properly (instead of requiring
+  them to be marked up Bibles as currently.  Once this is done, most of
+  the ugliness in 'bundle_and_install' will go away, and it gets a whole
+  lot simpler.
+- Check the OSIS actually validates
 
-TODO
-- convert OSIS commentary to Sword module
 
-Explanation of these steps
---------------------------
+Explanation of steps
+--------------------
 1) 'Correct' some of the ThML files.  In particular, change the
    'scripCom' tags so that they enclose the text they refer to,
-   rather than just come at the beginning of it.
+   rather than just coming at the beginning of it.
    This is done as part of combine_calcom.py
 
 2) Combine all the ThML files into one big one, and at the same time:
    - modify the header information, using one of the calcom??.xml files
      as a template
    - make any corrections necessary to the ThML for the new context
+   
+   This is the second task of combine_calcom.py.
 
    Output: calvinscommentaries.thml
 
 3) Convert to OSIS, using thml2osis.xslt
+  
+   Output: calvinscommentaries.osis
 
-4) TODO - convert to Sword module.  The current osis2mod utility expects
-   commentaries to be marked up like Bibles.
+4) Convert to formant required by osis2mod.  This uses
+  'genbookOsis2Commentary.py' script.  Since this script is DOM based,
+   it uses up too much memory if all of calvinscommentaries.osis is loaded.
+   To get round this, the OSIS file is split into lots of bits (using
+   markers inserted earlier), then run through genbookOsis2Commentary.
+
+   Also genbookOsis2Commentary gets rid of some other bits of
+   'non-commentary' text that otherwise ends up in the module, and probably
+   isn't wanted.
+
+5) Run osis2mod, create the zip file etc.
+

Added: trunk/modules/calvinscommentaries/bundle_and_install.sh
===================================================================
--- trunk/modules/calvinscommentaries/bundle_and_install.sh	                        (rev 0)
+++ trunk/modules/calvinscommentaries/bundle_and_install.sh	2007-11-29 23:41:08 UTC (rev 125)
@@ -0,0 +1,132 @@
+#!/bin/bash
+
+
+echo "Please edit this file first."
+exit 1
+## Must modify these:
+SWORDTOOLS="$HOME/devel/sword-tools"
+CALCOMSOURCES="$HOME/christian/books/John Calvin/Commentaries/calcom_sources"
+
+
+## Leave these to build in subdir 'build'
+BUILDDIR="`pwd`/build"
+OSIS2MODOUTPUT="$BUILDDIR/modules/comments/zcom/calvinscommentaries"
+CONFDIR="$BUILDDIR/mods.d"
+THISDIR=`pwd`
+
+##############################################
+
+
+which csplit > /dev/null || { echo "Cannot find required tool 'csplit'.  Exiting."; exit 1;}
+which replace > /dev/null || { echo "Cannot find required tool 'replace'.  Exiting."; exit 1;}
+
+
+mkdir -p $BUILDDIR
+mkdir -p $OSIS2MODOUTPUT
+mkdir -p $CONFDIR
+
+
+echo "Running combine_calcom.py..."
+./combine_calcom.py "$CALCOMSOURCES"/calcom??.xml > "$BUILDDIR/calvinscommentaries.thml" || exit 1
+
+echo "Converting to OSIS..."
+xsltproc --novalid  "$SWORDTOOLS/thml2osis/xslt/thml2osis.xslt" "$BUILDDIR/calvinscommentaries.thml" > "$BUILDDIR/calvinscommentaries.osis" || exit 1
+
+
+cd "$BUILDDIR"
+
+
+##############################################################################
+# Splitting
+# We currently have to use genbookOsis2Commentary (since
+# osis2mod doesn't accept format unless it is marked up like a Bible),
+# genbookOsis2Commentary is a quick hack, and doesn't work well
+# with big files, since it is DOM based.  So we split the file
+# into lots of small ones, using markers inserted before.
+# Then recombine again.  This is hacky, should go away once
+# osis2mod is fixed.
+
+# Split
+echo "Splitting..."
+
+rm part*
+
+COUNT=$(csplit -f 'part' -b '%03d' calvinscommentaries.osis "/combine_calcom.py START/" '{*}' | nl | tail -n 1 | cut -c 1-7 )
+
+# $COUNT now contains the number of parts we split into
+
+FIRSTFILE="part000"
+FIRSTFILEALT="firstpart"
+LASTFILE="part`printf '%03d' $((COUNT-1))`"
+mv $FIRSTFILE $FIRSTFILEALT
+
+# $LASTFILE is special -- it will have trailing stuff
+TMP=`mktemp`
+replace '</osis>' '' '</osisText>' '' < $LASTFILE > $TMP || exit 1
+mv $TMP $LASTFILE
+
+
+# Fix individual files
+for F in part*;
+do
+    # prepend and append some stuff
+    TMP=`mktemp`
+    echo '<?xml version="1.0" encoding="UTF-8"?>' > $TMP
+    echo '<osis>' >> $TMP
+    echo '<osisText>' >> $TMP
+    cat $F >> $TMP
+    echo '</osisText>' >> $TMP
+    echo '</osis>' >> $TMP
+    mv $TMP $F
+
+    echo "re-versifying $F ..."
+    "$SWORDTOOLS/python/swordutils/osis/genbookOsis2Commentary.py" $F > "$F.versified" || exit 1
+
+    # Now strip stuff we added 
+    TMP2=`mktemp`
+    cat "$F.versified" | egrep -v 'xml version' | replace '<osis>' '' '<osisText>' '' '</osis>' '' '</osisText>' '' > $TMP2
+    mv $TMP2 "$F.versified"
+
+done
+
+# Now combine again
+COMBINED="calvinscommentaries.versified.osis"
+# Use this cleared up XML instead of the uncleaned stuff in $FIRSTFILEALT
+echo '<?xml version="1.0" encoding="UTF-8"?>' > $COMBINED
+echo '<osis xmlns="http://www.bibletechnologies.net/2003/OSIS/namespace" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.bibletechnologies.net/2003/OSIS/namespace http://www.bibletechnologies.net/osisCore.2.1.1.xsd">' >> $COMBINED
+echo '<osisText osisRefWork="bible" canonical="true" osisIDWork="calvincommentaries" xml:lang="en">' >> $COMBINED
+
+for F in part*.versified;
+do
+    cat $F >> $COMBINED
+done
+
+echo '</osisText>' >> $COMBINED
+echo '</osis>' >> $COMBINED
+
+#######################################################################
+
+# clean out old stuff
+rm "$OSIS2MODOUTPUT/*"
+
+# xml2gbs 
+#xml2gbs -fO calvinscommentaries.osis CalvinsCommentaries
+#mv CalvinsCommentaries.{bdt,dat,idx} modules/comments/zcom/calvinscommentaries/
+
+# osis2mod
+echo "Running osis2mod..."
+osis2mod "$OSIS2MODOUTPUT" "$BUILDDIR/$COMBINED" 0 2 3 || exit 1
+
+
+echo "Zipping..."
+cp "$THISDIR/calvinscommentaries.conf" "$CONFDIR"
+
+
+cd "$BUILDDIR"
+
+zip -r CalvinsCommentaries.zip mods.d/ modules/
+
+echo "Installing..."
+## Install
+unzip -o -d $HOME/.sword CalvinsCommentaries.zip
+


Property changes on: trunk/modules/calvinscommentaries/bundle_and_install.sh
___________________________________________________________________
Name: svn:executable
   + *

Modified: trunk/modules/calvinscommentaries/combine_calcom.py
===================================================================
--- trunk/modules/calvinscommentaries/combine_calcom.py	2007-11-29 23:32:00 UTC (rev 124)
+++ trunk/modules/calvinscommentaries/combine_calcom.py	2007-11-29 23:41:08 UTC (rev 125)
@@ -22,7 +22,10 @@
 
 now = datetime.now() # for general timestamping purposes
 
+MAGIC_SEPARATOR_START = "%%% combine_calcom.py START %%%"
+MAGIC_SEPARATOR_END = "%%% combine_calcom.py END %%%"
 
+
 def do_head_replacements(doc):
     
     corrections = {
@@ -45,6 +48,14 @@
     # Correct <scripCom>
     rootNode = utils.getRoot(doc)
     thml.expandScripComNodes(rootNode)
+    # Add a comment that we are going to use later...
+    body = utils.getNodesFromXPath(doc, '//ThML.body')[0]
+    body.childNodes.insert(0, doc.createComment(MAGIC_SEPARATOR_START))
+    body.childNodes.insert(1, doc.createTextNode("\n"))
+    body.appendChild(doc.createComment(MAGIC_SEPARATOR_END))
+    body.appendChild(doc.createTextNode("\n"))
+    
+
     # Other corrections
     corrections = {
         # id attributes can now contain duplicates due to combination
@@ -64,9 +75,7 @@
     # templatexml.writexml iterates over them
     mainBody.childNodes = LazyNodes(templatexml, allfiles, do_body_corrections, '//ThML.body')
 
-    fh = open('calvinscommentaries.thml', 'wb')
-    utils.writexml(templatexml, fh)
-    fh.close()
+    utils.writexml(templatexml, sys.stdout)
 
 def main(filenames):
     combine(filenames[0], filenames)




More information about the sword-cvs mailing list