[sword-devel] Why is OSIS preferred? Was Re: usfm2osis.pl

Karl Kleinpaste karl at kleinpaste.org
Tue Jul 1 06:16:08 MST 2008

"Jonathan Morgan" <jonmmorgan at gmail.com> writes:
> ThML is also still (I think) used by the greatest percentage of our
> modules (though that may be changed in the future).
> Will GBF continue to be supported?  I seem to remember that Chris
> reported lack of GBF support as a missing feature in BPBible, despite
> the fact that I'm sure that I have heard statements suggesting GBF is
> very strongly deprecated.  How many modules are still GBF?

A couple of shell commands will give useful summaries.  Refresh the main
and beta repos in your module manager, then peek in
~/.sword/InstallMgr/*/mods.d.

for i in plain gbf thml osis ; do
    echo $i `grep -i ^sourcetype=$i * | wc -l`
done

Main:                   Beta:
plain   2               plain   1
gbf     49              gbf     0
thml    163             thml    6
osis    23              osis    93
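
The same tally can be wrapped in a small function that takes the mods.d
directory as an argument, so it does not depend on the current directory.
This is only a sketch: the name count_sourcetypes is made up for
illustration, and it assumes each module has a single .conf file in
mods.d with at most one SourceType line.

```shell
# Hypothetical helper (not part of SWORD): tally module confs per SourceType
# in one mods.d directory, printing "type<TAB>count" for each type.
count_sourcetypes() {
    dir=$1
    for t in plain gbf thml osis; do
        # grep -l names each matching file once, so a conf is counted once;
        # -i makes the match case-insensitive (SourceType=OSIS, sourcetype=osis).
        n=$(grep -il "^sourcetype=$t" "$dir"/*.conf 2>/dev/null | wc -l | tr -d ' ')
        printf '%s\t%s\n' "$t" "$n"
    done
}

# Possible usage, assuming the InstallMgr layout mentioned above:
#   for repo in ~/.sword/InstallMgr/*/mods.d; do
#       echo "$repo"; count_sourcetypes "$repo"
#   done
```

A directory with no .conf files simply reports zero for every type.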

The recent increase in beta OSIS modules is due to the arrival of 41 new
WBT texts 2 days ago -- almost half the beta repo in one shot.

Significantly, a couple of really important modules (LXX, for one) are
still distributed as GBF.

(Aside: All these new WBT texts appear in GS as "unknown" language.  Is
there a mapping somewhere handy, from "ngu", "tzz", et al to something
readable by mere mortals?  I'm happy to update GS to accommodate more
language definitions but I need a source for them.)

>> * OSIS is a growing, maturing standard, addressing the short-comings
>> of other popular formats.

> And adding some of its own (its complexity comes to mind here, though
> possibly that is intrinsic given what it is trying to cover).

When I first wanted to start generating modules on my own, I didn't have
enough context to know what was intended or preferred, and by plain
count (with variations on the grep construct above) I found that GBF was
far and away the leading format for Bible texts, so I generated that.
Then I learned from reading somewhere, now long forgotten, that GBF was
on the way out.  That made ThML look like a really good choice, given its
huge majority overall and its (then) substantial majority over OSIS in
Bible texts.  I still had not found any particular source of info on
what was preferred.

From then to now, all I generate is ThML.  I encountered the OSIS
preference by dumb luck somewhere along the way -- this was closely
related to finding the forums by dumb luck, because there was no linkage
to them at www.crosswire.org -- and I debated changing my scripting
habits to generate OSIS.  But its complexity alongside my usual module
generation scheme stopped me cold.  This led me to thoughts much like
this one:

> Separating presentation from content is a nice idea, but I'm not
> convinced that it is good in all cases.  What happens with OSIS when a
> Bible publisher wants to insist that certain constructs in their Bible
> are formatted in certain ways?

This is especially true given the wildly different habits embodied in
each of the UIs.  GS does not format like BT; BT does not format like
MS; nothing formats like the Windows UI, mostly because it uses the RTF
filters.  Numbered footnotes/xrefs, or unlabeled superscripts?  Inline
footnote content?  Verse-per-line display regardless of paragraph
markup?  Header above Gen 1:1 to identify the text?  Underline, italic,
bold markup in modules, when RTF provides no underline at all?
Permissive markup pass-through, or limiting?  Image support in all
module types, or just some; and what formats?

The deeper problem for me is that, for the vast majority of mere mortals
out there, what is wanted in a Bible study tool is readable texts.  It's
certainly true that OSIS provides more and often different structure
that can be de- and re-constructed in more ways.  But what Joe Random
wants is readability and search capability.  It's not clear to me that
either of these qualities is better served by OSIS than ThML.
