[sword-devel] Unusual markup in the Treasury of David

Greg Hellings greg.hellings at gmail.com
Thu Jan 27 07:28:45 MST 2011


On Thu, Jan 27, 2011 at 7:57 AM, Peter von Kaehne <refdoc at gmx.net> wrote:
>
>> From: Jonathan Morgan <jonmmorgan at gmail.com>
>>   Is
>> this formatting what was intended, or is it just one of those things?
>
> I think this is a) an ancient module and b) a clear demonstration of what happens when modules go into the repos without a validity check.
>
> Which brings me to a question - how do people do validity checks on IMP format modules?

Well, so long as there are lines with $$$KeyValue followed by lines of
content, the IMP format is satisfied.  Officially anything can be
squished into the values of the content.  The modules I produce
usually have ThML so I can control the display of the module and not
leave it to chance or default HTML rendering rules for SWORD's
OSIS->HTML transformation (but that goes back to the long, drawn-out
discussion about external CSS, etc., which I won't rehash).  You could
also have plaintext, OSIS fragments or nearly anything else.  Thus,
there really isn't a way to check the validity of an IMP document
beyond looking for the $$$KEY lines.
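
As a rough illustration of how little there is to check, here is a
minimal Python sketch of a structural check.  The function name
check_imp is made up for this example, and what it reports as a
problem (content before the first key, a key with no content, no keys
at all) is an assumption about what a checker might flag, not a rule
of the IMP format itself.  It says nothing about whether the content
is valid markup.

```python
import re

# A key line is "$$$" followed by the key text, as described above.
KEY_LINE = re.compile(r"^\$\$\$(.+)$")


def check_imp(lines):
    """Return a list of structural problems found in IMP-style input.

    Hypothetical helper: the checks below are assumptions about what
    is worth flagging, not part of any official IMP definition.
    """
    problems = []
    current_key = None
    has_content = False
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\r\n")
        match = KEY_LINE.match(line)
        if match:
            # A new key closes the previous entry; flag it if it was empty.
            if current_key is not None and not has_content:
                problems.append(f"entry '{current_key}' has no content")
            current_key = match.group(1)
            has_content = False
        elif line.strip():
            if current_key is None:
                problems.append(f"line {number}: content before first $$$ key")
            else:
                has_content = True
    if current_key is None:
        problems.append("no $$$ key lines found")
    elif not has_content:
        problems.append(f"entry '{current_key}' has no content")
    return problems
```

Calling check_imp(open(path, encoding="utf-8")) on a module source
would return an empty list when every key is followed by some content.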

>
> imp2gbs -> mod2osis -> validity check?

This will fail miserably as a) mod2osis only purports to work on
verse-keyed modules and b) it does not even work on verse-keyed
modules.  The near-complete lack of interest in mod2osis and refusal
to support a basic regex extraction and search/replace in either the
C++ std::string or sword::SWBuf means I have no desire to finish the
last few pieces of getting mod2osis working.  Anyone else is welcome
to tackle the remainder of the work by finding my branch on Launchpad.

>
> That is a bit roundabout and I have not done it on recent gen book modules, but simply tried checking by eye (but the texts were short).
>
> I think a simple imp2osis would be a useful addition. Maybe I will write one, unless there is something I am missing.
>

Probably near trivial for Bibles, commentaries or lexica.  I believe
you're a Perl guy, so just iterate over the lines of input: if the
regex /^\$\$\$(.+)$/ matches the line, then $1 is the new key value;
otherwise append the current line to the currently open item.  When
you reach the next line that matches the regex, write out an OSIS
container with an osisID equal to $1 from the previous matching line.
Supporting the arbitrarily deep nesting of text in a general book
would make it a bit more than trivial, but not by any means
difficult.  All of the foregoing assumes, of course, that the blurbs
of content in the imp file are either plaintext or valid OSIS
fragments.
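
The loop described above can be sketched in a few lines.  This is a
Python version rather than Perl, and only a sketch under stated
assumptions: the names parse_imp and imp_to_osis_fragments are
invented for the example, a plain <div> is used as the "OSIS
container", the key is assumed to already be usable as an osisID, and
the content is passed through untouched on the assumption that it is
plaintext or a valid OSIS fragment.  Nested general book keys are not
handled.

```python
import re
from xml.sax.saxutils import quoteattr

# Same pattern as the Perl regex /^\$\$\$(.+)$/ quoted above.
KEY_LINE = re.compile(r"^\$\$\$(.+)$")


def parse_imp(lines):
    """Yield (key, content) pairs from IMP-style lines."""
    key = None
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        match = KEY_LINE.match(line)
        if match:
            # A new key line closes the currently open item, if any.
            if key is not None:
                yield key, "\n".join(buffer)
            key = match.group(1)
            buffer = []
        elif key is not None:
            # Any other line belongs to the currently open item.
            buffer.append(line)
    if key is not None:
        yield key, "\n".join(buffer)


def imp_to_osis_fragments(lines):
    """Yield one container per entry, with osisID set to the IMP key.

    Assumption: <div> stands in for whatever container suits the
    module type, and the content needs no escaping or conversion.
    """
    for key, content in parse_imp(lines):
        yield f"<div osisID={quoteattr(key)}>{content}</div>"
```

For a Bible or commentary the container would more likely be a verse
element and the key would need mapping to a proper osisID; both are
left out here to keep the core loop visible.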

--Greg


