[sword-devel] XHTML Rendering of OSIS Reference Doc - Whitespace

Troy A. Griffitts scribe at crosswire.org
Wed May 8 13:46:55 MST 2013

Hi Greg, thanks for the ideas. Have a look at this thread:


But let's please keep this thread focused on pragmatically solving the 
problem that is holding up this release :)

Any suggestions for WHAT the output should be?

I wasn't even thinking of including the <div> tags that are there 
simply to mark versification divisions.
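
To make the "leave them out" option concrete, here is a minimal sketch in Python (not the SWORD filter code, which is C++). The function name and the regular expression are assumptions for illustration only; it drops milestone <div> elements (those carrying sID or eID) and leaves everything else untouched:

```python
import re

# Hypothetical pattern: a self-closing <div .../> that carries an sID or eID
# attribute, i.e. an OSIS milestone div. Trailing whitespace is consumed too.
MILESTONE_DIV = re.compile(r'<div\b[^>]*?\b(?:sID|eID)="[^"]*"[^>]*?/>\s*')

def strip_milestone_divs(osis_fragment):
    """Return the fragment with all milestone <div> elements removed."""
    return MILESTONE_DIV.sub("", osis_fragment)

sample = ('<div sID="gen1" type="bookGroup"/> <h3>Old Testament</h3> '
          '<div osisID="Gen" sID="gen2" type="book"/> <h3>X</h3>')
print(strip_milestone_divs(sample))
```

This only shows one possible answer to the "WHAT should the output be" question; whether dropping the divs entirely is the right rendering is still open.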


On 05/08/2013 01:15 PM, Greg Hellings wrote:
> Off the cuff here, it seems the issue is the difference in semantics 
> of <div> between OSIS, where it marks a structural division within a 
> text that can occur at many different levels and layers, and XHTML, 
> where it represents a block-level box that defaults to the full width 
> of its container.
> Producing "proper" output seems like it is only feasible if we are 
> handling a block of output. The sample you have contains 3 sID and 1 
> eID attributes on div elements. And they are self-closing elements, 
> which will typically render as vertical whitespace in XHTML. Ideally, 
> any with sID="..." would be rendered with <div> and any with eID="..." 
> would be rendered with </div>.
> The problem becomes rendering a list of 30 (or however many) verses 
> when each is rendered separately by our filters. If <div sID="gen1"/> is 
> within Gen.0.0 but <div eID="gen1"/> is at the end of the chapter, 
> which appears to be the case here, then we don't really want to 
> generate something like
> <div>
>  Gen.0.0
> </div>
>  Gen.1.0
>  Gen.1.1
>  Gen.1.2
>  ...
> But rather we want something like
> <div>
>  Gen.0.0
>  Gen.1.0
>  Gen.1.1
>  Gen.1.2
>  ...
> </div>
> At least when not dealing with inter-linear versions, we do.
> In BibleTime we have discussed how to properly handle this and came up 
> with an interesting solution that we engineered but never implemented. 
> Our thought was to store, along with each verse, information that 
> includes pre- and post-verse markup. This would need to become part 
> of the OSIS import process, and it would track the "semantically" open 
> elements such as <div sID="gen1" />, which by XML standards is no 
> longer open, although OSIS semantics designate that div as open until 
> <div eID="gen1" /> is encountered. This would be in addition to the 
> actually open XML elements.
> Every verse entry would then keep a store of the elements open at its 
> start and those still open at the end of the entry. Then, when an 
> arbitrary range is selected for rendering - say, Genesis 1:15-25 - a 
> single, complete OSIS document could be generated by taking 
> Gen.1.15.pre, prepending it to the text of Gen.1.15-Gen.1.25, and 
> then appending Gen.1.25.post. Then a proper filter can operate on the 
> entire block of text to generate correctly wrapping <div> ... </div> 
> and other markup.
> Perhaps I overstepped the question of what the above markup _should_ be, 
> but I just wanted to toss out the solution that the BT folks have put 
> brain power into for addressing the problem of stray open-and-close 
> <div> elements. These seem to be the main problem in the sample you have 
> presented. Again, there was never an implementation of this, as it 
> would essentially require re-importing Sword module data to generate 
> the pre- and post- data, and that went beyond the scope of any work 
> done on BibleTime so far.
> --Greg
> On Wed, May 8, 2013 at 2:31 PM, Troy A. Griffitts 
> <scribe at crosswire.org> wrote:
>     OK guys,
>     I'm starting work on this. I've set up a test in our testsuite for
>     whitespace against our OSIS reference doc. Here are the links:
>     test:
>     http://crosswire.org/svn/sword/trunk/tests/osistest.cpp
>     (whitespace test added at the end)
>     OSIS Reference Document:
>     http://crosswire.org/svn/sword/trunk/tests/testsuite/osisReference.xml
>     Before I start any work, I want to show what output we get
>     currently. It is obviously seriously messed up.
>     This is from the new XHTML filter set (which is based on the
>     HTMLHREF filter set). The first obvious issue is the passthru of
>     the OSIS <div> elements as-is. Anyone like to suggest exactly WHAT
>     they would like as output from the XHTML filterset from the OSIS
>     Reference document here? Current output below:
>     <div sID="gen1" type="bookGroup"/> <h3>Old Testament</h3> <div
>     osisID="Gen" sID="gen2" type="book"/> <h3>THE FIRST BOOK OF MOSES
>     CALLED GENESIS</h3> <div sID="gen3" type="section"/>
>     <h3>Introduction and Outline</h3> <br /> This is the <b>Book of
>     Genesis</b>, the <i>first</i> book in the Bible. It may be
>     outlined as follows: <br /><br /> <ul> <li><i>1</i>Creation of
>     Heaven and Earth, 1:1-2:4a</li> <li><i>2</i>Creation of Man and
>     Woman, 2:4b-25</li> <li><i>3</i>Fall, 3:1-24</li> <li>...</li>
>     </ul> <br /><br /> Tables work like this: <b>Column 1 Label</b>
>     <b>Column 2 Label</b> Column 1, Row 1 Column 2, Row 1 Column 1,
>     Row 2 Column 2, Row 2 <br /><div eID="gen3" type="section"/>
>     <div sID="gen7" type="majorSection"/> <h3>From Creation to Abraham
>     (1:1--11:9)</h3>
>     [ Genesis 1:1 ] In the beginning God created the heaven and the
>     earth. <br />
>     [ Genesis 1:2 ] Text of verse 2.
>     _______________________________________________
>     sword-devel mailing list: sword-devel at crosswire.org
>     http://www.crosswire.org/mailman/listinfo/sword-devel
>     Instructions to unsubscribe/change your settings at above page
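
The pre-/post-verse bookkeeping described in the quoted BibleTime proposal could be sketched roughly as follows. This is a Python illustration under stated assumptions, not SWORD or BibleTime code: the function names, the (key, text) entry layout, and the choice to map every milestone to a bare <div> / </div> are all hypothetical.

```python
import re

# Matches OSIS milestone divs such as <div sID="gen1" type="bookGroup"/>.
# Group 1 is "sID" or "eID", group 2 is the milestone id.
MILESTONE = re.compile(r'<div\b[^>]*?\b(sID|eID)="([^"]*)"[^>]*?/>')

def index_open_divs(entries):
    """Map each entry key to (pre, post): the ids of milestone divs that are
    semantically open before the entry and after it.
    `entries` is an ordered list of (key, osis_text) pairs (assumed layout)."""
    open_ids = []
    index = {}
    for key, text in entries:
        pre = list(open_ids)
        for kind, div_id in MILESTONE.findall(text):
            if kind == "sID":
                open_ids.append(div_id)
            elif div_id in open_ids:
                open_ids.remove(div_id)
        index[key] = (pre, list(open_ids))
    return index

def render_range(entries, index, first, last):
    """Render entries first..last as one block with balanced <div> tags."""
    keys = [key for key, _ in entries]
    selected = entries[keys.index(first):keys.index(last) + 1]
    pre, _ = index[first]
    _, post = index[last]
    body = "".join(text for _, text in selected)
    # Turn each start milestone into <div> and each end milestone into </div>.
    body = MILESTONE.sub(
        lambda m: "<div>" if m.group(1) == "sID" else "</div>", body)
    # Open what was already open before the range; close what is still open.
    return "<div>" * len(pre) + body + "</div>" * len(post)

entries = [
    ("Gen.0.0", '<div sID="gen1" type="bookGroup"/>Old Testament '),
    ("Gen.1.1", 'In the beginning '),
    ("Gen.1.2", 'Text of verse 2.<div eID="gen1" type="bookGroup"/>'),
]
index = index_open_divs(entries)
print(render_range(entries, index, "Gen.1.1", "Gen.1.1"))
```

Because every milestone becomes a plain <div>, the tag counts always balance for any selected range, but overlapping (non-nested) OSIS milestones would lose their individual identity; a real filter would need to decide how to handle that.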
