[sword-devel] Titles and other Inter-verse material
dmsmith at crosswire.org
Sun Jul 22 14:38:18 MST 2012
I had mentioned earlier that I'd send something on this. These thoughts are from working on a few modules and on osis2mod.
There are several things that play into this:
1) Titles: These use the <title> element for their content. This has been the focus of much of the discussion. The Show/Hide Heading filter was designed with this in mind. Later, the ability to always show canonical titles (e.g. Psalm titles) was added.
2) Rich content in titles. Canonical titles are the prime example of this: they can carry Strong's numbers and morphology info, markup for the Divine name, notes, and so on.
3) Sections. The OSIS spec suggests that a title should be within and at the top of a <div type="section"> element. Sections typically surround verses. That is, the <div> and </div> should be between verses.
4) Paragraphing. The <p> element typically surrounds verses. Paragraphs are often inside sections. Likewise, the <p> and </p> should be between verses. (Note: <p/> (an empty paragraph) is just plain bad form.)
5) Split verses. A verse may be split by titles, sections and paragraphs. I don't particularly like it, but I've seen it. I could very well be wrong, but I think it is an artifact of a translation using a KJV versification but disagreeing about where the verses really start and end.
6) Poetry. This uses three elements <lg>, <l> and <lb> (from memory) to create a group of stanzas where each might be split over several lines. Poetry often starts in the middle of a verse and may end within a verse. But it is not uncommon for it to surround verses. That is to say, we can expect these elements between verses too.
7) Arbitrary interverse content. Introductory material can be pretty much anything. Typically we expect this at the beginning of Bible books and even chapters. It is not unreasonable for it to occur between verses within a chapter, as in a study Bible.
8) Block element handling. HTML agents have special handling of nested block elements. Simplistically, a block element start that follows one or more block starts is treated specially, often coalescing vertical whitespace. If the block element has particular visual styling (margins, padding, indentation, ...), it is applied. I mention this because there have been numerous comments about too much vertical whitespace. In handling vertical whitespace, I think a distinction needs to be made between structural markup, which needs to be retained even if titles, headings and introductions are hidden, and the hideable content itself.
9) osis2mod transforms from BSP (Book/Section/Paragraph) into BCV (Book/Chapter/Verse). This allows a verse in isolation to be valid XML. As a result, <div> (and other block elements) no longer behave like HTML containers.
10) x-preverse markup. Currently osis2mod is using (where %d is a matched pair):
<div type="x-milestone" subType="x-preverse" sID="pv%d"/>...pre-verse content...<div type="x-milestone" subType="x-preverse" eID="pv%d"/>
Note: These are merely milestones and should never produce whitespace of any kind. The only purpose of the construct is to know what is before the verse. A problem is that the Show/Hide Headings filter treats this as something that can be toggled. It may contain much that needs to be retained. (see 8)
11) Retention of all markup (except the <verse> element) in the order that it appears in the input. Module authors are going far beyond a simple markup of just the basic verse content. We've published best practices in the wiki for marking up various things. If they are followed, the markup should have a reasonable rendition in a module. (Please, let's not digress into the verse element discussion. It doesn't change the problem at hand.)
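To make points 8 and 10 concrete, here is a minimal Python sketch of one possible way to treat the x-preverse span when headings are hidden: drop the milestone pair itself (so it can never produce whitespace), remove non-canonical <title> elements, and keep everything else in the span. This is not osis2mod or SWORD filter code. The names PREVERSE, TITLE and hide_headings are hypothetical, the sample entry is made up for illustration, and regular expressions stand in for real XML parsing.

```python
import re

# Matches one x-preverse milestone pair and captures what lies between.
# The backreference \1 requires the sID and eID to carry the same pv<N> value.
PREVERSE = re.compile(
    r'<div type="x-milestone" subType="x-preverse" sID="(pv\d+)"/>'
    r'(.*?)'
    r'<div type="x-milestone" subType="x-preverse" eID="\1"/>',
    re.DOTALL,
)

# Matches a <title> element unless its start tag carries canonical="true".
# Assumption: titles are not nested inside one another.
TITLE = re.compile(
    r'<title\b(?![^>]*\bcanonical="true")[^>]*>.*?</title>',
    re.DOTALL,
)


def hide_headings(entry):
    """Hide non-canonical titles in pre-verse content, keep other markup.

    The milestone markers are always removed; only the content between
    them is filtered.
    """
    def keep_structure(match):
        # group(2) is the pre-verse content between the two milestones.
        return TITLE.sub("", match.group(2))

    return PREVERSE.sub(keep_structure, entry)


# Illustrative entry: a milestoned section start plus a heading before the verse.
entry = (
    '<div type="x-milestone" subType="x-preverse" sID="pv1"/>'
    '<div type="section" sID="gen1"/><title>The Creation</title>'
    '<div type="x-milestone" subType="x-preverse" eID="pv1"/>'
    'In the beginning...'
)
print(hide_headings(entry))
# prints: <div type="section" sID="gen1"/>In the beginning...
```

The section start survives while the heading is hidden, and a title marked canonical="true" would pass through untouched. A real filter would also have to decide what to do with introductions and other hideable content, which this sketch leaves alone.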
Troy suggested bolstering the test case. I'm not at all sure how to go about doing that. Especially the expected output.
Hope this helps.
In His Service,