[sword-devel] osis2mod and container elements
chrislit at crosswire.org
Wed Jul 2 09:53:45 MST 2008
DM Smith wrote:
> Chris Little wrote:
>> It looks like I've finally found _one_ thing OSIS can't encode, related
>> to this. OSIS is unable to encode a paragraph that spans consecutive
>> books (because <p/> can't contain <div/>).
>> There is actually a case in which this is needed, across the
>> 1Kings/2Kings boundary in the Hebrew.
> One can use the milestoned version of div. Mechanically, it is valid xml.
"<p><div/></p>" isn't valid in OSIS 2.1.1.
> This would be bad OSIS.
> The problem with milestones is that it allows for free-for-all documents.
> The only reason to have milestones is to have two different markup
> systems at the same time.
In the case I mention here, there is paragraph identification, but those
paragraphs in one instance cross a book boundary. It's another case of a
piece of real, inherent textual structure competing with a somewhat
arbitrary piece of structure that happens to overlap in a way not
permitted by strict containment.
> My rule of thumb:
> If the element cannot contain another, then using milestones to work
> around it is wrong.
> I think we need to have a smart OSIS validator that checks two things
> that an XML validator cannot.
> 1) Ignoring BSP, does BCV follow strict containment rules?
> 2) Ignoring BCV, does BSP follow strict containment rules?
> Basically, if one were to get rid of Book, Chapter and Verse divisions,
> all milestoneable elements should be able to be converted to their
> container form and the document be valid OSIS.
> Likewise, if one were to get rid of BSP elements and convert milestoned
> versions of Book, Chapter and Verse to their container form, the
> document should be valid OSIS.
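
A rough sketch of the two-pass check described above, in Python. This is
only an illustration under stated assumptions, not an existing tool: it
assumes the document has already been flattened into (kind, name) start/end
events, with container and milestoned forms treated alike. The function
name, the event format, the element names, and the membership of each
element family are all hypothetical.

```python
def is_strictly_nested(events, keep):
    """Return True if the kept elements open and close in strict LIFO order.

    events: iterable of (kind, name) pairs, kind being "start" or "end".
    keep:   set of element names to check; all other elements are ignored.
    """
    stack = []
    for kind, name in events:
        if name not in keep:
            continue  # ignore elements from the other markup family
        if kind == "start":
            stack.append(name)
        elif not stack or stack.pop() != name:
            return False  # this end does not close the innermost open element
    return not stack  # anything still open means an unclosed element


# Hypothetical event stream: one paragraph that spans a book boundary.
events = [
    ("start", "book"), ("start", "chapter"), ("start", "p"),
    ("start", "verse"), ("end", "verse"),
    ("end", "chapter"), ("end", "book"),
    ("start", "book"), ("start", "chapter"),
    ("start", "verse"), ("end", "verse"),
    ("end", "p"),
    ("end", "chapter"), ("end", "book"),
]

BCV = {"book", "chapter", "verse"}  # assumed membership of the BCV family
SP = {"section", "p"}               # assumed section/paragraph family

bcv_ok = is_strictly_nested(events, BCV)               # BCV on its own
sp_ok = is_strictly_nested(events, SP)                 # paragraphs on their own
mixed_ok = is_strictly_nested(events, {"book", "p"})   # book and p together
print(bcv_ok, sp_ok, mixed_ok)
```

Each family nests cleanly when checked alone, while checking book and p
together reports the overlap. Matching is by element name only; a fuller
version would presumably pair milestones by their sID/eID values.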
That's fine, but in this case, with a Hebrew Bible, certain book
divisions are arbitrary, being based not on the textual structure so
much as on limits of the physical medium itself (i.e. scroll length).
I ended up simply making every book begin and end with a paragraph
boundary, but at the same time, I'm aware that I'm not encoding what is
actually in the text.