[sword-devel] OSIS Schema

Greg Hellings greg.hellings at gmail.com
Sun Oct 14 18:50:43 MST 2012


It almost seems like having a regular schema is close to worthless
here. Because of the desire for milestoneable objects, it will nearly
always be trivial to produce "valid" OSIS documents that are not
semantically meaningful (compare 'Colorless green ideas sleep
furiously': grammatical, but nonsense). Perhaps the focus should be on
providing scripts that are simple to run and straightforward in their
output, and that report XML constructs in OSIS which validate but are
semantically ridiculous?

I'm not saying that having more targeted schemas isn't a good idea,
but I have on more than one occasion produced XML gibberish that
validated just fine and resulted in some amusing module mishaps,
because the schema could not capture that I was semantically far off
the mark. So perhaps less emphasis should be placed on schemas than on
pre-processing/validating tools, which can carry more meaning and give
more helpful output to the user.
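
To make that concrete, here is a minimal sketch of the kind of script I
mean, in Python. It is only an illustration under stated assumptions,
not an existing SWORD or JSword tool: check_osis, local and the warning
wording are hypothetical names made up for this example, and the checks
are limited to the three problem cases and the "one form per element"
rule from the message quoted below.

```python
import xml.etree.ElementTree as ET

# Elements listed in the quoted message as allowing a milestone form.
MILESTONEABLE = {"abbr", "chapter", "closer", "div", "foreign", "lg", "l",
                 "q", "salute", "seg", "signed", "speech", "verse"}


def local(tag):
    """Strip an XML namespace such as '{uri}div' down to 'div'."""
    return tag.rsplit("}", 1)[-1]


def check_osis(xml_text):
    """Return warnings for constructs that may validate but look wrong.

    Hypothetical helper: the checks and messages are assumptions made
    for illustration, not rules taken from the OSIS schema itself.
    """
    warnings = []
    open_ms = []   # (element name, sID) for milestones started, not yet ended
    forms = {}     # element name -> forms seen: "milestone" and/or "container"

    def walk(elem, ancestors):
        name = local(elem.tag)
        sid, eid = elem.get("sID"), elem.get("eID")

        if sid is not None:
            forms.setdefault(name, set()).add("milestone")
            open_ms.append((name, sid))
            # Text right after a milestoned <lg> start is not inside any <l>.
            if name == "lg" and elem.tail and elem.tail.strip():
                warnings.append(f"text: bare text directly after <lg sID={sid!r}/>")
        elif eid is not None:
            if open_ms and open_ms[-1] == (name, eid):
                open_ms.pop()              # properly nested end milestone
            elif (name, eid) in open_ms:
                other = open_ms[-1]        # something else is still open
                warnings.append(
                    f"overlap: <{name} eID={eid!r}/> ends while "
                    f"<{other[0]} sID={other[1]!r}/> is still open")
                open_ms.remove((name, eid))
            else:
                warnings.append(f"unmatched: <{name} eID={eid!r}/> has no start")
        else:
            forms.setdefault(name, set()).add("container")

        # A poetry line should sit in a line group, container or milestoned.
        if name == "l" and eid is None:
            in_lg = "lg" in ancestors or any(n == "lg" for n, _ in open_ms)
            if not in_lg:
                warnings.append("placement: <l> outside any <lg>")

        for child in elem:
            walk(child, ancestors + (name,))

    walk(ET.fromstring(xml_text), ())

    for name, sid in open_ms:
        warnings.append(f"unclosed: <{name} sID={sid!r}/> is never ended")
    for name in sorted(forms):
        if name in MILESTONEABLE and len(forms[name]) == 2:
            warnings.append(f"mixed: <{name}> used as container and as milestone")
    return warnings


if __name__ == "__main__":
    sample = ('<osisText><div sID="x"/><lg sID="y"/>'
              '<div eID="x"/><lg eID="y"/></osisText>')
    for line in check_osis(sample):
        print(line)
```

Note that the overlap check flags every pair of crossing milestones,
including ones that are intentional (a verse crossing a quote, say), so
a real tool would need a list of allowed crossings. The "mixed" check
would also need to special-case div if its form really depends on type.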

--Greg

On Sun, Oct 14, 2012 at 8:21 PM, DM Smith <dmsmith at crosswire.org> wrote:
>
> On Oct 14, 2012, at 9:19 PM, Daniel Owens <dcowens76 at gmail.com> wrote:
>
>> On 10/14/2012 06:19 PM, DM Smith wrote:
>>> The OSIS schema is a bit convoluted in how it allows two different document models. I've been thinking that it might make sense to have three distinct OSIS schemas. The one we have now would be one of the three. The other two would be for the other two document models.
>>>
>>> The problem I'm coming up against is that because nearly every "container" element has a milestone form, everything goes. Some examples:
>>> 1) milestoned elements allow for overlapping containers. e.g. <div sID="x"/><lg sID="y"/><div eID="x"/><lg eID="y"/>
>>> 2) text is allowed where it should not be. e.g. <lg sID="x"/>text<lg eID="x"/>
>>> 3) elements are allowed where they should not be. e.g. <div><l>...</l></div>
>>>
>>> When these things happen, the SWORD and JSword engines may not produce the desired results and they are very hard to diagnose.
>>>
>>> For best practice in creating an OSIS document, we recommend that book, chapter, div, lg, l, ... not be milestoned, and that verse elements be milestoned. We call this BSP (Book/Section/Paragraph).
>>> I think one of the schemas should properly represent this.
>>>
>>> The following allow for milestones:
>>> abbr
>>> chapter
>>> closer
>>> div
>>> foreign
>>> lg
>>> l
>>> q
>>> salute
>>> seg
>>> signed
>>> speech
>>> verse
>>>
>>> The "rule" is that within a document an element be used either as milestoned form or as container form, but not both.
>>>
>>> The <div> element is funny in that the schema requires that the div not be milestoned, but allows for milestoned markup. I take this to mean that the combination of an element with the value of type should be used to determine the form.
>>>
>>> Regarding a BSP OSIS schema, the verse element would be milestonable.
>>>
>>> Of the other elements above, I don't see that one would ever have to milestone abbr, closer, foreign, salute, signed.
>>>
>>> "q" for quotes serves two purposes: marking quotations (what the marks are and where they go) and designating who is speaking. The latter is used to mark the words of Jesus. The <milestone> element is a mechanism to mark continuing quotes. These need to be allowed to be milestoned. It is highly likely that a richly structured document will have at least one occurrence that requires it.
>>>
>>> Since speech is an analogous form for q, it will need to be milestoneable.
>>>
>>> Poetry (lg, l) can certainly cross chapters, but it can be artificially started and stopped so as to not cross boundaries.
>>>
>>> seg is problematic. The OSIS manual defines it as part of <word> [sic, they meant <w>] and for marking inline text with a type, e.g. type="benediction". I don't see that it needs to be milestoned.
>>>
>>> I've seen one example of where chapter is crossed by div (last verse of John 7 and first 11 verses of chapter 8, marked as "problematic text"), but I'm not sure that it needs to be milestonable.
>>>
>>> Any thoughts?
>>>
>>> In Him,
>>>      DM
>>>
>> So, would the three models be:
>> BSP
>> BCV
>> Div (used in genbooks)?
>
> The first two, not div. The third would be the current schema.
>
>>
>> I don't have a strong opinion about this, but I wanted to make sure I understood what you were proposing.
>>
>> Daniel
>>
>> _______________________________________________
>> sword-devel mailing list: sword-devel at crosswire.org
>> http://www.crosswire.org/mailman/listinfo/sword-devel
>> Instructions to unsubscribe/change your settings at above page
>
>


