[osis-core] Harry on osisID's

Patrick Durusau osis-core@bibletechnologieswg.org
Sun, 30 Jun 2002 10:54:01 -0400


Harry,

A couple of quick responses based on my post earlier today on osisID.

Harry Plantinga wrote:

>Troy,
>
>Yes, my understanding is that the id="..." attribute does the
>"I am this" function, and it can occur on any element. (I wouldn't
>want to have to put <verse> elements throughout Augustine's Confessions
>to be able to tell where various books, chapters, sections are.)
>
>I also agree with Steve's suggestion that it ought to have a more
>distinctive name, maybe osisID or sectionID or canonicalID.
>
>Q. Do they have to be unique within a document?
>
This raises the interesting question posed by Chris:

> But we need to allow for the two Mark.16.9's since a large number of 
> Bibles include both.


and the question about having both Greek and English texts of the same 
verse.

The osisID, since it has xs:string for a base, does not from an XML 
syntax standpoint have to be unique in a document. However, Harry's 
point about finding the element you want to point to is well taken.

In prose we should caution that in such cases, the osisID has to 
unambiguously identify the element.

In the case of Greek and English texts, we could prepend the edition to 
the osisID (simply including the work in the header would not resolve 
this, at least I don't think so, without a mechanism to tie the osisID 
to a particular work in the header), thus:

<verse osisID="NA27.Matt.1.1"> for Greek, and <verse 
osisID="KJV.Matt.1.1"> for English. Note that if this is done uniformly, 
I can parse for all the "Matt.1.1" without regard to edition if I so 
desire.
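
As a rough sketch of that kind of parsing (Python, purely for 
illustration): the function names and the set of known edition prefixes 
are assumptions, since nothing here settles how an edition prefix is 
told apart from a book name.

```python
# Hypothetical set of edition prefixes; not defined by the schema.
KNOWN_EDITIONS = {"NA27", "KJV"}


def split_osis_id(osis_id, known_editions=KNOWN_EDITIONS):
    """Split 'NA27.Matt.1.1' into ('NA27', 'Matt.1.1').

    An osisID with no recognized edition prefix comes back as
    (None, osis_id) unchanged.
    """
    head, _, rest = osis_id.partition(".")
    if head in known_editions and rest:
        return head, rest
    return None, osis_id


def find_passage(osis_ids, passage, known_editions=KNOWN_EDITIONS):
    """Return every osisID naming `passage`, without regard to edition."""
    return [
        osis_id
        for osis_id in osis_ids
        if split_osis_id(osis_id, known_editions)[1] == passage
    ]
```

Called with the two verses above, find_passage(ids, "Matt.1.1") would 
pick up both the NA27 and the KJV element.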

This also raises the question of validation, which I think should be 
addressed in supplemental schema modules. We could build upon Todd's 
earlier regex to validate osisID's that are held in an electronic edition 
of the Bible for instance. I would strongly prefer to handle content 
validation separate from the core modules as we cannot anticipate all 
the various editions that people would reasonably wish to use.

As for validation of uniqueness: since we are not relying on ID, and in 
any event our identifiers are not valid IDs (because of prepended 
numbers; an XML ID cannot begin with a digit), I think uniqueness would 
have to be validated by a Perl script or something similar. That would 
be a useful tool for a number of things.
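
A minimal sketch of such a checker, written in Python rather than Perl 
and assuming only that the identifier lives in an unprefixed osisID 
attribute; the function name is made up for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def duplicate_osis_ids(xml_text):
    """Return a sorted list of osisID values that occur more than once."""
    root = ET.fromstring(xml_text)
    # Count the osisID attribute on every element, whatever its tag name.
    counts = Counter(
        element.get("osisID")
        for element in root.iter()
        if element.get("osisID") is not None
    )
    return sorted(value for value, n in counts.items() if n > 1)
```

Run against a Bible that carries both endings of Mark, this would 
report Mark.16.9 as a duplicate, which is exactly the case where the 
prose caution about unambiguous identification applies.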

>
>Q. Is there a way of defining the class of legal canonicalIDs for
>a document?
>
Yes, we had a massive regex early in the project for that very purpose. 
The problem is that if we put it in the core, then it is required for 
all documents, which is probably not the best choice.

Instead, just redefine the permissible content for the syntax regex. The 
core module validates that you used the proper syntax, and for a given 
project you have a very short schema that invokes the core and 
redefines the regex to fit whatever you think are the proper values, not 
just for osisID but for osisRef and other attribute values.
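
The layering can be illustrated outside the schema with a small Python 
sketch. Both patterns are invented for the example (they are not the 
project's regex): a loose core pattern that checks only the dotted 
syntax, and a narrower project pattern that a short per-project module 
would substitute.

```python
import re

# Assumed core rule: dot-separated alphanumeric segments, nothing more.
CORE_SYNTAX = re.compile(r"[A-Za-z0-9]+(\.[A-Za-z0-9]+)*")

# Assumed project rule: edition, then one of four books, chapter, verse.
PROJECT_VALUES = re.compile(r"(NA27|KJV)\.(Matt|Mark|Luke|John)\.\d+\.\d+")


def validate(osis_id, project_pattern=None):
    """Check core syntax first, then the project's values if supplied."""
    if not CORE_SYNTAX.fullmatch(osis_id):
        return False
    if project_pattern is not None and not project_pattern.fullmatch(osis_id):
        return False
    return True
```

With only the core rule, "X.iii.5" from the Confessions passes; with 
PROJECT_VALUES supplied it fails, while "KJV.Matt.1.1" passes both.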

Patrick


>
>-Harry
>
>>-----Original Message-----
>>From: owner-osis-core@bibletechnologieswg.org
>>[mailto:owner-osis-core@bibletechnologieswg.org]On Behalf Of Troy A.
>>Griffitts
>>Sent: Wednesday, June 26, 2002 11:03 PM
>>To: osis-core@bibletechnologieswg.org
>>Subject: Re: [osis-core] scripCom
>>
>>
>>I STILL feel we're all on different pages.  Let me just tell you what
>>page I feel we're all on and you can correct me.
>>
>>
>>I thought all "I am this" marking in a text were to use the <verse>
>>element.  These <verse> tagged sections of text would then become valid
>>targets of our <reference> tag.
>>
>>I think Steve has stated this same thing below:
>>
>> > A: The GNT, KJV, or any other version of Matthew 1:1:
>> >
>> >   <verse ref="Matt.1.1">
>> >
>> >   This is the "I am" case -- in effect, it means that this text claims
>> > to be some version of the identified passage, and should thus be
>> > appropriate as the target of any reference to that passage. This is
>> > faintly analogous to XML IDs.
>>
>>Question for Steve:  How would you markup "I am this" in Harry's example
>>below:
>>
>> > How do I say that an element is Augustine's confessions X.iii.5?
>>
>> >  <div id="X.iii.5"> together with something in the header which
>> >    says that this is augustine.confessions?
>>
>>Is this "I am this" tag what we were calling an *inRef*?
>>
>>
>>I think I may have been stating my position poorly in previous emails.
>>Let me restate some of my concerns.
>>
>>
>>I think Patrick is suggesting that we mark "I am this" with ANY element
>>we want using the ID attribute.  I think Harry may also be suggesting
>>the same.
>>
>>I think it is more coherent to keep the SAME tag everywhere (this is
>>where it sounds like Steve misunderstood me) for declaring "I am this"--
>>currently <verse>, and the SAME tag (though NOT the same as the "I am
>>this" tag) to designate a <reference>.
>>
>>
>>>  <ref word="Bible.NIV...." ref="Matt.1.1-Matt.1.4">
>>>
>>assuming:
>>	<reference work="Bible.NIV" cite="Matt.1.1-Matt.1.4">
>>
>>I looked thru the xsd and couldn't find ref= to be valid.
>>
>>
>>>  This is the reference or outRef case, which specifically means the
>>>text at this point is *not* claiming to be an edition of the identified
>>>passage, but a place that is relevant to understanding it (or vice
>>>versa). This is faintly analogous to XML IDREFs.
>>>
>>
>>This is the inRef/outRef pair I understood, as well: <verse> = inRef;
>><reference> = outRef
>>
>>
>>I think Patrick has a different definition of inRef/outRef, as stated
>>below by Patrick:
>>
>> > I think the inRef and outRef syntax is a hold over from when we were
>> > talking about validating the content of pointers and so it made a
>> > difference if you were pointing into an OSIS document (we could
>> > validate) versus pointing at a non-OSIS document from within one, we
>> > could not validate. I am not sure the distinction is meaningful with
>> > our current syntax.
>>
>>
>>Steve, if I understand your statement below, I think I would categorize
>>this differently.
>>
>>>I think, though, that we also have two possible subtypes of B:
>>>
>>>   B1) This is a link to that passage, intended mainly to get you there
>>>
>>>   B2) This marks content that is generally "about" that passage
>>>
>>I would say that a <reference> tag should look something like this
>>excerpt from Matthew Henry's Commentary:
>>
>>
>>Thus doth God frustrate his enemies by frightening them, <reference
>>work="Bible.KJV" cite="Ps.9.20">Ps. ix. 20</reference>.
>>
>>A <reference> doesn't seem like it would include things like you list
>>below, but could.
>>
>>
>>>"I am a commentary (or portion) *about* Matt.1.1"
>>>"I am a sermon (or portion) *about* Matt.1.1"
>>>"I am a reader response annotation *about* Matt.1.1"
>>>"I am an exposition (or portion) *about* Matt.1.1"
>>>"I am a poeticRendering (or portion) *about* Matt.1.1"
>>>
>>
>>I think you and Patrick are both misunderstanding what Harry is
>>asking for.  Matthew Henry's Commentary is divided into sections like:
>>
>>Matthew 28:1-10:
>>
>>The Resurrection.
>>      1 In the end of the...
>>[ more commentary on Matthew 28:1-10 ]
>>
>>
>>There are many of these verse by verse commentaries-- in fact every one
>>of the commentaries we have for our software is divided up exactly like
>>this.
>>
>>If I understand Harry correctly, he would like to tag these sections of
>>text with something like:
>>
>><div id="Matthew 28:1-10" type="scriptCom">
>>Matthew 28:1-10:
>>
>>The Resurrection.
>>       1 In the end of the...
>>[ more commentary on Matthew 28:1-10 ]
>></div>
>>
>>
>>
>>I told Harry that we used <verse> to mark these sections when exporting
>>MHC for the OSIS 1.0 spec.  e.g.
>>
>>
>><verseStart ref="Matthew.28.1" />
>><verseStart ref="Matthew.28.2" />
>><verseStart ref="Matthew.28.3" />
>><verseStart ref="Matthew.28.4" />
>><verseStart ref="Matthew.28.5" />
>><verseStart ref="Matthew.28.6" />
>><verseStart ref="Matthew.28.7" />
>><verseStart ref="Matthew.28.8" />
>><verseStart ref="Matthew.28.9" />
>><verseStart ref="Matthew.28.10" />
>>Matthew 28:1-10:
>>
>>The Resurrection.
>>       1 In the end of the...
>>[ more commentary on Matthew 28:1-10 ]
>>
>><verseEnd ref="Matthew.28.10"/>
>><verseEnd ref="Matthew.28.9"/>
>><verseEnd ref="Matthew.28.8"/>
>><verseEnd ref="Matthew.28.7"/>
>><verseEnd ref="Matthew.28.6"/>
>><verseEnd ref="Matthew.28.5"/>
>><verseEnd ref="Matthew.28.4"/>
>><verseEnd ref="Matthew.28.3"/>
>><verseEnd ref="Matthew.28.2"/>
>><verseEnd ref="Matthew.28.1"/>
>>
>>
>>This is cheesy, but it is how we have to mark up Bibles.  Steve also
>>thinks this, as per his quote below:
>>
>> > Thus, for A one cannot say this is "Matthew 1:1-3"; if that is the
>> > case one must encode all 3 verse references there
>>
>>
>>And I was using this same method for marking up a commentary (MHC).
>>
>>
>>Just random thoughts and requests for confirmation,
>>
>>	-Troy.
>>
>>

-- 
Patrick Durusau
Director of Research and Development
Society of Biblical Literature
pdurusau@emory.edu