[osis-core] osisID: Summary and Proposal

Patrick Durusau osis-core@bibletechnologieswg.org
Sun, 30 Jun 2002 08:34:05 -0400


Greetings,

Sorry I did not get this stuff posted yesterday as promised. I got 
distracted by the XSLT problem of documenting the schema (unsuccessfully 
I might add) and have laid it to one side for post-OSIS 1.1 work.

I have tried to separate out the issues into a series of posts and have 
tried to summarize the various pending posts along with a suggested 
syntax for the OSIS schema.

osisID: The "who am I" question

The traditional ID mechanism of SGML/XML carries with it certain 
syntactic constraints (such as not beginning with a number, therefore no 
1John with traditional ID) and is meant to be used with IDREF for 
internal document referencing (in part, that is not the full story). 
ID's must be unique in a document instance.

The decision in Rome was to abandon the traditional ID datatype so as to 
avoid changing the traditional practice of writing 1John to John1, for 
example.

Therefore, whatever the eventual syntax, the osisID is actually based on 
xs:string, so as to allow 1John, 2Kgs, etc.
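
As a rough illustration of why the traditional ID datatype was a problem
(not part of the proposal itself), here is a minimal Python sketch. It
assumes an ASCII-only approximation of the XML name rule that ID values
must follow; the real rule admits many more Unicode characters. The
helper name looks_like_xml_id is made up for this example.

```python
import re

# ASCII-only approximation of an XML NCName (assumption: the real rule
# also admits many non-ASCII letters). The first character must be a
# letter or underscore; digits, "." and "-" may only appear after it.
_NCNAME_ASCII = re.compile(r"[A-Za-z_][A-Za-z0-9_.\-]*")


def looks_like_xml_id(value: str) -> bool:
    """Return True if value could serve as a traditional XML ID."""
    return _NCNAME_ASCII.fullmatch(value) is not None


# Names that start with a digit fail the check; a plain string type
# accepts all of them.
for candidate in ["John1", "1John", "2Kgs", "Matt.3.2"]:
    print(candidate, looks_like_xml_id(candidate))
```

Under this approximation John1 and Matt.3.2 pass while 1John and 2Kgs
do not, which is the case the move to xs:string is meant to cover.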

Another question that has arisen is where do we use the osisID?

It has been suggested that we could use osisID on verse, Matt.1.5, etc., 
but that leaves us unable to identify larger divisions of Bibles, such 
as the fifth chapter of Job, suggesting Job.5 or even an entire book, Gen.

But, since OSIS will hopefully have a range of application beyond Bible 
texts, such as commentaries (both modern and ancient) as well as related 
works, the osisID must be applicable to any canonical referencing system 
and able to represent any level of that system. (This implies that an 
osisID can occur on any element that represents a division in that 
referencing system.)

At a minimum, I think that the osisID should be able to appear on any 
element that represents a division in the referencing system, thereby 
allowing books, chapters, verses, etc., to identify themselves within 
such a system. The same would be true for Josephus or any other work 
with a known referencing system.

(In a separate post I will be treating documents, like the CEV, that 
reference but do not use (in my opinion) a canonical reference system.)

(Some of our confusion may be due to my conflating osisID and osisRef 
syntax in the schema and I will be trying to sort that out for your 
approval.)

Proposal (not all of this is new, just trying to state it all afresh and 
in one place):

osisID will be based on xs:string

osisID's will be used on elements that correspond to an identified 
reference system (work?)

osisID's will NOT have grain or range syntax (see next before responding)

Elements that do not correspond to a division in a reference system may 
use begin/end attributes to indicate a continuous range of material 
based upon a reference system (no discontinuous segments). (Reasoning: 
in simple applications this will resolve to the beginning reference, so 
the user at least gets close to the desired material.)

osisID's will use a dotted syntax that represents the reference syntax 
in use.

Examples:

Matt.3.2 (refers to Book of Matthew, chapter 3, verse 2)

Matt.3 (refers to Book of Matthew, chapter 3)

Matt (refers to Book of Matthew)

Hmmm, question: How do we specify the treatment of a single token? I can 
see how we would do it for Bible texts, one token = book, token + "." 
token = book.chapter, and token + "." token + "." token = 
book.chapter.verse, but I am not sure how we can expand that to Josephus 
and all the other works that people might want to cite. (May be one of 
those questions we want to punt on at the present moment and just 
specify for Bible texts.)
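
The Bible-only reading of the tokens described above can be sketched as
follows. This is a minimal sketch, assuming exactly three levels (book,
chapter, verse) and nothing about other works such as Josephus; the
names BIBLE_LEVELS and interpret_bible_osis_id are invented for the
example.

```python
from typing import Dict

# Level names for Bible texts only (assumption: other works would need
# their own list of levels, which this post leaves open).
BIBLE_LEVELS = ("book", "chapter", "verse")


def interpret_bible_osis_id(osis_id: str) -> Dict[str, str]:
    """Map the dot-separated tokens of an osisID onto Bible levels."""
    tokens = osis_id.split(".")
    # Reject empty tokens ("Matt..2") and more tokens than known levels.
    if "" in tokens or len(tokens) > len(BIBLE_LEVELS):
        raise ValueError(f"cannot interpret {osis_id!r} as book.chapter.verse")
    # zip stops at the shorter sequence, so "Matt" yields only "book".
    return dict(zip(BIBLE_LEVELS, tokens))


print(interpret_bible_osis_id("Matt.3.2"))
print(interpret_bible_osis_id("Matt.3"))
print(interpret_bible_osis_id("Matt"))
```

The fixed tuple of level names is exactly the part that does not
generalize: a work with a different referencing system would need a
different tuple, which is the open question raised above.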

Note that the osisID answers the question of "who am I" for an element 
(or, probably more precisely, "what do I contain").

Suggested syntax:

<xs:attribute name="osisID" type="osisID" use="optional"/>

<xs:simpleType name="osisID">
    <xs:restriction base="xs:string">
        <xs:pattern value="(([^\s]*\.){0,6}([^\s]*))"/>
    </xs:restriction>
</xs:simpleType>
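
To see what the suggested pattern accepts, here is a small Python
sketch. It assumes that Python's re.fullmatch is a close enough stand-in
for the implicit anchoring of schema patterns (Python's \s covers a few
more whitespace characters than the schema's). STRICTER_PATTERN is a
hypothetical alternative for comparison only, not part of the proposal.

```python
import re

# The pattern exactly as suggested in the simpleType above.
OSIS_ID_PATTERN = re.compile(r"(([^\s]*\.){0,6}([^\s]*))")

# Hypothetical tighter variant (an assumption, not from the proposal):
# tokens may not contain "." and may not be empty.
STRICTER_PATTERN = re.compile(r"([^\s.]+\.){0,6}[^\s.]+")


def matches(pattern: "re.Pattern[str]", value: str) -> bool:
    """Whole-string match, mimicking how a schema pattern is applied."""
    return pattern.fullmatch(value) is not None


samples = ["Matt.3.2", "1John", "Matt 3", "", "a.b.c.d.e.f.g.h.i"]
for sample in samples:
    print(repr(sample),
          matches(OSIS_ID_PATTERN, sample),
          matches(STRICTER_PATTERN, sample))
```

Because [^\s] also matches ".", the final group of the suggested pattern
can absorb any number of dotted tokens, so the {0,6} bound does not
limit depth, and the empty string is accepted as well. If a limit of
seven non-empty tokens is what is wanted, something along the lines of
the stricter variant may be closer.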


We would need to add begin/end global attributes to deal with the cases 
where the element contains less (or more) than the common division of 
materials.

<xs:attribute name="osisBegin" type="xs:string" use="optional"/>
<xs:attribute name="osisEnd" type="xs:string" use="optional"/>

Note that I do not suggest these be type="osisID" since they can be 
used on notes for double-ended attachment, and notes may well wish to 
attach to elements that can bear no legitimate osisID, such as words in 
a verse. A word can have an ID (included by default for all elements) 
and a note could use those ID's (subject to the usual rules for ID's) to 
attach itself to a text.

(Note that we currently have work, cite, outwork and outcite, which I 
think are confusing to some degree.)

Forthcoming: osisRef, href on <a>, subject attributes on links and 
other elements, etc.

Patrick

-- 
Patrick Durusau
Director of Research and Development
Society of Biblical Literature
pdurusau@emory.edu