[osis-core] Segmentation.

Todd Tillinghast osis-core@bibletechnologieswg.org
Wed, 5 Jun 2002 12:36:50 -0500


There are several types/classes of hierarchies that could be segmented
using our schema.  Stepping back and looking at the big picture, it
seems to me that we need to determine which types/classes of hierarchies
can be segmented and which elements within each logical hierarchy can be
segmented.

Assuming that more than one hierarchy can be segmented simultaneously,
there need to be clear guidelines that detail which elements go together
to reconstitute the logical elements that were originally segmented.  It
would also be helpful if there were "best practices" regarding the
identifiers used for the xxxID, next, and previous attributes of the
segmented elements.

The trickiest piece seems to be the lowest-level container of "actual"
scripture text.  If we say that "actual" scripture text must always be
directly contained by <verse> (or within <abbr>, <foreign>,
<inscription>, <name>, or a simple <q> contained within <verse>), then
<verse> elements will ALWAYS hold the identifiers that allow us to
reconstitute "pure" verses that were segmented.  However, as it stands
it is POSSIBLE and even NATURAL to encode "actual" scripture text in
<lineGroup>/<line>, <q>, <list>/<item>, <p>, and <blockQuote> without
any <verse> elements at all (or with a mixture including some <verse>
elements).

If we identify a "role" for elements that is "lowest-level container of
'actual' scripture", then when reconstituting the text into logical
verses, elements acting in this "role" could be identified INDEPENDENT of
their element name.  This would allow any element acting in this "role"
to act the same as a <verse> element for the purpose of identification.
In fact, that is what we have said we would like to do with <p> when it
is exactly one verse.  This would eliminate the COMMON cases where you
see

<line><verse verseID="...">...</verse></line>
and 
<p><verse verseID="...">...</verse></p>

and replace them with
<line verseID="...">...</line>
and 
<p verseID="...">...</p>

but it would not prevent
<p>
	<verse verseID="a">...</verse>
	<verse verseID="b">...</verse>
	<verse verseID="c">...</verse>
	<verse verseID="d">...</verse>
</p>
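
As a rough illustration of the simplification above, here is a minimal
Python sketch that folds a lone <verse> child into its container.  The
function name, the ROLE_CONTAINERS set, and the sample markup are
assumptions made for the example, not part of the schema; the set is
simply drawn from the containers mentioned earlier.

```python
import xml.etree.ElementTree as ET

# Hypothetical set of containers allowed to act in the proposed "role".
ROLE_CONTAINERS = {"line", "p", "item", "q", "blockQuote"}
VERSE_ATTRS = ("verseID", "verseNext", "versePrev")


def collapse_single_verse(root):
    """Fold <x><verse verseID="..">t</verse></x> into <x verseID="..">t</x>.

    Containers holding several verses, or text outside the verse,
    are left untouched.
    """
    # Snapshot the elements first, since the tree is modified below.
    for elem in list(root.iter()):
        if elem.tag not in ROLE_CONTAINERS:
            continue
        children = list(elem)
        if len(children) != 1 or children[0].tag != "verse":
            continue
        verse = children[0]
        # Skip containers with real text before or after the verse.
        if (elem.text or "").strip() or (verse.tail or "").strip():
            continue
        # Move the verse identifiers up to the container.
        for name in VERSE_ATTRS:
            if name in verse.attrib:
                elem.set(name, verse.attrib[name])
        # Move the verse content up to the container.
        elem.text = verse.text
        elem.remove(verse)
        for child in list(verse):
            elem.append(child)
    return root


if __name__ == "__main__":
    doc = ET.fromstring(
        '<div><line><verse verseID="a">...</verse></line>'
        '<p><verse verseID="b">...</verse><verse verseID="c">...</verse></p>'
        "</div>"
    )
    collapse_single_verse(doc)
    print(ET.tostring(doc, encoding="unicode"))
```

The <p> holding two verses is left alone, which matches the case that
the simplification is not meant to prevent.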

This does not PRECLUDE the more complicated cases where there are
multiple hierarchies segmented simultaneously:

<p pID="s" next="t">
	<verse verseID="x">...</verse>
	<verse verseID="a" verseNext="b">...</verse>
</p>
<p pID="t" prev="s" verseID="b" versePrev="a">...</p>

For an element to take on this proposed "role", it would simply be
assigned a value for its "verseID" attribute and for the appropriate
"verseNext" and "versePrev" attributes.  If the same element were also
segmented through its participation in another logical hierarchy, then
the element-specific xID attribute and next/prev attributes would be
assigned appropriate values.

SUMMARY:  There are a lot of elements that naturally take on the "role"
of "lowest-level container of 'actual' scripture".  To simplify
encoding, allow a discrete set of elements to all perform the same role
as <verse>.  When reconstituting, simply go to the next element that has
a verseID attribute equal to the current node's verseNext and a
versePrev attribute equal to the current node's verseID.  Other
segmentation would require the element name to be the same as that of
its other parts.  (This makes a special case out of <verse>, which
simplifies encoding and element construction.)
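
The reconstitution step in the summary could be sketched roughly as
follows in Python.  This is only an illustration under assumptions: the
function name is made up, verseID values are assumed to be unique within
the document, and the sample markup mirrors the two-paragraph example
earlier in this message with placeholder text filled in.

```python
import xml.etree.ElementTree as ET


def reconstitute_verse(root, verse_id):
    """Collect the text of every segment chained from verse_id.

    Segments are matched by their verseID/verseNext/versePrev
    attributes only, never by element name.
    """
    # Index every element that carries a verseID, whatever its name.
    by_id = {
        e.get("verseID"): e for e in root.iter() if e.get("verseID") is not None
    }
    parts = []
    seen = set()
    current = by_id.get(verse_id)
    while current is not None:
        current_id = current.get("verseID")
        if current_id in seen:
            raise ValueError("cycle in verseNext chain at %r" % current_id)
        seen.add(current_id)
        parts.append("".join(current.itertext()))
        next_id = current.get("verseNext")
        if next_id is None:
            break
        following = by_id.get(next_id)
        # The next segment must point back at the current one.
        if following is None or following.get("versePrev") != current_id:
            raise ValueError("broken chain after %r" % current_id)
        current = following
    return parts


if __name__ == "__main__":
    doc = ET.fromstring(
        "<div>"
        '<p pID="s" next="t">'
        '<verse verseID="x">whole verse</verse>'
        '<verse verseID="a" verseNext="b">first half, </verse>'
        "</p>"
        '<p pID="t" prev="s" verseID="b" versePrev="a">second half</p>'
        "</div>"
    )
    print(reconstitute_verse(doc, "a"))
```

Note that the <p> with verseID="b" is picked up even though it is not a
<verse> element, which is the point of treating the "role" separately
from the element name.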

PROPOSAL:  Create an abstract type that defines the attributes and
possible child elements of an element acting in the proposed "role".
Derive all elements that can act in this role from this type.

Todd