[osis-core] Self-identification support for <verse> elements that span multiple verse identifiers and <verse> elements that are only a part of a verse identifier

Todd Tillinghast osis-core@bibletechnologieswg.org
Tue, 23 Jul 2002 11:10:50 -0600


Regarding the problem raised in the previous two posts, it seems there
are three possible options we could take to address it.
OPTION 1:
Make osisID a list of osisIDs as we currently have defined them.  Create
a second attribute that UNIQUELY identifies a <verse> element for the
purpose of segmentation.  This could work nicely because we don't really
want a unique osisID for each verse; what we seem to want is a
"standard" and reliable mechanism for verses to self-identify.  Given
that assumption, there are cases where a single "block of text" (verse)
is only reliably identified by several identifiers.  Further, there are
cases where more than one <verse> element is identified by the same
identifier.  In that second case the "logical verse" is REALLY segmented
across more than one <verse> element.  It might be reasonable to provide
a prev/next mechanism for the logical verse segmentation.

EX:
<verse osisID="Matt.1.1">...</verse>
<verse osisID="Matt.1.2 Matt.1.3 Matt.1.4 Matt.1.5 Matt.1.6"
logicalIDNext="Matt.1.6 Matt.1.7 Matt.1.8 Matt.1.9 Matt.1.10 Matt.1.11"
logicalIDSplitNext="Matt.1.6">...</verse>
<verse osisID="Matt.1.6 Matt.1.7 Matt.1.8 Matt.1.9 Matt.1.10 Matt.1.11"
logicalIDPrev="Matt.1.2 Matt.1.3 Matt.1.4 Matt.1.5 Matt.1.6"
logicalIDSplitPrev="Matt.1.6">...</verse>
This allows almost all verses to be identified by a single identifier,
but also accommodates the ugly cases where a logical verse as defined by
the reference system is split into more than one <verse> element, as
well as cases where more than one verse is combined into a single
<verse> element.  All this while still providing a mechanism for <verse>
elements to self-identify using the more "standard" and common
identifiers.

Better attribute names should be chosen.
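
To make the prev/next idea concrete, here is a rough Python sketch of how
a consumer might reassemble a split logical verse by following the links.
It assumes the provisional attribute names from the example above
(logicalIDNext, logicalIDSplitNext, logicalIDSplitPrev) and assumes that
logicalIDNext repeats the next element's osisID value exactly.  The
wrapping <div>, the bracketed placeholder text, and the function name are
mine, for illustration only.

```python
import xml.etree.ElementTree as ET

# The three <verse> elements from the example, with placeholder text.
SAMPLE = """
<div>
  <verse osisID="Matt.1.1">[text 1]</verse>
  <verse osisID="Matt.1.2 Matt.1.3 Matt.1.4 Matt.1.5 Matt.1.6" logicalIDNext="Matt.1.6 Matt.1.7 Matt.1.8 Matt.1.9 Matt.1.10 Matt.1.11" logicalIDSplitNext="Matt.1.6">[text 2-6a]</verse>
  <verse osisID="Matt.1.6 Matt.1.7 Matt.1.8 Matt.1.9 Matt.1.10 Matt.1.11" logicalIDPrev="Matt.1.2 Matt.1.3 Matt.1.4 Matt.1.5 Matt.1.6" logicalIDSplitPrev="Matt.1.6">[text 6b-11]</verse>
</div>
"""


def logical_verse_pieces(root, split_id):
    """Return, in order, the <verse> elements that share the split verse."""
    # Key each element by its complete osisID attribute value.
    by_osis_id = {v.get("osisID"): v for v in root.iter("verse")}
    # The first piece points forward at split_id but not backward at it.
    current = next(
        (v for v in by_osis_id.values()
         if v.get("logicalIDSplitNext") == split_id
         and v.get("logicalIDSplitPrev") != split_id),
        None,
    )
    pieces = []
    # Follow logicalIDNext while the split continues; stop on a repeat.
    while current is not None and current not in pieces:
        pieces.append(current)
        if current.get("logicalIDSplitNext") != split_id:
            break
        current = by_osis_id.get(current.get("logicalIDNext"))
    return pieces


root = ET.fromstring(SAMPLE)
for piece in logical_verse_pieces(root, "Matt.1.6"):
    print(piece.text)
```

A verse that is not split (Matt.1.1 here) simply yields no pieces, so the
extra attributes only matter in the ugly cases.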

(Note: It is possible that the same <verse> element and its associated
text could be retrieved more than once if the accessing process were
"fetching" explicitly using two or more of the identifiers that a
<verse> element uses to self-identify.  It is also possible that there
would be more than one match for a single identifier.  But in these
cases that is simply the nature of translation.)
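
The retrieval behavior in the note can be sketched in a few lines of
Python.  This is only an illustration under the Option 1 assumption that
osisID is a whitespace-separated list; the wrapping <div>, the bracketed
placeholder text, and the helper names build_index and fetch are
hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Option 1 example, with placeholder text in place of the verse content.
SAMPLE = """
<div>
  <verse osisID="Matt.1.1">[text 1]</verse>
  <verse osisID="Matt.1.2 Matt.1.3 Matt.1.4 Matt.1.5 Matt.1.6">[text 2-6a]</verse>
  <verse osisID="Matt.1.6 Matt.1.7 Matt.1.8 Matt.1.9 Matt.1.10 Matt.1.11">[text 6b-11]</verse>
</div>
"""


def build_index(root):
    """Map each individual identifier to the <verse> elements listing it."""
    index = defaultdict(list)
    for verse in root.iter("verse"):
        for osis_id in verse.get("osisID", "").split():
            index[osis_id].append(verse)
    return index


def fetch(index, osis_ids):
    """Return matching <verse> elements in request order, each only once."""
    seen = set()
    result = []
    for osis_id in osis_ids:
        for verse in index.get(osis_id, []):
            if id(verse) not in seen:  # same element reached via another ID
                seen.add(id(verse))
                result.append(verse)
    return result


root = ET.fromstring(SAMPLE)
index = build_index(root)

# One identifier, two matches: Matt.1.6 is split across two elements.
print([v.text for v in fetch(index, ["Matt.1.6"])])
# Two identifiers, one element: the duplicate is suppressed by fetch().
print([v.text for v in fetch(index, ["Matt.1.3", "Matt.1.4"])])
```

Whether duplicates should be suppressed is a choice for the accessing
process; the markup itself does not decide it.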

OPTION 2:
Introduce two new elements that derive from <verse>: <macroVerse> and
<microVerse>, for cases where the current verse element covers several
verses or a fragment of a verse as described by the prevailing reference
system "intended" by the translator.  If/when translators NEED a
different mechanism than is provided by <verse>, they can use the
<macroVerse> or <microVerse> mechanism.

Ex:
This is actually a many-to-many mapping between what was produced by the
translators and the "standard" reference system, because verse Matt.1.6
is actually split between the two text blocks produced by the
translators as "Matt.1.2-6a" and "Matt.1.6b-11".
<macroVerse macroID="Matt.1.2-Matt.1.6@enum:a" osisID="Matt.1.2 Matt.1.3
Matt.1.4 Matt.1.5 Matt.1.6"
presentationID="Matt.1.2-6a">....</macroVerse>
<macroVerse macroID="Matt.1.6@enum:b-Matt.1.11" osisID="Matt.1.6
Matt.1.7 Matt.1.8 Matt.1.9 Matt.1.10 Matt.1.11"
presentationID="Matt.1.6b-11">...</macroVerse>

This still allows for segmentation of the macroVerse based on its
macroID.
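
As a sketch of what segmentation on macroID could look like, the
following Python splits a macroID into its two endpoints.  It assumes the
syntax is exactly what the example shows: two identifiers joined by "-",
each optionally followed by "@enum:" and a segment label.  No grammar for
macroID has been defined, so the Endpoint type and both function names
are hypothetical.

```python
from typing import NamedTuple, Optional


class Endpoint(NamedTuple):
    osis_id: str            # e.g. "Matt.1.6"
    segment: Optional[str]  # e.g. "a", or None for a whole verse


def parse_endpoint(text):
    """Split 'Matt.1.6@enum:a' into ('Matt.1.6', 'a')."""
    osis_id, sep, segment = text.partition("@enum:")
    return Endpoint(osis_id, segment if sep else None)


def parse_macro_id(macro_id):
    """Split a macroID range into its start and end endpoints."""
    # Assumes identifiers themselves never contain a hyphen.
    start, sep, end = macro_id.partition("-")
    if not sep:
        raise ValueError("macroID is not a range: " + macro_id)
    return parse_endpoint(start), parse_endpoint(end)


start, end = parse_macro_id("Matt.1.6@enum:b-Matt.1.11")
print(start, end)
```

With the endpoints separated, a process can tell that the two macroVerse
elements meet inside Matt.1.6 rather than at a verse boundary.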


OPTION 3: 
Force these cases to self-identify using an identifier from a
"work-specific" reference system and leave users to find and use an
appropriate mapping mechanism between the "work-specific" reference
system and the common reference system that the document generally
supports.
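
For comparison, the kind of mapping users would have to supply under
this option might look like the Python sketch below.  The "ws:" prefix
and the work-specific identifiers are invented purely for illustration;
nothing here proposes an actual format for them.

```python
# Hypothetical work-specific identifiers mapped to the common identifiers
# they cover.  The "ws:" names are made up for this example.
WORK_TO_COMMON = {
    "ws:Matt.1.1": ["Matt.1.1"],
    "ws:Matt.1.2-6a": ["Matt.1.2", "Matt.1.3", "Matt.1.4", "Matt.1.5",
                       "Matt.1.6"],
    "ws:Matt.1.6b-11": ["Matt.1.6", "Matt.1.7", "Matt.1.8", "Matt.1.9",
                        "Matt.1.10", "Matt.1.11"],
}


def invert(mapping):
    """Build the reverse lookup: common identifier -> work-specific IDs."""
    common_to_work = {}
    for work_id, common_ids in mapping.items():
        for common_id in common_ids:
            common_to_work.setdefault(common_id, []).append(work_id)
    return common_to_work


COMMON_TO_WORK = invert(WORK_TO_COMMON)
print(COMMON_TO_WORK["Matt.1.6"])
```

The mapping is still many-to-many; Option 3 only moves it out of the
document and into a separate table that every user has to locate.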


I personally really favor OPTION 1.

None of these options provides any support for marking different
starting and ending points for verses from differing reference systems,
nor does any of them provide support for marking starting and ending
points for verses that differ from the starting and ending text points
within a <verse> element.  Support for this sort of precise point
marking is only provided through the <milestone> element and is
secondary to the primary verse self-identification mechanism provided by
osisID.

Thoughts?

Todd