[osis-core] Issue related to XPath expressions and osisID as a list.

Todd Tillinghast osis-core@bibletechnologieswg.org
Thu, 8 Aug 2002 10:34:53 -0600


I created a test document to see how XPath expressions would work with
our proposed strategy of treating osisID as a list.

It seems we will need to use an expression with the "contains" function.
(If there is a better way please inform me.)

Sample document:

<?xml version="1.0" encoding="UTF-8"?>
<x osisID="T:Z.A Z.1">
	<y osisID="T:Z.1.1 Z.1.1">Z.1.1</y>
	<y osisID="T:Z.1.2 Z.1.2">Z.1.2</y>
	<y osisID="T:Z.1.2 T:Z.1.3 T:Z.1.4 Z.1.3 Z.1.4.A">Z.1.3 and
Z.1.4</y>
	<y osisID="T:Z.1.4 T:Z.1.5 Z.1.4.B Z.1.5">Z.1.4 and Z.1.5</y>
</x>

//*[contains(@osisID,"Z.1")] will give us the <x> node and all of the
<y> nodes when we want just "Z.1".  But we can't say
//*[@osisID="Z.1"], since the osisID might be a list, and we can't say
//*[contains(@osisID," Z.1")], because the osisID might not be a list,
or "Z.1" might be the first item in the list.

Further, if we want to match on an identifier from the default reference
system, we will also get unwanted matches on identifiers that carry a
reference system prefix.  //*[contains(@osisID, "Z.1.2")] will match
both "T:Z.1.2" and "Z.1.2", which gives us two elements in the above
example when only one is desired.
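
To make the two over-matching cases concrete, here is a minimal sketch
in Python (standard library only).  The helper name substring_matches is
made up for illustration, and the substring test is only assumed to
behave like the XPath contains() function on the sample document above;
it is not an XPath engine.

```python
import xml.etree.ElementTree as ET

# The sample document from this message (the wrapped text of the third
# <y> element is joined onto one line).
SAMPLE = """<x osisID="T:Z.A Z.1">
  <y osisID="T:Z.1.1 Z.1.1">Z.1.1</y>
  <y osisID="T:Z.1.2 Z.1.2">Z.1.2</y>
  <y osisID="T:Z.1.2 T:Z.1.3 T:Z.1.4 Z.1.3 Z.1.4.A">Z.1.3 and Z.1.4</y>
  <y osisID="T:Z.1.4 T:Z.1.5 Z.1.4.B Z.1.5">Z.1.4 and Z.1.5</y>
</x>"""


def substring_matches(root, needle):
    """Hypothetical helper: mimic //*[contains(@osisID, needle)]."""
    # root.iter() visits the root element and every descendant.
    return [el for el in root.iter() if needle in el.get("osisID", "")]


root = ET.fromstring(SAMPLE)

# Searching for "Z.1" also hits Z.1.1, Z.1.2, ... so every element matches.
z1 = substring_matches(root, "Z.1")
print([el.tag for el in z1])

# Searching for "Z.1.2" also hits the prefixed "T:Z.1.2" in the third <y>.
z12 = substring_matches(root, "Z.1.2")
print([el.get("osisID") for el in z12])
```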

Questions:
1) Do we need to introduce some sort of syntax to indicate that an
identifier is from the default reference system OR just not have a
default? (osisID="T:Z.1.2 -Z.1.2" or osisID="T:Z.1.2 ~Z.1.2")

2) Do we need to introduce some sort of syntax to indicate that an
identifier has ended?  (osisID="T:Z.A! Z.1!"). 

A more complex strategy that builds a list of identifiers from the
space-delimited attribute would solve the second problem.  (Or some
equivalent but more efficient trickery based on the length of the
attribute string.)
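
A rough sketch of the list idea, again in Python with made-up helper
names (has_token, has_token_padded, token_matches).  The second helper
pads the value with spaces so a plain substring test only matches whole
identifiers; in XPath 1.0 the same idea is commonly written as
//*[contains(concat(' ', normalize-space(@osisID), ' '), ' Z.1 ')].
This assumes identifiers never contain whitespace themselves.

```python
import xml.etree.ElementTree as ET

# The sample document from this message (wrapped line joined).
SAMPLE = """<x osisID="T:Z.A Z.1">
  <y osisID="T:Z.1.1 Z.1.1">Z.1.1</y>
  <y osisID="T:Z.1.2 Z.1.2">Z.1.2</y>
  <y osisID="T:Z.1.2 T:Z.1.3 T:Z.1.4 Z.1.3 Z.1.4.A">Z.1.3 and Z.1.4</y>
  <y osisID="T:Z.1.4 T:Z.1.5 Z.1.4.B Z.1.5">Z.1.4 and Z.1.5</y>
</x>"""


def has_token(value, identifier):
    """True if identifier is one whole item of the space-delimited list."""
    return identifier in value.split()


def has_token_padded(value, identifier):
    """Same test without building a list: pad both sides with spaces."""
    padded = " " + " ".join(value.split()) + " "
    return (" " + identifier + " ") in padded


def token_matches(root, identifier):
    """Hypothetical helper: elements whose osisID list holds identifier."""
    return [el for el in root.iter()
            if has_token(el.get("osisID", ""), identifier)]


root = ET.fromstring(SAMPLE)
only_x = token_matches(root, "Z.1")        # just the <x> element
only_one_y = token_matches(root, "Z.1.2")  # "T:Z.1.2" no longer matches
prefixed = token_matches(root, "T:Z.1.2")  # the two <y> that list T:Z.1.2
print([el.tag for el in only_x], [el.text for el in only_one_y], len(prefixed))
```

With whole-identifier matching, "Z.1.2" and "T:Z.1.2" are simply
different items, so on this sample the prefix problem from the first
question does not show up either.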

It is also possible to use the element type (<div>, <p>, <verse>) to
screen out unwanted matches.  But there would still be ambiguity between
<div> elements: Matt.1 and Matt, for example, would both be <div>
elements.  <div> elements could be differentiated based on the type
attribute (type="chapter" and type="book").
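
A small sketch of that screening, under loud assumptions: the fragment
below is hypothetical (the <root> wrapper and the nesting are invented
for illustration, not taken from any schema draft), and
screened_matches is a made-up helper.  It keeps the naive substring test
and relies only on the element name and type attribute to narrow the
result.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment for illustration only.
DOC = """<root>
  <div type="book" osisID="Matt">
    <div type="chapter" osisID="Matt.1">
      <verse osisID="Matt.1.1">Matt.1.1</verse>
    </div>
  </div>
</root>"""


def screened_matches(root, tag, type_value, needle):
    """Substring match on osisID, limited to <tag type="type_value">."""
    path = ".//{}[@type='{}']".format(tag, type_value)
    return [el for el in root.findall(path)
            if needle in el.get("osisID", "")]


root = ET.fromstring(DOC)

# Without the type screen, "Matt" matches both <div> elements.
all_divs = [el for el in root.iter("div") if "Matt" in el.get("osisID", "")]

# With the screen, book and chapter can be told apart.
books = screened_matches(root, "div", "book", "Matt")
chapters = screened_matches(root, "div", "chapter", "Matt")
print(len(all_divs), [el.get("osisID") for el in books],
      [el.get("osisID") for el in chapters])
```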

Thoughts?

Todd

