[osis-core] OSIS pointers and references

Harry Plantinga osis-core@bibletechnologieswg.org
Sat, 6 Jul 2002 07:37:50 -0400


I've struggled for several hours now in trying to achieve clarity on this
issue of osis references, pointers, IDs, IDSchemes, etc. (without total
success so far, I might add). I am going to try to list the types that it
seems to me are in use and the uses to which they may be put. Some of them
don't yet have names, so I will have to make them up. I am also using some
names differently from the way they have previously been used -- e.g. osisID
in this scheme is [osisIDScheme:]osisIDElement. Consider this a proposed
framework; feel free to correct it, beat on it, modify it, scrap it,
whatever -- I just want to have clarity about these issues.

A major issue: does a <work> element only refer to an OSIS document that
implements a particular osisIDScheme? If so, you can't use this element as a
general-purpose bibliography item container because not all references will
always be available in OSIS format. If not, it's unclear what
kempis_imitation:IV.1 means when kempis_imitation isn't a defined
osisIDScheme--kempis_imitation could be an HTML document, a plain text
document, or even a dead-trees-format-only document. I'm going to assume
that a <work> element is intended as a general bibliography item, whether or
not available in OSIS format. Thus, osisRefs may point to things other than
OSIS documents, maybe even to print-only editions. The location IDs used
will have various, format-dependent meanings.

The alternative, using <work> elements only for OSIS texts, would mean not
using a <work> element as a general bibliography entry. All you'd really
need to specify in the work element are the dot-delimited parts of the
osisIDScheme.  Title, Author, edition, etc. would be irrelevant (unless you
are giving hints to the server on which bible.lxx to select).

Types of identifiers and pointers:

1.  osisIDScheme -- a dot-delimited string, like bible.lxx.en or
augustine_confessions.pusey. Refers to a "versification" or osisID scheme
defined elsewhere. That is, there must be a document on the
bibletechnologies.org website defining this scheme, or this document must be
declared to define the scheme. For the purposes of the OSIS 1.1 schema, it
is an opaque string, though meanings of dot-separated parts will be defined
elsewhere.

2.  osisIDSchemeRef -- a dot-delimited string, like bible.lxx..en-us, that
refers to any one of a class of osisIDSchemes.  An osisIDScheme matches an
osisIDSchemeRef if all the tokens present in the osisIDSchemeRef match the
corresponding tokens in the osisIDScheme. An osisText server, given an
osisIDSchemeRef such as "bible..en-us", should be able to find a document
with a matching osisIDScheme if it has one.

3.  osisText -- a particular OSIS document

3.  osisTextID -- a string uniquely identifying an osisText. It may be a
particular version of a particular edition of Kempis' Imitation of Christ,
Troy's version 3.1 of his edition of the LXX, etc. Perhaps it should have
some internal structure, such as augustine_confessions_1.01. (Or perhaps
versioning is important enough that it should have its own element?)

4. <work workID="xxx"> -- a header element that identifies a particular work
or any work matching an osisIDSchemeRef (e.g. bible.lxx..en-us) and gives it
an internal workID (e.g. lxx). It may identify by author/title/edition,
osisTextID, ISBN, URL, etc.  It may say "don't care" about some attributes.
[Should <work> also have a <version> element by which you can request a
particular version of a work?]

5.  osisIDElement -- like a name token, but can start with a digit. Must be
unique within an osisIDScheme. Pagebreak milestones (if present) are
conventionally identified as Page_32, Page_xii, etc. [previously called an
osisID]

6.  osisID -- [osisIDScheme:]osisIDElement. If osisIDScheme is omitted, a
default value specified in an <osisText> attribute is assumed.  Thus, an
osisID says "this element contains the osisIDElement part of osisIDScheme".

7.  grain -- e.g. cp:32(Hello World). [Whatever the latest grain syntax
allows.] Refers to a finer location within an osisID.

8. osisRef: [workID:]ID@grain[-ID@grain] -- pointer to a location or range
in a text. If workID is not present, the current document is assumed. Note
that the prefix is a workID, not an osisIDScheme. There may not even be an
electronic edition of this work. You might refer to Matt.1.3 in Troy's
version 3.1 of the LXX, Matt.1.3 in any English LXX, or Page_32 of volume 7
of the ninth edition of the Encyclopedia Britannica. If the workID is not an
osisIDScheme, the ID refers to an "ID" (variously defined) in that document.
E.g. if this work is an HTML document, it refers to an element in that
document with that ID. If the work is a print edition, it could refer to
Page_vii.
 - note: need a <volume> element in the <work> element to specify which
volume of a set? Do we need a way of specifying volume, issue, and page
number for a journal article?
 - does it really make sense to have osisRefs to plain-text or print
editions? If not, how do we refer to such things?
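
To make types 2 and 8 above concrete, here is a minimal Python sketch. It is
only an illustration under stated assumptions, not a proposed implementation:
the function and field names are made up; osisIDSchemeRef tokens are assumed
to match by position, with an empty token meaning "don't care"; the @grain
part is assumed to be optional; and IDs are assumed not to contain "-", ":"
or "@", so that the range and prefix separators stay unambiguous.

```python
import re
from typing import NamedTuple, Optional


def scheme_matches(scheme: str, scheme_ref: str) -> bool:
    """Return True if an osisIDScheme matches an osisIDSchemeRef.

    Assumption: tokens are compared position by position, and an empty
    token in the reference (as in "bible.lxx..en-us") is a wildcard.
    """
    scheme_tokens = scheme.split(".")
    for i, wanted in enumerate(scheme_ref.split(".")):
        if not wanted:
            continue  # empty token: don't care about this position
        if i >= len(scheme_tokens) or scheme_tokens[i] != wanted:
            return False
    return True


class OsisRef(NamedTuple):
    work_id: Optional[str]      # None means "the current document"
    start_id: str
    start_grain: Optional[str]
    end_id: Optional[str]       # None unless the reference is a range
    end_grain: Optional[str]


# One endpoint is ID[@grain]; the whole reference is
# [workID:]ID[@grain][-ID[@grain]].
_ENDPOINT = r"([^:@-]+)(?:@([^-]+))?"
_OSIS_REF = re.compile(rf"(?:([^:@-]+):)?{_ENDPOINT}(?:-{_ENDPOINT})?")


def parse_osis_ref(text: str) -> OsisRef:
    """Split an osisRef string into its workID, ID and grain parts."""
    match = _OSIS_REF.fullmatch(text)
    if match is None:
        raise ValueError(f"not a recognizable osisRef: {text!r}")
    return OsisRef(*match.groups())


print(scheme_matches("bible.lxx.en", "bible.lxx"))   # True
print(parse_osis_ref("lxx:Matt.1.3-Matt.1.5").work_id)  # lxx
print(parse_osis_ref("Page_32").work_id)                # None
```

Note that the sketch treats a reference token with no corresponding token in
the scheme as a mismatch; whether that is the right rule is one of the things
the scheme definitions would have to settle.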

---------------------Alternative A---------------

<work> defines an osisIDSchemeRef, and it can't be used for non-OSIS texts.
You don't need a separate osisIDSchemeRef type.  That is, <work> specifies
that I want any one document from a class of osisIDSchemes, e.g.
bible.lxx..en-us, but it uses an extended syntax rather than a dot-delimited
string, e.g. <language>en-us</language>, etc. The dot-delimited syntax is
only used for osisIDSchemes, not osisIDSchemeRefs (which go away). The
workID is an internal name for any matching document. Both osisID and
osisRef use optional workID: prefixes. Thus, in a <work> element you say
something like "I want any English version of the LXX and I am going to call
it 'lxx'." You can then use it later in an osisRef such as "lxx:Matt.1.3".
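
As a rough illustration of Alternative A, the Python sketch below models a
<work> declaration as a set of criteria bound to a workID, and resolves an
osisRef prefix against a catalogue of available texts. Everything named here
is hypothetical: the criteria keys ("type", "edition", "language"), the
catalogue, and the osisTextID strings are invented for the example, and a
value of None stands for "don't care".

```python
from typing import Dict, Optional, Tuple

Criteria = Dict[str, Optional[str]]

# Hypothetical <work> declarations from a document header: workID -> criteria.
WORKS: Dict[str, Criteria] = {
    "lxx": {"type": "bible", "edition": "lxx", "language": "en-us"},
}

# Hypothetical catalogue of osisTexts known to a server, keyed by osisTextID.
CATALOGUE: Dict[str, Dict[str, str]] = {
    "vulgate_example_1.0": {"type": "bible", "edition": "vulgate", "language": "la"},
    "lxx_example_3.1": {"type": "bible", "edition": "lxx", "language": "en-us"},
}


def resolve_work(criteria: Criteria, catalogue: Dict[str, Dict[str, str]]) -> Optional[str]:
    """Return the first osisTextID whose attributes satisfy the criteria."""
    for text_id, attrs in catalogue.items():
        if all(v is None or attrs.get(k) == v for k, v in criteria.items()):
            return text_id
    return None


def resolve_ref(osis_ref: str, works: Dict[str, Criteria],
                catalogue: Dict[str, Dict[str, str]]) -> Tuple[Optional[str], str]:
    """Map "workID:location" to (osisTextID, location)."""
    work_id, _, location = osis_ref.partition(":")
    return resolve_work(works[work_id], catalogue), location


print(resolve_ref("lxx:Matt.1.3", WORKS, CATALOGUE))
# ('lxx_example_3.1', 'Matt.1.3')
```

The point of the sketch is that the workID is purely local: the document only
ever says "lxx", and which concrete text that names is decided at resolution
time.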

This would mean that osisRefs can't refer to anything but an osisText, and
the <references> section of the header isn't a general-purpose bibliography.
However, it would have fewer issues concerning what exactly osisRefs mean
when pointing to non-OSIS texts.

Maybe this makes more sense than trying to define more general osisRefs,
though I hate to give up on the <references> section as a general
bibliography.

------------------issues----------------

Should an osisText have an osisTextID, which is unique for
this text? Should it incorporate a version number or should there
be a separate version element?

Should the "osisWork" attribute in <osisText osisWork="bible.lxx">
be renamed something like defaultOsisIDScheme?

Should there be a way of saying that this osisText implements/
defines an osisIDScheme?

Possible osisText attributes:
- defaultIDScheme
- implementsIDScheme
- definesIDScheme