[osis-core] osis_0109 Arrives! Very close to 1.0!!!

Todd Tillinghast osis-core@bibletechnologieswg.org
Sun, 14 Apr 2002 08:04:57 -0500


> Guys,
>
> Attached is osis_0109.zip.
>
> Changes:
>
> 1. Tried to uniformly camel case and make attribute syntax uniform,
> i.e., noteStart, noteEnd, refStart, refEnd. Notice of places where I
> missed or goofed up appreciated.

I like the new look and have not seen any goofs.

>
> 2. Todd: Note that OSISID is now NMTOKENS!  This gets you:
>
> a. Book names that start with number!, 1Cor, 2Kings, etc.
This is a good thing, but it precludes references from being IDs in the future.
IDs would be useful in PSIs and even in milestones.
>
> b. Can have multiple NMTOKENS, i.e., OSISID="Gen.17.17 Gen.17.18",
> validated against the regex for referenceType.
This seems BAD to me.  This still does not handle the case where there truly
is a different name explicitly assigned by the translator that has a
different meaning than traditional reference systems use.
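
To make 2b concrete, here is a rough sketch (in Python, not the schema
itself) of how a whitespace-separated OSISID value would be checked token
by token. TOKEN_RE is a stand-in I made up for illustration; it is not the
referenceType regex from osis_0109.

```python
import re

# Hypothetical stand-in for the referenceType pattern: dot-separated
# segments of letters and digits. NOT the regex shipped in osis_0109.
TOKEN_RE = re.compile(r"[A-Za-z0-9]+(\.[A-Za-z0-9]+)*")


def valid_osis_id(value: str) -> bool:
    """Return True if every whitespace-separated token matches TOKEN_RE."""
    tokens = value.split()
    # An empty attribute value has no tokens and is rejected here.
    return bool(tokens) and all(TOKEN_RE.fullmatch(t) for t in tokens)
```

A check like this only looks at the shape of each token. It cannot tell
whether a translator meant something other than the traditional reference,
which is the case I am worried about.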
>
> 3. OSISIDREF is now NMTOKEN (should only have a single ref back to a
> starting point, not IDREF but probably not used that much anyway.)
>
> 4. Now have refWork attribute on <text>. Validates against referenceWork
> simpleType (which is redefined by osisScripture_0109.xsd); this gets you
> the default prefix for your other references.
>
> 5. Note now has refWork attribute (so can point outside to a particular
> referenceWork).
Why just in note?  I think we should be able to use a non-default reference
anywhere we are using a referenceType.  At least this should be possible in
<reference>, <figure>, and possibly a few other key elements.
>
> 6. Regex no longer has Bible.KJV, etc., handled by referenceWork
I like the old way better; see #5 above.
>
> 7. Completely reformed regex expressions, recall from my earlier post:
This fixes the totally incorrect previous version!  It seems that we should
preclude some characters that are allowed by "\c" (".", "-", and possibly
a few others).  This will reserve the right to use them later.
Also, I'm not sure we should have a "default" regex at all in OSISCore, since
it gets ORed with the regexes in the redefined versions and negates the
opportunity for real validation.
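
A small sketch of the concern, assuming the patterns really are ORed the
way I describe. Both patterns below are hypothetical and written in Python
syntax; the three book names stand in for the full list.

```python
import re

# Hypothetical patterns, for illustration only.
DEFAULT = r"[^.]+(\.[^.]+)*"                 # permissive "default" in the core schema
STRICT = r"(Gen|Exod|Matt)(\.[0-9]+){0,2}"   # stricter pattern from a redefinition


def accepts(pattern: str, value: str) -> bool:
    # fullmatch mimics a schema pattern, which must match the whole value.
    return re.fullmatch(pattern, value) is not None


def accepts_either(value: str) -> bool:
    # ORing the two patterns: a value passes if either one matches.
    return accepts(DEFAULT, value) or accepts(STRICT, value)
```

Here "Bogus.99.x" fails the strict pattern on its own but passes once the
default is ORed in, so the strict pattern no longer rejects anything the
default allows.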
>
> > 1. Regexs:
> >
> > Generally see: http://www.w3.org/TR/xmlschema-2/#regexs
> >
> > ReferenceType
> >
> > Now reads: ([^.]+)((.[^.]+){0,})?
> >
> > Note that "^" begins a negative character group.
> >
> > Note that the "." character in XML Schema is the equivalent of:
> > [^\n\r] : any character except newline
> >
> > So, [^.] means only newline (excludes all other characters)
> >
> > Or more formally from the standard:
> >
> > [Definition:] A negative character group is a positive character
> > group (see http://www.w3.org/TR/xmlschema-2/#dt-poschargroup) preceded
> > by the ^ character. For all positive character groups P, ^P is a
> > valid negative character group, and C(^P) contains all XML
> > characters that are not in C(P).
> >
> > Negative Character Group
> > [15]   negCharGroup   ::=   '^' posCharGroup
> >
> >
> > I assume the intent of the expression is:
> >
> > 1. Any legal namestart character, followed by,
> > 2. Any legal name character, followed by,
> > 3. literal "." character, followed by
> > 4. one or more groups of legal name characters separated by a literal
> > "."
> >
> > If that is the case, I would suggest that we re-write ReferenceType to
> > read:
> >
> > ([\i]([\c])*\.((\c)*\.)?
> >
> > Note that \i = any legal initial name character, \c = any legal name
> > character, \. = literal "." or full stop
> >
> > Additionally, since we have compScriptureReferenceType (I treat that
> > regex below) not sure what ReferenceType is getting us in terms of
> > validation? Structure of the references? Perhaps, would welcome some
> > discussion on this and WorkType (next).
> >
> > (BTW, schema regexs always match from the beginning of the line so no
> > need to anchor.)
> >
> > WorkType:
> >
> > Now reads: ([^.]+(.[^.]+)
> >
> > Same problems as above with "^" and invoking of literal full stop.
> >
> > Is the intent of this expression the same as ReferenceType?
> >
> > In other words to:
> >
> > 1. Any legal namestart character, followed by,
> > 2. Any legal name character, followed by,
> > 3. literal "." character, followed by
> > 4. one or more groups of legal name characters separated by a literal
> > "."
> >
> > if so, why would I want both of them? For that matter, the more I
> > think about it, I am not sure what function either one would serve, at
> > least in light of our not declaring a set of references to other works.
> >
> > Suggestion: Why not settle on an outside reference pointer that
> > subclasses xs:string the way we have for enumerated values on
> > attributes. You can at this point declare whatever other pointers you
> > like, but prepend "x-" to them? That would allow us to later (probably
> > by the Fall release of translator and publisher modules, to declare
> > references like compScriptureReferenceType that provide validation of
> > at least part of the reference?
> >
> > compScriptureReferenceType:
> >
> > Now reads (in part) ((...All Book Names...))((.[^.]+){0,}))?
> >
> > Same problems as above with "^" and invoking of literal full stop.
> >
> > In other words to:
> >
> > 1. Book Name, followed by
> > 2. literal "." character, followed by
> > 3. any digit or letter (one or more) (question, do we need letter for
> > some Bible references?), followed by
> > 4. literal "." character, followed by
> > 5. any digit or letter (one or more) (question, do we need letter for
> > some Bible references?), followed by (optional)
> >
> > If that is the case, would the following work?
> >
> > ((...All Book Names...))\.[A-Za-z0-9]*(\.[A-Za-z0-9]*)?
> >
> > Note that this expression requires book name plus chapter, could
> > someone want to just refer to Matthew?
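
On the last question, a quick way to see what the proposed expression
accepts is to try it outside the schema. This is only a sketch in Python,
whose syntax agrees with schema regexes for this particular expression as
far as I can tell; BOOKS is a hypothetical three-name stand-in for
"...All Book Names...".

```python
import re

# BOOKS is a hypothetical stand-in for the full list of book names.
BOOKS = "Gen|Exod|Matt"
PROPOSED = re.compile(rf"({BOOKS})\.[A-Za-z0-9]*(\.[A-Za-z0-9]*)?")


def matches(value: str) -> bool:
    # fullmatch mimics a schema pattern, which must match the whole value.
    return PROPOSED.fullmatch(value) is not None
```

With this, "Matt.5" and "Matt.5.3" match and a bare "Matt" does not, as
noted. Because the segments use "*" rather than "+", "Matt." and "Matt.."
also match, which is probably not wanted.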
>
> If I am completely off base on my reading of XML Schema regex
> expressions please point me to the correct information but it sounds
> like [^.]  which is in all the expressions I corrected, excludes all
> letters but newline? Fairly sure that is not what was intended.
>
> Recall that XML Schema regex expressions always, automatically, never do
> differently, bind at the beginning of the string, so "^" to match the
> beginning of a string is not required (Not to mention has a different
> meaning than the way it is used in Perl/sed/awk, etc.).
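
As far as I understand the spec, a schema pattern has to match the whole
value, not just its beginning. A sketch of the difference, using Python's
fullmatch to imitate the schema behavior; PATTERN is a hypothetical example,
not one of ours.

```python
import re

PATTERN = r"Gen\.[0-9]+"  # hypothetical pattern, for illustration only


def schema_style_match(value: str) -> bool:
    # Whole-value match, which is how a schema pattern facet behaves.
    return re.fullmatch(PATTERN, value) is not None


def search_style_match(value: str) -> bool:
    # Perl/sed-style unanchored search, shown for contrast.
    return re.search(PATTERN, value) is not None
```

So "xGen.17 extra" is found by the unanchored search but rejected by the
whole-value match, with no "^" or "$" written anywhere in the pattern.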
>
> Chime in now on these or any other issues because save for typos, I
> would like to consider this version a code freeze so tomorrow I can work
> on documentation to insert into the schema and Todd/Chris/Troy can start
> getting some sample texts together for a Monday release.
>
> I'm about to take a break for a couple of hours but will be back online
> later today and early this evening.
>
> Looks really good guys!
>
> Patrick
>
> --
> Patrick Durusau
> Director of Research and Development
> Society of Biblical Literature
> pdurusau@emory.edu
>
>