[osis-core] osis_0109 Arrives! Very close to 1.0!!!

Todd Tillinghast osis-core@bibletechnologieswg.org
Sat, 13 Apr 2002 18:15:26 -0500


I have only looked at the schema and have not run any tests, etc.  At first
glance it looks much cleaner.  (like the consistent start and stop names and
reworked attribute group.)

I will look further at referenceWork later and dig into how to use it and
how to misuse it, etc...

Also have to think more about the NMTOKENS.  Just haven't had that much time
to look at this yet.

Probably after dinner.

Todd

> -----Original Message-----
> From: owner-osis-core@bibletechnologieswg.org
> [mailto:owner-osis-core@bibletechnologieswg.org]On Behalf Of Patrick
> Durusau
> Sent: Saturday, April 13, 2002 1:42 PM
> To: osis-core
> Subject: [osis-core] osis_0109 Arrives! Very close to 1.0!!!
>
>
> Guys,
>
> Attached is osis_0109.zip.
>
> Changes:
>
> 1. Tried to uniformly camel case and make attribute syntax uniform,
> i.e., noteStart, noteEnd, refStart, refEnd. Notice of places where I
> missed or goofed up appreciated.
>
> 2. Todd: Note that OSISID is now NMTOKENS!  This gets you:
>
> a. Book names that start with number!, 1Cor, 2Kings, etc.
>
> b. Can have multiple NMTOKENS, i.e., OSISID="Gen.17.17 Gen.17.18",
> validated against the regex for referenceType.
>
> 3. OSISIDREF is now NMTOKEN (should only have a single ref back to a
> starting point, not IDREF but probably not used that much anyway.)
>
> 4. Now have refWork attribute on <text>. Validates against referenceWork
> simpleType, which is redefined by osisScripture_0109.xsd (this gets you
> the default prefix for your other references).
>
> 5. Note now has refWork attribute (so can point outside to a particular
> referenceWork).
>
> 6. Regex no longer has Bible.KJV, etc., handled by referenceWork
>
> 7. Completely reformed regex expressions, recall from my earlier post:
>
> > 1. Regexs:
> >
> > Generally see: http://www.w3.org/TR/xmlschema-2/#regexs
> >
> > ReferenceType
> >
> > Now reads: ([^.]+)((.[^.]+){0,})?
> >
> > Note that "^" begins a negative character group.
> >
> > Note that the "." character in XML Schema is the equivalent of:
> > [^\n\r] : any character except newline
> >
> > So, [^.] means only newline (excludes all other characters)
> >
> > Or more formally from the standard:
> >
> > [Definition:] A *negative character group* is a ·positive character
> > group· <http://www.w3.org/TR/xmlschema-2/#dt-poschargroup> preceded by
> > the ^ character. For all ·positive character group·s P, ^P is a valid
> > *negative character group*, and C(^P) contains all XML characters that
> > are not in C(P).
> >
> > *Negative Character Group*
> > [15]  negCharGroup  ::=  '^' posCharGroup
> > <http://www.w3.org/TR/xmlschema-2/#nt-posCharGroup>
> >
> >
> > I assume the intent of the expression is:
> >
> > 1. Any legal namestart character, followed by,
> > 2. Any legal name character, followed by,
> > 3. literal "." character, followed by
> > 4. one or more groups of legal name characters separated by a literal "."
> >
> > If that is the case, I would suggest that we re-write ReferenceType to
> > read:
> >
> > ([\i]([\c])*\.((\c)*\.)?
> >
> > Note that \i = any legal initial name character, \c = any legal name
> > character, \. = literal "." or full stop
> >
> > Additionally, since we have compScriptureReferenceType (I treat that
> > regex below) not sure what ReferenceType is getting us in terms of
> > validation? Structure of the references? Perhaps, would welcome some
> > discussion on this and WorkType (next).
> >
> > (BTW, schema regexs always match from the beginning of the line so no
> > need to anchor.)
> >
> > WorkType:
> >
> > Now reads: ([^.]+(.[^.]+)
> >
> > Same problems as above with "^" and invoking of literal full stop.
> >
> > Is the intent of this expression the same as ReferenceType?
> >
> > In other words to:
> >
> > 1. Any legal namestart character, followed by,
> > 2. Any legal name character, followed by,
> > 3. literal "." character, followed by
> > 4. one or more groups of legal name characters separated by a literal "."
> >
> > if so, why would I want both of them? For that matter, the more I
> > think about it, I am not sure what function either one would serve, at
> > least in light of our not declaring a set of references to other works.
> >
> > Suggestion: Why not settle on an outside reference pointer that
> > subclasses xs:string the way we have for enumerated values on
> > attributes? You can at this point declare whatever other pointers you
> > like, but prepend "x-" to them. That would allow us later (probably
> > by the Fall release of translator and publisher modules) to declare
> > references like compScriptureReferenceType that provide validation of
> > at least part of the reference.
> >
> > compScriptureReferenceType:
> >
> > Now reads (in part) ((...All Book Names...))((.[^.]+){0,}))?
> >
> > Same problems as above with "^" and invoking of literal full stop.
> >
> > In other words, the intent seems to be:
> >
> > 1. Book Name, followed by
> > 2. literal "." character, followed by
> > 3. any digit or letter (one or more) (question, do we need letter for
> > some Bible references?), followed by
> > 4. literal "." character, followed by
> > 5. any digit or letter (one or more) (question, do we need letter for
> > some Bible references?), followed by (optional)
> >
> > If that is the case, would the following work?
> >
> > ((...All Book Names...))\.[A-Za-z0-9]*(\.[A-Za-z0-9]*)?
> >
> > Note that this expression requires book name plus chapter, could
> > someone want to just refer to Matthew?
>
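
The proposed compScriptureReferenceType expression, together with the
NMTOKENS form of OSISID, can be tried out roughly as follows. This is
only a sketch in Python, whose `re` module is not an XML Schema regex
engine; `fullmatch` is used to imitate the schema's implicit anchoring.
The book list is a made-up subset standing in for "...All Book Names...",
and the names BOOK_NAMES, is_valid_ref and is_valid_osis_id are invented
for illustration.

```python
import re

# Hypothetical subset of book names; the real schema enumerates all of them.
BOOK_NAMES = ["Gen", "Exod", "Matt", "1Cor", "2Kings"]

# Shape taken from the proposal: BookName \. [A-Za-z0-9]* ( \. [A-Za-z0-9]* )?
BOOKS = "|".join(re.escape(name) for name in BOOK_NAMES)
COMP_SCRIPTURE_REF = re.compile(
    r"(" + BOOKS + r")\.[A-Za-z0-9]*(\.[A-Za-z0-9]*)?"
)


def is_valid_ref(token: str) -> bool:
    """Check one reference; fullmatch imitates whole-value matching."""
    return COMP_SCRIPTURE_REF.fullmatch(token) is not None


def is_valid_osis_id(value: str) -> bool:
    """Treat the value as NMTOKENS: one or more whitespace-separated tokens."""
    tokens = value.split()
    return bool(tokens) and all(is_valid_ref(t) for t in tokens)
```

With this sketch, "Gen.17.17 Gen.17.18" and "1Cor.13" are accepted, while
a bare "Matt" is rejected because the first "\." is mandatory. "Gen." is
also accepted, since "*" allows an empty chapter; if "one or more" is
really the intent, "+" may be the better quantifier.
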
> If I am completely off base on my reading of XML Schema regex
> expressions, please point me to the correct information, but it sounds
> like [^.], which is in all the expressions I corrected, excludes all
> characters but newline? Fairly sure that is not what was intended.
>
> Recall that XML Schema regex expressions always, automatically, never do
> differently, bind at the beginning of the string, so "^" to match the
> beginning of a string is not required (not to mention that it has a
> different meaning here than in Perl/sed/awk, etc.).
>
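
For what it is worth, the reading of [^.] can be checked empirically. A
minimal sketch in Python follows, used only as a stand-in: its `re`
module is not an XML Schema regex engine (it has no \i or \c, for
example), and the helper name whole_match is made up for illustration.
In Python a "." inside brackets is a literal full stop, and the XML
Schema grammar appears to treat it the same way, in which case [^.]
would mean "any character except a full stop" rather than "only newline".

```python
import re

# Inside brackets "." is a literal full stop; a leading "^" negates the class.
NOT_DOT = re.compile(r"[^.]+")

# Outside brackets "." is the wildcard (any character except newline).
ANY_CHAR = re.compile(r".")


def whole_match(pattern: re.Pattern, text: str) -> bool:
    """Match against the entire string, imitating implicit anchoring."""
    return pattern.fullmatch(text) is not None
```

Under these assumptions, whole_match(NOT_DOT, "Gen") succeeds and
whole_match(NOT_DOT, "Gen.1") fails, while whole_match(ANY_CHAR, "\n")
fails because the bare wildcard excludes newline.
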
> Chime in now on these or any other issues because save for typos, I
> would like to consider this version a code freeze so tomorrow I can work
> on documentation to insert into the schema and Todd/Chris/Troy can start
> getting some sample texts together for a Monday release.
>
> I'm about to take a break for a couple of hours but will be back online
> later today and early this evening.
>
> Looks really good guys!
>
> Patrick
>
> --
> Patrick Durusau
> Director of Research and Development
> Society of Biblical Literature
> pdurusau@emory.edu
>
>