[osis-core] osis_0109 Arrives! Very close to 1.0!!!

Patrick Durusau osis-core@bibletechnologieswg.org
Sun, 14 Apr 2002 09:51:20 -0400


Todd,

Todd Tillinghast wrote:

>>b. Can have multiple NMTOKENS, i.e., OSISID="Gen.17.17 Gen.17.18",
>>validated against the regex for referenceType.
>>
>This seems BAD to me.  This still does not handle the case where there truly
>is a different name explicitly assigned by the translator that has a
>different meaning than traditional reference systems use.
>
Hmmm,

Sure it does: you can have "Gen.17.17 Gen.17.18 Pats.BadTrans.BigWhoop.35.26
Todds.Perfect.System.23.23", and for that particular encoding you may
even want to consider that an implied mapping?
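
To make the multiple-token idea concrete, here is a minimal sketch in
Python of how a processor might check such a value. The pattern
SEGMENTED_REF and the function check_osis_id are hypothetical stand-ins
of my own, not the regex from the schema; I am only assuming that each
token is a run of "."-separated letter/digit segments.

```python
import re

# Hypothetical stand-in for the referenceType pattern (an assumption):
# one or more letter/digit segments separated by a literal ".".
SEGMENTED_REF = re.compile(r"[A-Za-z0-9]+(\.[A-Za-z0-9]+)*")


def check_osis_id(value):
    """Split an NMTOKENS-style value on whitespace and test each token.

    Returns (all tokens, tokens that failed the pattern).
    """
    tokens = value.split()
    bad = [t for t in tokens if SEGMENTED_REF.fullmatch(t) is None]
    return tokens, bad


tokens, bad = check_osis_id(
    "Gen.17.17 Gen.17.18 Pats.BadTrans.BigWhoop.35.26 "
    "Todds.Perfect.System.23.23"
)
print(len(tokens), bad)
```

Each token is checked on its own, so a translator-assigned name can sit
next to the traditional references in the same attribute value.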

>
>>3. OSISIDREF is now NMTOKEN (should only have a single ref back to a
>>starting point, not IDREF but probably not used that much anyway.)
>>
>>4. Now have refWork attribute on <text>. Validates against referenceWork
>>simpleType (which is redefined by osisScripture_0109.xsd; this gets you
>>the default prefix for your other references).
>>
>>5. Note now has refWork attribute (so can point outside to a particular
>>referenceWork).
>>
>Why just in note?  I think we should be able to use a non-default reference
>anywhere we are using a referenceType.  At least this should be possible in
><reference>, <figure>, and possibly a few key other elements.
>
Hmmm, could optionally expressing refWork in the reference itself
eliminate the need for the refWork attribute? In other words, not every
ref needs the Bible.KJV.. prefix, but where it does appear, the reference
has been qualified as belonging to that system?

>
>>6. Regex no longer has Bible.KJV, etc., handled by referenceWork
>>
>I like the old way better; see #5 above.
>

See reply above.


Patrick

>
>>7. Completely reformed regex expressions, recall from my earlier post:
>>
>This fixes the totally incorrect previous version!  It seems that we should
>preclude some characters that are allowed by "\c".  (".", "-", and possibly
>a few others.)  This will reserve the right to use them later.
>Also I'm not sure we should have a "default" regex at all in OSISCore since
>it gets ORed with the regexs in the redefined versions and negates the
>opportunity for real validation.
>
>>>1. Regexs:
>>>
>>>Generally see: http://www.w3.org/TR/xmlschema-2/#regexs
>>>
>>>ReferenceType
>>>
>>>Now reads: ([^.]+)((.[^.]+){0,})?
>>>
>>>Note that "^" begins a negative character group.
>>>
>>>Note that the "." character in XML Schema is the equivalent of:
>>>[^\n\r] : any character except newline
>>>
>>>So, [^.] means only newline (excludes all other characters)
>>>
>>>Or more formally from the standard:
>>>
>>>[Definition:]   A * negative character group* is a ·positive character
>>>group· <http://www.w3.org/TR/xmlschema-2/#dt-poschargroup> preceded by
>>>the |^| character. For all ·positive character group·
>>><http://www.w3.org/TR/xmlschema-2/#dt-poschargroup> s /P /, ^/ P/ is a
>>>valid *negative character group*, and / C(^P)/ contains all XML
>>>characters that are /not/ in /C(P)/ .
>>>
>>>*Negative Character Group*
>>>|[15]   | | negCharGroup| |   ::=   | |'^' posCharGroup
>>><http://www.w3.org/TR/xmlschema-2/#nt-posCharGroup> |
>>>
>>>
>>>I assume the intent of the expression is:
>>>
>>>1. Any legal namestart character, followed by,
>>>2. Any legal name character, followed by,
>>>3. literal "." character, followed by
>>>4. one or more groups of legal name characters separated by a literal "."
>>>
>>>If that is the case, I would suggest that we re-write ReferenceType to
>>>read:
>>>
>>>([\i]([\c])*\.((\c)*\.)?
>>>
>>>Note that \i = any legal initial name character, \c = any legal name
>>>character, \. = literal "." or full stop
>>>
>>>Additionally, since we have compScriptureReferenceType (I treat that
>>>regex below) not sure what ReferenceType is getting us in terms of
>>>validation? Structure of the references? Perhaps, would welcome some
>>>discussion on this and WorkType (next).
>>>
>>>(BTW, schema regexs always match from the beginning of the line so no
>>>need to anchor.)
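
As a rough illustration of the four-step intent listed above (not of
the schema regex itself), here is a sketch in Python. Python's re
module has no \i or \c, so NAME_START and NAME_CHAR are ASCII
approximations of my own, and I have left "." out of NAME_CHAR so that
it can serve as the separator. All names here are hypothetical.

```python
import re

# ASCII approximations (assumptions): Python's re lacks \i and \c.
NAME_START = r"[A-Za-z_]"
NAME_CHAR = r"[A-Za-z0-9_-]"  # "." is left out so it can act as separator

# A name, then one or more further groups, each introduced by a literal ".".
REFERENCE = re.compile(rf"{NAME_START}{NAME_CHAR}*(\.{NAME_CHAR}+)+")


def is_reference(value):
    # fullmatch requires the whole value to match, so no explicit
    # anchors are written into the pattern.
    return REFERENCE.fullmatch(value) is not None


for candidate in ("Gen.17.17", "Gen", "Gen.", ".Gen"):
    print(candidate, is_reference(candidate))
```

Under these assumptions a bare name with no "." group is rejected,
which is what step 3 of the list implies.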
>>>
>>>WorkType:
>>>
>>>Now reads: ([^.]+(.[^.]+)
>>>
>>>Same problems as above with "^" and invoking of literal full stop.
>>>
>>>Is the intent of this expression the same as ReferenceType?
>>>
>>>In other words, to match:
>>>
>>>1. Any legal namestart character, followed by,
>>>2. Any legal name character, followed by,
>>>3. literal "." character, followed by
>>>4. one or more groups of legal name characters separated by a literal "."
>>>
>>>if so, why would I want both of them? For that matter, the more I
>>>think about it, I am not sure what function either one would serve, at
>>>least in light of our not declaring a set of references to other works.
>>>
>>>Suggestion: Why not settle on an outside reference pointer that
>>>subclasses xs:string the way we have for enumerated values on
>>>attributes. You can at this point declare whatever other pointers you
>>>like, but prepend "x-" to them? That would allow us later (probably
>>>by the Fall release of translator and publisher modules) to declare
>>>references like compScriptureReferenceType that provide validation of
>>>at least part of the reference.
>>>
>>>compScriptureReferenceType:
>>>
>>>Now reads (in part) ((...All Book Names...))((.[^.]+){0,}))?
>>>
>>>Same problems as above with "^" and invoking of literal full stop.
>>>
>>>In other words, to match:
>>>
>>>1. Book Name, followed by
>>>2. literal "." character, followed by
>>>3. any digit or letter (one or more) (question, do we need letter for
>>>some Bible references?), followed by
>>>4. literal "." character, followed by
>>>5. any digit or letter (one or more) (question, do we need letter for
>>>some Bible references?) (steps 4 and 5 are optional)
>>>
>>>If that is the case, would the following work?
>>>
>>>((...All Book Names...))\.[A-Za-z0-9]*(\.[A-Za-z0-9]*)?
>>>
>>>Note that this expression requires book name plus chapter, could
>>>someone want to just refer to Matthew?
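
To see what the suggested expression accepts, here is a small sketch in
Python. The three book names are a placeholder subset I picked for
illustration (the real list is elided above), and Python's re stands in
for the XML Schema regex engine, which is an assumption.

```python
import re

# Placeholder subset of book names (an assumption; the full list is
# elided in the expression above).
BOOKS = "Gen|Exod|Matt"

# Same shape as the suggested expression: book, ".", chapter, and an
# optional "." plus verse.
COMP_REF = re.compile(rf"(({BOOKS}))\.[A-Za-z0-9]*(\.[A-Za-z0-9]*)?")


def is_scripture_ref(value):
    # The whole value must match, as with a schema pattern facet.
    return COMP_REF.fullmatch(value) is not None


for candidate in ("Gen.17.17", "Matt.5", "Matt", "Gen.17.17.1"):
    print(candidate, is_scripture_ref(candidate))
```

With this shape a bare "Matt" is rejected because the first "." is
mandatory, which is the question raised in the note above.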
>>>
>>If I am completely off base on my reading of XML Schema regex
>>expressions please point me to the correct information but it sounds
>>like [^.]  which is in all the expressions I corrected, excludes all
>>letters but newline? Fairly sure that is not what was intended.
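
One quick way to test this reading is to try the bracket expression in
another regex engine. The sketch below uses Python's re, which is not
the XML Schema dialect, so treat it only as a hint. In Python a "."
inside brackets is a literal full stop rather than the wildcard;
whether the XML Schema grammar treats it the same way should be
confirmed against the spec. The function names are mine.

```python
import re

NEGATED = re.compile(r"[^.]+")
# The earlier ReferenceType expression, copied from the "Now reads" line.
OLD_REFERENCE = re.compile(r"([^.]+)((.[^.]+){0,})?")


def only_non_dots(value):
    """True when the whole value matches [^.]+ in Python's re."""
    return NEGATED.fullmatch(value) is not None


def old_reference_matches(value):
    """True when the whole value matches the earlier expression."""
    return OLD_REFERENCE.fullmatch(value) is not None


print(only_non_dots("Gen"))
print(only_non_dots("Gen.17"))
print(old_reference_matches("Gen.17.17"))
```

If the schema dialect behaves like Python here, [^.] would accept
letters and digits and reject only the full stop.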
>>
>>Recall that XML Schema regex expressions always, automatically, never do
>>differently, bind at the beginning of the string, so "^" to match the
>>beginning of a string is not required (Not to mention has a different
>>meaning than the way it is used in Perl/sed/awk, etc.).
>>
>>Chime in now on these or any other issues because save for typos, I
>>would like to consider this version a code freeze so tomorrow I can work
>>on documentation to insert into the schema and Todd/Chris/Troy can start
>>getting some sample texts together for a Monday release.
>>
>>I'm about to take a break for a couple of hours but will be back online
>>later today and early this evening.
>>
>>Looks really good guys!
>>
>>Patrick

-- 
Patrick Durusau
Director of Research and Development
Society of Biblical Literature
pdurusau@emory.edu