[osis-core] OSIS work regex

Todd Tillinghast osis-core@bibletechnologieswg.org
Wed, 14 Aug 2002 14:35:02 -0600


The statement below does not make sense to me.  It seems you are saying
two conflicting things.  In any case, it seems that you are saying that
we should conform to the XML standard.  

I guess what I am suggesting is that we have references that can be XML
IDs.  I am not sure what all of the precluded and allowed characters
are.  I know that Patrick was much better versed in this when we talked
several months ago on this very topic.

The trouble with this whole line of discussion is that ":", "[", and "]"
are not allowed in XML IDs!

Also, the test I did with a leading "_" was OK; it was the leading
numeral that was the problem we had before.

SORRY FOR THE BOGUS DETOUR RELATED TO "_"!

The issue still remains related to OSIS references and identifiers as
XML IDs.  I think that is why I was using ".." rather than ":" long ago.
If we trade the ":" for ".." and do away with the "[" and "]" then we
would be back with a valid XML ID.  (Of course ALLOW "_" and preclude a
numeral as the leading character.)
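
To make the trade concrete, here is a minimal Python sketch. It is not
part of any OSIS draft: the function names is_id_like and to_id_form and
the sample reference strings are made up for illustration, and the
regular expression only approximates the XML rule for ID values (a
letter or "_" first, then letters, digits, ".", "-", and "_"), using
Python's Unicode-aware \w rather than the exact character tables in the
XML specification.

```python
import re

# Rough approximation of an XML ID value (an assumption, not the exact
# production from the XML spec): first character is a letter or "_",
# remaining characters are letters, digits, "_", "." or "-".
_ID_LIKE = re.compile(r"[^\W\d][\w.\-]*")


def is_id_like(value: str) -> bool:
    """Return True if value fits the approximate XML ID shape."""
    return _ID_LIKE.fullmatch(value) is not None


def to_id_form(ref: str) -> str:
    """Hypothetical helper: swap ":" for ".." and drop "[" and "]"."""
    return ref.replace(":", "..").replace("[", "").replace("]", "")


if __name__ == "__main__":
    # Sample strings are illustrative only.
    for ref in ["Bible.TEV:Matt.1.1", "_Bible.TEV", "1Cor.1.1"]:
        print(ref, is_id_like(ref), to_id_form(ref), is_id_like(to_id_form(ref)))
```

With this check, a reference containing ":" is rejected until it is
rewritten with "..", a leading "_" passes, and a leading numeral fails
either way.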

Todd

> My "XML in a Nutshell" reference book says that XML name start
> characters are letters, ideographs, and the underscore, _. If
> we want to conform to XML usage, we should allow ideographs,
> underscore, but no _ or digits in osisIDs, I guess.
> 
> -Harry
> 
> > -----Original Message-----
> > From: owner-osis-core@bibletechnologieswg.org
> > [mailto:owner-osis-core@bibletechnologieswg.org] On Behalf Of
> > Todd Tillinghast
> > Sent: Wednesday, August 14, 2002 3:34 PM
> > To: osis-core@bibletechnologieswg.org
> > Subject: RE: [osis-core] OSIS work regex
> >
> >
> > I think I am clear now on the proposal.
> >
> > Although we don't intend to use our ids as XML IDs, by
> > allowing a leading "_" we preclude others from using the same
> > syntax/form and set of identifiers in other implementations.
> > This weakens our standard.
> >
> > I hope that encoders other than those encoding OSIS documents
> > would use identifiers that are of the same "currency" as our
> > references and identifiers.  By eliminating the option for
> > those identifiers to be XML IDs we limit the possibility for
> > wider adoption, influence and interoperability with OSIS documents.
> >
> > Todd
> >
> > >
> > > Todd,
> > >
> > > I don't think Harry meant "_" as an extra delimiter (in the same
> > > sense as "." is a delimiter in our syntax) but more as a name
> > > character in writing customary citations of names. It is in a sense
> > > a delimiter but as part of the name to be matched as a string and
> > > not a delimiter. (Does that make any sense at all? Perhaps Harry can
> > > state what he meant more clearly. ;-)
> > >
> > > Patrick
> > >
> > > Todd Tillinghast wrote:
> > >
> > > >What extra value does the "_" give us?
> > > >
> > > >Are you proposing Bible_.TEV_ ?
> > > >
> > > >Or just that "_" would be an option as in
> > > >Bible.Todd_New_And_Different_Reference_System ?
> > > >
> > > >I can see "_" as an allowable character as long as it is not the
> > > >leading character but don't see any value in having it as an
> > > >additional delimiter to ".".
> > > >
> > > >Todd
> > > >
> > > >>-----Original Message-----
> > > >>From: owner-osis-core@bibletechnologieswg.org
> > > >>[mailto:owner-osis-core@bibletechnologieswg.org] On Behalf Of
> > > >>Harry Plantinga
> > > >>Sent: Wednesday, August 14, 2002 7:26 AM
> > > >>To: osis-core@bibletechnologieswg.org
> > > >>Subject: RE: [osis-core] OSIS work regex
> > > >>
> > > >>If schema RegExps behave as they do in Perl, the ? is superfluous.
> > > >>Perhaps
> > > >>
> > > >>  [\L\N][\.\L\N]*
> > > >>
> > > >>The underscore character (_) is pretty commonly used in names and
> > > >>may be present in documents converted to OSIS. I can't see that it
> > > >>would do any harm. Could it be included?  Perhaps
> > > >>
> > > >> [\L\N_][\.\L\N_]*
> > > >>
> > > >>-Harry
> > > >>
> > > >>----------------------------------
> > > >>For the work portion:
> > > >>
> > > >><xs:pattern value = "([\L\N\.]([\L\N\.]*)?)" />
> > > >>
> > > >>By which I am trying to say, any letter or number combination,
> > > >>followed by a period is compulsory, followed by any number of
> > > >>optional letter/number combinations that also end in a period
> > > >>(periods, hyphens, etc., being excluded from the work name).
> > > >>
> > > >
> > >
> > > --
> > > Patrick Durusau
> > > Director of Research and Development
> > > Society of Biblical Literature
> > > pdurusau@emory.edu
> > >
> > >
> >
> >