[osis-core] OSIS work regex

Todd Tillinghast osis-core@bibletechnologieswg.org
Wed, 14 Aug 2002 17:55:49 -0600


The conclusion is that leading numerals are allowed AND that we will
retain the ":" AND that OSIS references and identifiers should NOT be
expected to be a valid XML ID.

If you don't agree with this statement please post a reply.

I agree with it.  I would like to have had valid XML IDs, but we must
move on, and there seems to be too much for us to give up for a benefit
we would not realize.
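
To make the trade-off concrete, here is a rough Python sketch of why a leading numeral or a ":" rules out use as an XML ID. The pattern only approximates the XML 1.0 NCName form (Python's \w stands in for the XML name-character classes), and the sample identifiers are made up for illustration, not taken from the schema.

```python
import re

# Approximation of an XML 1.0 NCName (the lexical form an XML ID must take):
# a letter or "_" first, then letters, digits, ".", "-" or "_".
# Python's \w is close to, but not identical with, the XML name classes.
NCNAME_APPROX = re.compile(r"[^\W\d][\w.\-]*")


def could_be_xml_id(value: str) -> bool:
    """Return True if value fits the approximate NCName pattern."""
    return NCNAME_APPROX.fullmatch(value) is not None


# Hypothetical sample identifiers, for illustration only.
for sample in ["Matt.1.1", "_Matt", "1Cor.13.4", "Bible:Matt.1.1"]:
    print(sample, could_be_xml_id(sample))
```

Under this approximation the leading "_" passes, while the leading numeral and the ":" both fail, which is the price of keeping them.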

Patrick, hope your paper came!

Todd
> 
> Guys,
> 
> I am very near exhaustion and Steve will be taking over tonight to
> shepherd the final hours of discussion.
> 
> My laptop died and I am having problems trying to hurry through
> installation of validation services on the Linux box. (Will have
> another go in the morning.)
> 
> Todd, we may have to fall back on you for not only good advice but
> validation as well! Sorry 'bout that!
> 
> On the leading numeral question, we addressed that in Rome and may not
> have communicated it very well, but it was one of the reasons we moved
> away from osisIDs being XML IDs. The leading number constraint just did
> not get us anything but conformance with unnecessary pain. I don't
> honestly remember why it was the rule in ISO 8879, but XML requiring it
> was just backwards compatibility, with no real justification for it.
> 
> Users are familiar with 1Cor, etc., and there is no gain from forcing a
> change.
> 
> Signing off now to get some rest. Back at it (hopefully with greater
> clarity) in the morning.
> 
> Patrick
> 
> Todd Tillinghast wrote:
> 
> >We had decided NOT to allow a leading numeral.  I believe that only in
> >the recent incarnations of the schema, where we started from scratch
> >with the reg exp, have we dropped that no-leading-numeral requirement.
> >
> >The bigger question is will we keep the ":".  That will be much more
> >painful to remove than disallowing a leading numeral.
> >
> >Also I don't see a reason not to allow ideographs, etc.
> >
> >Todd
> >
> >>I was speaking hypothetically -- if we are going to try to
> >>conform to XML name character usage, that is what we would have
> >>to do.
> >>
> >>But, we've already decided not to conform -- we're allowing a number
> >>to start out osisIDs. So, I suggest we allow letters, digits, and _
> >>as start characters.
> >>
> >>Also, we seem to be ignoring the use of ideographs and accented
> >>characters in names. It's OK with me, but I want to make sure it's
> >>intentional.
> >>
> >>-Harry
> >>
> >>>-----Original Message-----
> >>>From: owner-osis-core@bibletechnologieswg.org
> >>>[mailto:owner-osis-core@bibletechnologieswg.org] On Behalf Of
> >>>Todd Tillinghast
> >>>Sent: Wednesday, August 14, 2002 4:35 PM
> >>>To: osis-core@bibletechnologieswg.org
> >>>Subject: RE: [osis-core] OSIS work regex
> >>>
> >>>
> >>>The statement below does not make sense to me.  It seems you
> >>>are saying two conflicting things.  In any case, it seems
> >>>that you are saying that we should conform to the XML standard.
> >>>
> >>>I guess what I am suggesting is that we have references that
> >>>can be XML IDs.  I am not sure what all of the precluded and
> >>>allowed characters are.  I know that Patrick was much better
> >>>versed in this when we talked several months ago on this very topic.
> >>>
> >>>The trouble with this whole line of discussion is that ":",
> >>>"[", and "]" are not allowed in XML IDs!
> >>>
> >>>Also the test I did with an "_" leading was ok, it was the
> >>>leading number that was the problem we had before.
> >>>
> >>>SORRY FOR THE BOGUS DETOUR RELATED TO "_"!
> >>>
> >>>The issue still remains related to OSIS references and
> >>>identifiers as XML IDs.  I think that is why I was using ".."
> >>>rather than ":" long ago. If we trade the ":" for ".." and do
> >>>away with the "[" and "]" then we would be back with a valid
> >>>XML ID.  (Of course ALLOW "_" and preclude a numeral as the
> >>>leading character.)
> >>>
> >>>Todd
> >>>
> >>>>My "XML in a Nutshell" reference book says that XML name start
> >>>>characters are letters, ideographs, and the underscore, _.  If we
> >>>>want to conform to XML usage, we should allow ideographs, underscore,
> >>>>but no _ or digits in osisIDs, I guess.
> >>>>
> >>>>-Harry
> >>>>
> >>>>>-----Original Message-----
> >>>>>From: owner-osis-core@bibletechnologieswg.org
> >>>>>[mailto:owner-osis-core@bibletechnologieswg.org] On Behalf Of Todd
> >>>>>Tillinghast
> >>>>>Sent: Wednesday, August 14, 2002 3:34 PM
> >>>>>To: osis-core@bibletechnologieswg.org
> >>>>>Subject: RE: [osis-core] OSIS work regex
> >>>>>
> >>>>>
> >>>>>I think I am clear now on the proposal.
> >>>>>
> >>>>>Although we don't intend to use our ids as XML IDs, by allowing a
> >>>>>leading "_" we preclude others from using the same syntax/form and
> >>>>>set of identifiers in other implementations. This weakens our
> >>>>>standard.
> >>>>>
> >>>>>I hope that encoders other than those encoding OSIS documents would
> >>>>>use identifiers that are of the same "currency" as our references
> >>>>>and identifiers.  By eliminating the option for those identifiers to
> >>>>>be XML IDs, we limit the possibility for wider adoption, influence
> >>>>>and interoperability with OSIS documents.
> >>>>>
> >>>>>Todd
> >>>>>
> >>>>>>Todd,
> >>>>>>
> >>>>>>I don't think Harry meant "_" as an extra delimiter (in the same
> >>>>>>sense as "." is a delimiter in our syntax) but more as a name
> >>>>>>character in writing customary citations of names. It is in a
> >>>>>>sense a delimiter but as part of the name to be matched as a
> >>>>>>string and not a delimiter. (Does that make any sense at all?
> >>>>>>Perhaps Harry can state what he meant more clearly. ;-)
> >>>>>>
> >>>>>>Patrick
> >>>>>>
> >>>>>>Todd Tillinghast wrote:
> >>>>>>
> >>>>>>>What extra value does the "_" give us?
> >>>>>>>
> >>>>>>>Are you proposing Bible_.TEV_ ?
> >>>>>>>
> >>>>>>>Or just that "_" would be an option as in
> >>>>>>>Bible.Todd_New_And_Different_Reference_System ?
> >>>>>>>
> >>>>>>>I can see "_" as an allowable character as long as it is not the
> >>>>>>>leading character but don't see any value in having it as an
> >>>>>>>additional delimiter to ".".
> >>>>>>>
> >>>>>>>Todd
> >>>>>>>
> >>>>>>>>-----Original Message-----
> >>>>>>>>From: owner-osis-core@bibletechnologieswg.org
> >>>>>>>>[mailto:owner-osis-core@bibletechnologieswg.org] On Behalf Of
> >>>>>>>>Harry Plantinga
> >>>>>>>>Sent: Wednesday, August 14, 2002 7:26 AM
> >>>>>>>>To: osis-core@bibletechnologieswg.org
> >>>>>>>>Subject: RE: [osis-core] OSIS work regex
> >>>>>>>>
> >>>>>>>>If schema RegExps behave as they do in Perl, the ? is superfluous.
> >>>>>>>>Perhaps
> >>>>>>>>
> >>>>>>>> [\L\N][\.\L\N]*
> >>>>>>>>
> >>>>>>>>The underscore character (_) is pretty commonly used in names and
> >>>>>>>>may be present in documents converted to OSIS. I can't see that it
> >>>>>>>>would do any harm. Could it be included?  Perhaps
> >>>>>>>>
> >>>>>>>>[\L\N_][\.\L\N_]*
> >>>>>>>>
> >>>>>>>>-Harry
> >>>>>>>>
> >>>>>>>>----------------------------------
> >>>>>>>>For the work portion:
> >>>>>>>>
> >>>>>>>><xs:pattern value = "([\L\N\.]([\L\N\.]*)?)" />
> >>>>>>>>
> >>>>>>>>By which I am trying to say, any letter or number combination,
> >>>>>>>>followed by a period is compulsory, followed by any number of
> >>>>>>>>optional letter/number combinations that also end in a period
> >>>>>>>>(periods, hyphens, etc., being excluded from the work name).
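
For anyone comparing the patterns quoted above, here is a rough Python sketch. It assumes \L and \N are meant as the Unicode letter and number classes (which XML Schema writes as \p{L} and \p{N}) and approximates them with Python's [^\W_], so it is not a substitute for testing in a schema validator. The "prose_reading" entry is only one literal reading of the description above, not a pattern anyone posted.

```python
import re

# Assumed stand-in for the thread's \L and \N classes: any Unicode
# alphanumeric character (Python's \w minus the underscore).
A = r"[^\W_]"

PATTERNS = {
    # ([\L\N\.]([\L\N\.]*)?) as first posted
    "original": rf"((?:{A}|\.)((?:{A}|\.)*)?)",
    # the same pattern with the trailing "?" dropped
    "without_optional": rf"(?:{A}|\.)(?:{A}|\.)*",
    # [\L\N][\.\L\N]*
    "harry": rf"{A}(?:{A}|\.)*",
    # [\L\N_][\.\L\N_]*
    "harry_underscore": r"\w[\w.]*",
    # hypothetical: one or more letter/number runs, each ending in a period
    "prose_reading": rf"(?:{A}+\.)+",
}


def matches(name: str, value: str) -> bool:
    """Return True if the whole value matches the named pattern."""
    return re.fullmatch(PATTERNS[name], value) is not None


SAMPLES = ["Bible.TEV", ".TEV", "1Cor", "Bible.Todd_New", "Bible.TEV."]
for value in SAMPLES:
    print(value, {name: matches(name, value) for name in PATTERNS})
```

On these samples the original pattern and the version without the "?" agree everywhere, the original accepts a leading period where the simplified one does not, and only the underscore variant accepts "Bible.Todd_New".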
> >>>>>>>>
> >>>>>>--
> >>>>>>Patrick Durusau
> >>>>>>Director of Research and Development
> >>>>>>Society of Biblical Literature
> >>>>>>pdurusau@emory.edu
> >>>>>>
> >>>>>>
> >>>>>
> >>>
> >
> 
> --
> Patrick Durusau
> Director of Research and Development
> Society of Biblical Literature
> pdurusau@emory.edu
> 
>