[osis-core] empty tag / milestone proposal

Steve DeRose osis-core@bibletechnologieswg.org
Tue, 25 Jun 2002 11:51:56 -0400


At 02:58 PM -0400 06/20/02, Harry Plantinga wrote:
>One good reason for marking up quotes with <q>...</q>
>is that different symbols are used for opening and
>closing quotes in different languages. If I want to
>render a Greek NT for Spanish or English readers, I
>might do quotes different ways. (Why I'm not just using
>Greek quotes I don't know.)
>
>Using &quot; specifies a particular glyph, and changing
>that to something else seems messy.

I agree; I think it's also illegal to redefine any of the five 
predefined XML entities to mean something else.
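
As a quick check of the "predefined" part, here is a small Python 
sketch using the standard library's xml.etree.ElementTree parser. It 
only shows that the five names resolve to fixed characters with no 
declarations at all, and that other names do not; the element name 
<t> and the &ldquo; reference are just illustrative assumptions.

```python
import xml.etree.ElementTree as ET

# No DTD and no entity declarations: the five names are built in.
elem = ET.fromstring("<t>&lt; &gt; &amp; &apos; &quot;</t>")
predefined = elem.text.split()
print(predefined)

# Any other entity name must be declared before it can be used.
try:
    ET.fromstring("<t>&ldquo;</t>")
    undeclared_ok = True
except ET.ParseError:
    undeclared_ok = False
print(undeclared_ok)
```

So &quot; always yields the one straight double-quote character, 
which is why it cannot stand in for language-specific quote marks.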

>
>Another benefit of the <q mStart=""/> <q mEnd=""/> approach
>is that OSIS-level-2-conformant (or whatever, as Steve
>suggested) software WOULD be able to treat the quote as
>a container.
>
>Maybe this can be a general rule. In OSIS-level-1 conformant
>software <element mStart=""/> has the semantics of an
>empty tag; in OSIS-level-2 conformant software it's a container.
>
>So here's a general proposal.
>
>We offer the current DTD. The semantics are that in OSIS-level-1
>conformant processing software, empty tags are empty tags. In
>OSIS-level-2 conformant processing software, milestones and empty
><verse> <q> etc tags can be container start and end tags.

I think that's worth serious attention -- we would of course also 
want true empty tags to still be possible, so we should have a 
principled way to tell the milestone-pair cases from the true 
empties: either a tag-naming convention, or some required attributes 
(I guess mStart/mEnd/neither would do it). Oh, we'd also want some 
rule for how to do discontiguous ones, for completeness (no point 
building a whole layer of conventions and software requirements for 
half the job, only to have to add another later).
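
To make the "mStart/mEnd/neither" test concrete, here is a minimal 
Python sketch, not a proposal for normative behavior. It assumes the 
attribute names mStart and mEnd from this thread; the function name 
milestone_kind and the <lb/> element are hypothetical stand-ins.

```python
import xml.etree.ElementTree as ET

def milestone_kind(elem):
    """Classify an element as 'start', 'end', 'empty', or 'container'."""
    has_start = "mStart" in elem.attrib
    has_end = "mEnd" in elem.attrib
    if has_start and has_end:
        raise ValueError("element carries both mStart and mEnd")
    has_content = len(elem) > 0 or bool(elem.text)
    if has_start or has_end:
        if has_content:
            raise ValueError("milestone element must be empty")
        return "start" if has_start else "end"
    # Neither attribute: an ordinary element, empty or not.
    return "container" if has_content else "empty"

doc = ET.fromstring(
    '<p><q mStart=""/>some quoted text<q mEnd=""/><lb/><q>more</q></p>'
)
kinds = [milestone_kind(child) for child in doc]
print(kinds)
```

The point is only that the decision can be made from the element 
itself, without looking at its neighbors.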

I guess next/prev would do it; true empties would always have 
neither, the first milestone would only have next, the last only 
prev, and ones in the middle would have both; but unless we added 
something more, you could never tell which ones were starts and which 
ends without counting all the way to one end or the other (and 
editing errors could produce ugliness like all the starts/ends 
swapping roles -- gak!). So maybe something like next, prev, and 
milestone-role = (start, end, restart, reend, none)?
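
Here is a rough Python sketch of what checking such a chain might 
look like. Everything in it is an assumption for illustration: the 
ids, the dict layout, the function name check_chain, and especially 
my reading of the roles as start, end, then restart/reend repeated 
for each later piece of a discontiguous element.

```python
def check_chain(milestones):
    """Validate one linked chain of milestones; return ids in order.

    `milestones` maps a hypothetical id to a dict with optional
    'next' and 'prev' ids and a 'role'.
    """
    heads = [i for i, m in milestones.items() if m.get("prev") is None]
    if len(heads) != 1:
        raise ValueError("expected exactly one milestone without prev")
    order, seen, current = [], set(), heads[0]
    while current is not None:
        if current in seen:
            raise ValueError("cycle in next links")
        seen.add(current)
        order.append(current)
        nxt = milestones[current].get("next")
        # Every next link must be mirrored by a prev link.
        if nxt is not None and milestones.get(nxt, {}).get("prev") != current:
            raise ValueError(f"prev of {nxt} does not point back to {current}")
        current = nxt
    if len(order) != len(milestones):
        raise ValueError("some milestones are not reachable from the head")
    if len(order) % 2:
        raise ValueError("chain ends on an opening milestone")
    # Assumed role pattern: start, end, then (restart, reend) repeated.
    for pos, ident in enumerate(order):
        if pos < 2:
            expected = ("start", "end")[pos]
        else:
            expected = "restart" if pos % 2 == 0 else "reend"
        if milestones[ident].get("role") != expected:
            raise ValueError(f"{ident}: expected role {expected}")
    return order

chain = {
    "a": {"next": "b", "role": "start"},
    "b": {"prev": "a", "next": "c", "role": "end"},
    "c": {"prev": "b", "next": "d", "role": "restart"},
    "d": {"prev": "c", "role": "reend"},
}
print(check_chain(chain))
```

With explicit roles, a swapped start/end is caught at the first 
element checked rather than by counting to one end of the chain.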

XSLT that would transform milestones into segments and back would be 
awfully nice. Perhaps something we could try to scare up funding for 
at some point.
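
Just to show the easy half of that transform, here is a minimal 
sketch in Python rather than XSLT, using xml.etree.ElementTree. It 
assumes the mStart/mEnd attributes from this thread, handles only 
the simple case where both milestones are siblings under one parent, 
and does not touch overlap or discontiguous chains. The function 
name and the <hi> element are hypothetical.

```python
import xml.etree.ElementTree as ET

def milestones_to_container(parent, tag="q"):
    """Wrap each sibling mStart/mEnd pair of `tag` in one container."""
    rebuilt = []            # new child list for parent
    container = None        # container currently being filled, if any
    for child in list(parent):
        is_ms = child.tag == tag and len(child) == 0 and not child.text
        if is_ms and "mStart" in child.attrib and container is None:
            container = ET.Element(tag)
            container.text = child.tail    # text right after the start
            rebuilt.append(container)
        elif is_ms and "mEnd" in child.attrib and container is not None:
            container.tail = child.tail    # text right after the end
            container = None
        elif container is not None:
            container.append(child)        # content between the pair
        else:
            rebuilt.append(child)
    if container is not None:
        raise ValueError("mStart without a matching sibling mEnd")
    parent[:] = rebuilt
    return parent

src = ('<p>He said, <q mStart=""/>some <hi>quoted</hi> text'
       '<q mEnd=""/> and left.</p>')
result = ET.tostring(milestones_to_container(ET.fromstring(src)),
                     encoding="unicode")
print(result)
```

The hard cases are exactly the ones this skips: a start and end 
under different parents cannot become a single container without 
splitting it into segments.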

>
>-Harry
>
>
>>  -----Original Message-----
>>  From: owner-osis-core@bibletechnologieswg.org
>>  [mailto:owner-osis-core@bibletechnologieswg.org]On Behalf Of Patrick
>>  Durusau
>>  Sent: Thursday, June 20, 2002 2:02 PM
>>  To: osis-core@bibletechnologieswg.org
>>  Subject: Re: [osis-core] empty tag / milestone proposal
>>
>>
>>  Harry,
>>
>>  Harry Plantinga wrote:
>>
>>  >Hey, here's an idea that will eliminate a large percentage of
>>  >the hierarchy overlap problems that have been identified so far.
>>  >Don't use <q>, just use ".
>>  >
>>  >Just kidding.  The real proposal is to not treat <q> as a container.
>>  >(Does one ever really need it to be a container?)  Use <qstart>
>>  >and <qend>.  Or use <q mStart=""/> <q mEnd=""/> but don't require
>>  >processing software to treat that as a container.  Just use it
>>  >to put in the appropriate " symbols.
>>  >
>>  Realize you were kidding about " but why not suggest the entities &quot;
>>  and &apos; which are pre-defined for XML? If what you are seriously
>>  suggesting is markup to stand in for symbols, <q mStart=" "/>, the
>>  entity route gets you there without tormenting markup. ;-)
>>
>>  Patrick
>>
>>
>>  >
>>  >-whp
>>  >
>>  >>-----Original Message-----
>>  >>From: owner-osis-core@bibletechnologieswg.org
>>  >>[mailto:owner-osis-core@bibletechnologieswg.org]On Behalf Of Steve
>>  >>DeRose
>>  >>Sent: Wednesday, June 19, 2002 10:06 PM
>>  >>To: osis-core@bibletechnologieswg.org
>>  >>Subject: RE: [osis-core] empty tag / milestone proposal
>>  >>
>>  >>
>>  >>Like Harry, I'm torn over this (and want to go to bed).
>>  >>
>>  >>At least the number of choices is small. It seems like we're down to
>>  >>
>>  >>a) use segments
>>  >>
>>  >>b) use milestones
>>  >>
>>  >>c) allow both
>>  >>
>>  >>Troy and Harry have described the tradeoffs really well, IMHO.
>>  >>
>>  >>The usual TEI approach in such cases was to allow both. This has
>>  >>advantages similar to those of many Vatican II pronouncements:
>>  >>everybody feels they got what they wanted; and disadvantages likewise
>>  >>similar: nobody really ended up compatible.
>>  >>
>>  >>I think we need to either prohibit or explicitly allow the use of
>>  >>empty forms. Although Patrick is right that you can always dump in
>>  >>empty elements for the start and end, the semantics implied by that
>>  >>syntax are not what we want.
>>  >>
>>  >>    <q mStart=""/>some quoted text<q mEnd=""/>
>>  >>
>>  >>means 3 siblings, 2 being empty quotations. That's really not the
>>  >>same meaning as
>>  >>
>>  >>    <q>some quoted text</q>
>>  >>
>>  >>(Eudora's spellchecker kindly underscores the tags for me, thus
>>  >>making those q's look an awful lot like g's).
>>  >>
>>  >>As someone pointed out, it's not the same DOM tree, and
>>  >>structure-aware tools such as CSS and XSLT don't have any way to deal
>>  >>with it (that one worries me considerably, since people commonly
>>  >>judge by appearances, and our appearances would be handicapped in
>>  >>most systems).
>>  >>
>>  >>Thus, although people could encode quotes with pairs of empties,
>>  >>their data would fail to "work" in typical software.
>>  >>
>>  >>Mainly for that reason, I think I'm inclined to a solution such as:
>>  >>
>>  >>a) permit only segmentation in core, but document clearly how it gets
>>  >>messy (explosively messy) as the amount of overlap increases.
>>  >>
>>  >>b) create a specific module for heavy annotation, that adds the
>>  >>mStart/mEnd construct for a lot of element types, that defines the
>>  >>semantics intended, and that discusses the issues involved. Make
>>  >>support of this module a separate conformance level, and require that
>>  >>systems specify whether they support it or not.
>>  >>
>>  >>To paraphrase Zoot: Oooh, it's not a very good solution, is it? But
>>  >>we are nice, and will see to your every markup need.
>>  >>
>>  >>--
>>  >>
>>  >>Steve DeRose -- http://www.stg.brown.edu/~sjd
>>  >>Chair, Bible Technologies Group -- http://www.bibletechnologies.net
>>  >>Email: sderose@speakeasy.net
>>  >>Backup email: sderose@mac.com, sjd@stg.brown.edu
>>  >>
>>
>>  --
>>  Patrick Durusau
>>  Director of Research and Development
>>  Society of Biblical Literature
>>  pdurusau@emory.edu
>>
>>
>>


-- 

Steve DeRose -- http://www.stg.brown.edu/~sjd
Chair, Bible Technologies Group -- http://www.bibletechnologies.net
Email: sderose@speakeasy.net
Backup email: sderose@mac.com, sjd@stg.brown.edu