[osis-core] DOM? PLEASE READ.

Troy A. Griffitts osis-core@bibletechnologieswg.org
Thu, 23 May 2002 18:06:56 -0700


Patrick,
	Thanks for the reply.  Appreciate your time, even while away!

>> I don't understand exactly why we're again trying to force a single tree
>> hierarchy on our document (I don't know why XML hasn't been forced to
>> make DOM inherit from a rename of the current DOM spec to TOM).  A
>> 'document' is almost NEVER a single hierarchy tree structure-- business
>> sales data- maybe, other database-like information- sure, but not
>> marked-up books! :)
>>
> Single tree hierarchy is inherent in SGML/XML markup. Don't know how to 
> put it any more plainly. When you say SGML/XML you have said "I am using 
> a single tree hierarchy for my document." All variant hierarchies 
> therefore are something of a "hack" to use your term in the single tree.

Right.  My point, which I didn't make too well, was that I believe (who 
am I?) that the current DOM definition is messed up for representing text 
documents.  Sorry; can I still be a member of osis-core?


>> At one point you all had me convinced that almost everything should be a
>> milestone.  Now you're trying to convince me that we should have NO
>> milestones.  I'm so distraught. :)
>>
> You were present when we discussed this change in Rome but let me try to 
> reconstruct some of the discussion. I think it was Eric who pointed out 
> that while they used milestones (a lot) in XSEM, that actually the times 
> when boundaries are crossed are actually quite few in comparison with 
> the number of times that they don't. Therefore, we had this elaborate 
> structure of opening and closing milestones to deal with a small number 
> of cases. Steve pointed out that using the prev/next solution from TEI 
> could easily be used with XSLT to generate the required presentation. 
> Eric suggested that we use the key/keyRef mechanism to validate the 
> references between the two parts (more on that follows). I thought it 
> was a fairly elegant solution and reduced the number of elements and 
> would be easier to teach to users for the small number of cases where it 
> is actually an issue.
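
For reference, here is a rough sketch of the two encodings being compared,
for one quotation that crosses a verse boundary.  The element and attribute
names (qStart, qEnd, q, next, prev) are illustrative assumptions, not taken
from the OSIS schema, and the sample text is made up.  Both forms recover
the same quotation; the difference is where the extra work goes.

```python
import xml.etree.ElementTree as ET

# (a) Milestone style: the quotation is bounded by empty start/end markers.
#     Names are hypothetical, for illustration only.
MILESTONES = """
<p>
  <verse id="v1">He said, <qStart id="q1"/>Go now,</verse>
  <verse id="v2">and do not return.<qEnd ref="q1"/> So he left.</verse>
</p>
"""

# (b) Segmented style: the quotation is split at the verse boundary and the
#     two pieces are chained with next/prev.
SEGMENTED = """
<p>
  <verse id="v1">He said, <q id="q1a" next="q1b">Go now,</q></verse>
  <verse id="v2"><q id="q1b" prev="q1a">and do not return.</q> So he left.</verse>
</p>
"""


def walk(el):
    """Yield ('tag', element) and ('text', str) items in document order."""
    yield ("tag", el)
    if el.text:
        yield ("text", el.text)
    for child in el:
        yield from walk(child)
        if child.tail:
            yield ("text", child.tail)


def quote_from_milestones(xml_text):
    """Collect the text found between the qStart and qEnd markers."""
    inside, parts = False, []
    for kind, item in walk(ET.fromstring(xml_text)):
        if kind == "tag" and item.tag == "qStart":
            inside = True
        elif kind == "tag" and item.tag == "qEnd":
            inside = False
        elif kind == "text" and inside:
            parts.append(item)
    # Collapse the whitespace that came from line breaks between verses.
    return " ".join("".join(parts).split())


def quotes_from_segments(xml_text):
    """Follow each next/prev chain, starting from segments with no prev."""
    root = ET.fromstring(xml_text)
    by_id = {q.get("id"): q for q in root.iter("q")}
    quotes = []
    for seg in by_id.values():
        if seg.get("prev") is not None:
            continue  # not the head of a chain
        parts = []
        while seg is not None:
            parts.append("".join(seg.itertext()).strip())
            seg = by_id.get(seg.get("next"))
        quotes.append(" ".join(parts))
    return quotes


print(quote_from_milestones(MILESTONES))
print(quotes_from_segments(SEGMENTED))
```

In (a) every reader of the text has to track marker state while walking the
tree; in (b) ordinary container handling works, and only the split quotation
needs the extra chain-following step.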

Honestly-- thought it was probably my fault for sleeping, or zoning, or 
something-- I have no remembrance of any of this.  You sure it wasn't 
after Sunday afternoon?



>> I don't think a good argument is that it makes things a little easier
>> when using standard xml transformation tools.  REASON:  None of our real
>> users will use standard XML transformation tools for a whole Bible. 
>> They currently can't, unless they own a Cray with 5 gigs of memory.  And
>> even if they could, we're not helping simplify the problem; instead, all
>> we seem to be doing is making the simple case work easier, but adding
>> more complexity to the fully marked up cases.
>>
> None of our users write filters to transform hundreds of texts from text 
> format into various presentation formats either. ;-) Well, only a small 
> minority of them. ;-)

Actually, in my mind, our users ARE exactly these people.  They are Bob, 
and me, and the sfm-to-OSIS-converter-guy at Wycliffe, and anyone else 
writing software to deal with texts in this markup.  In fact, I 
specifically remember Bob backing and defending our decision to use 
milestones.


>> When we start adding modules like translator markup, analytical markup,
>> etc., we're gonna have total hacks all over to get around this "crossing
>> containment" problem.
>>
> Actually not, at least in my opinion. First of all, the crossing 
> containment situation is the minority (I would guess far less than 5% of 
> the cases, actually probably less than that.)

I'm trying to imagine the finished picture here-- once we start adding 
more modules.

Imagine a base text marked up simply, with OSIS core.  OK.
Then add my <w> tags for Strong's numbers and morph.  Probably still OK.
Then add translators' comments.
Then Kirk's linguistic annotation.
Then publisher preferences (display 'hints'?)
What else are we gonna add?...

Think NON-XML here: with containment which can cross boundaries, this 
doesn't scare me.


> Second, the translator and 
> analytic markup cases will probably rely more upon pointing mechanism 
> since they layer onto the text as encoded, rather than being part of the 
> text layer itself. (Yes, commentary is part of the text (or tree) but as 
> you and Todd have often pointed out, they are not part of the "biblical 
> text" that we are concerned with encoding. I may be pointing to text 
> that crosses boundaries but that is a different issue and not one that 
> requires a "hack" to solve.)

External markup pointing into the base doc?  I understand your argument. 
I think it might be a good workaround for the containment problem in 
XML.  I don't think anyone knows how to use such a mechanism, or will use 
one, for quite some time, and perhaps never if the XML spec finally gets 
adjusted or extended to support such content.
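
A minimal sketch of what "pointing into the base doc" could look like if
you think outside XML for a moment: each layer is a set of character ranges
over one base text, so crossing boundaries needs no special handling.  The
Span type, the layer names, and the sample text are all hypothetical, not
part of OSIS or of any existing tool.

```python
from dataclasses import dataclass

# Made-up base text; annotation layers point into it by character offset.
BASE = "He said, Go now, and do not return. So he left."


@dataclass(frozen=True)
class Span:
    layer: str  # hypothetical layer name, e.g. "verse" or "quote"
    start: int  # offset into BASE, inclusive
    end: int    # offset into BASE, exclusive


def span(layer, first, last):
    """Build a Span running from the phrase `first` through the phrase `last`."""
    return Span(layer, BASE.index(first), BASE.index(last) + len(last))


SPANS = [
    span("verse", "He said,", "Go now,"),
    span("verse", "and do not", "So he left."),
    span("quote", "Go now,", "do not return."),
]


def text_of(s):
    """Return the stretch of base text a span points at."""
    return BASE[s.start:s.end]


def crosses(a, b):
    """True if a and b overlap without either one containing the other."""
    overlap = a.start < b.end and b.start < a.end
    nested = (a.start <= b.start and b.end <= a.end) or (
        b.start <= a.start and a.end <= b.end
    )
    return overlap and not nested


quote = SPANS[2]
print(text_of(quote))
print([text_of(s) for s in SPANS if s.layer == "verse" and crosses(s, quote)])
```

Here the quote crosses both verses and nothing has to be split or chained.
The cost is that the offsets are only valid for one exact version of the
base text.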


>> I don't like our new approach, currently, and want to be convinced
>> otherwise, if you guys will indulge my inquiry.
>>
> Assuming that we are going to finish an XML schema, I am not sure what 
> other approach reduces the complexity of our markup,

Well, the complexity issue isn't different with milestones.


> deals with the 
> minority of cases by requiring more markup only for those cases (a good 
> thing in my opinion,

Agreed, if indeed you still feel this is the minority of the cases in a 
fully marked up OSIS text.


> harder cases are the only ones that require more 
> work), and represents the artificial boundaries that are familiar to 
> readers and others that are desired by translators.

Didn't understand 'artificial'.  I would think anything we want to allow 
an author to contain is a logical container.  Not sure of your point here.


Thanks again for humoring me.  I understand it might be too late to even 
think about these things, but it seems kind of odd that our entire base 
approach to marking up has changed in the 3 weeks since the OSIS 
conference.  This is our foundation for the spec.  I'm afraid of what 
we're building on.

	-Troy.