[osis-core] Milestones: Reply to Troy's comments

Patrick Durusau osis-core@bibletechnologieswg.org
Wed, 22 May 2002 21:20:17 -0400


Troy,

Woke up way too early but decided to write a response and then go back 
to bed!

Troy A. Griffitts wrote:

>OK guy, I'm gonna give myself away here, but here are my sincere
>comments.
>
>
>I don't understand exactly why we're again trying to force a single tree
>hierarchy on our document (I don't know why XML hasn't been forced to
>make DOM inherit from a rename of the current DOM spec to TOM).  A
>'document' is almost NEVER a single hierarchy tree structure-- business
>sales data- maybe, other database-like information- sure, but not
>marked-up books! :)
>
A single tree hierarchy is inherent in SGML/XML markup. I don't know how 
to put it any more plainly. When you say SGML/XML, you have said "I am 
using a single tree hierarchy for my document." All variant hierarchies 
are therefore something of a "hack" (to use your term) within the single 
tree.

>
>At one point you all had me convinced that almost everything should be a
>milestone.  Now you're trying to convince me that we should have NO
>milestones.  I'm so distraught. :)
>
You were present when we discussed this change in Rome, but let me try 
to reconstruct some of the discussion. I think it was Eric who pointed 
out that while they used milestones (a lot) in XSEM, the times when 
boundaries are actually crossed are quite few in comparison with the 
times when they are not. We therefore had an elaborate structure of 
opening and closing milestones to deal with a small number of cases. 
Steve pointed out that the prev/next solution from TEI could easily be 
used with XSLT to generate the required presentation. Eric suggested 
that we use the key/keyRef mechanism to validate the references between 
the two parts (more on that follows). I thought it was a fairly elegant 
solution: it reduced the number of elements and would be easier to teach 
to users for the small number of cases where crossing is actually an 
issue.
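
To make the prev/next idea concrete, here is a minimal sketch in Python 
(rather than XSLT) of how a processor might stitch the parts back 
together for presentation. The element name "q", the attribute names 
"id", "next" and "prev", and the sample text are assumptions made up for 
illustration; they are not the schema syntax.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: one quotation split across a verse boundary.
SAMPLE = """
<chapter>
  <verse id="v1">He said, <q id="q1a" next="q1b">Go to the city</q></verse>
  <verse id="v2"><q id="q1b" prev="q1a">and wait there.</q> So they went.</verse>
  <verse id="v3"><q id="q2">Peace.</q></verse>
</chapter>
"""

def join_segments(root, tag="q"):
    """Return the text of each logical element, following 'next' links."""
    by_id = {el.get("id"): el for el in root.iter(tag) if el.get("id")}
    joined = []
    for el in root.iter(tag):
        if el.get("prev") is not None:
            continue  # a continuation; it is reached from the head of its chain
        parts, seen, current = [], set(), el
        while current is not None and current.get("id") not in seen:
            seen.add(current.get("id"))  # guards against circular links
            parts.append("".join(current.itertext()).strip())
            nxt = current.get("next")
            current = by_id.get(nxt) if nxt else None
        joined.append(" ".join(parts))
    return joined

print(join_segments(ET.fromstring(SAMPLE.strip())))
# ['Go to the city and wait there.', 'Peace.']
```

Note that the ordinary case (q2 above) carries no extra attributes at 
all; only the boundary-crossing case pays for the linking.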

key/keyRef: The syntax as you see it is the result of the requirements 
of W3C schemas. In order to have key/keyRef, the key must NEVER be a NIL 
value. In other words, to have key/keyRef with next/prev attribute 
values, every verse would have to have a next attribute with its value 
specified. That sounded really burdensome to me, so I tried to cheat and 
require only the ID on verses, something you would likely have anyway, 
and that would get you the key. We can make the prev attribute optional 
since you only want it for the boundary-crossing cases.

That, however, was an error on my part in constructing the syntax, since 
when you are at an element you now can't tell whether it has a "next" 
component. Bad error on my part.

So, I think we are at the point where we need to drop the cheap 
validation of key/keyRef, which is really a database mechanism, and go 
with the traditional next/prev as optional attributes on all likely 
crossing elements.
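
If we give up schema-level validation, the same consistency check could 
still be run as a separate step. The following is only a sketch under 
assumed attribute names ("id", "next", "prev"); it is one possible 
approach, not something we have agreed on. It reports any next/prev 
value that points at a missing ID or that is not answered by the 
matching attribute on the other part.

```python
import xml.etree.ElementTree as ET

def check_links(root):
    """Return a list of problems found in optional next/prev references."""
    by_id = {el.get("id"): el for el in root.iter() if el.get("id")}
    problems = []
    for el in root.iter():
        for attr, back in (("next", "prev"), ("prev", "next")):
            target_id = el.get(attr)
            if target_id is None:
                continue  # attribute is optional: nothing to check
            target = by_id.get(target_id)
            if target is None:
                problems.append(
                    f"{el.get('id')}: {attr}='{target_id}' has no matching id")
            elif target.get(back) != el.get("id"):
                problems.append(
                    f"{el.get('id')}: {attr}='{target_id}' is not reciprocated")
    return problems

# Hypothetical fragments: the first is consistent, the second is not.
GOOD = '<c><q id="a" next="b"/><q id="b" prev="a"/><q id="c"/></c>'
BAD = '<c><q id="a" next="b"/><q id="b"/><q id="c" prev="z"/></c>'

print(check_links(ET.fromstring(GOOD)))
print(check_links(ET.fromstring(BAD)))
```

Elements that cross no boundary need neither attribute, so users enter 
values only where a crossing actually occurs.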

>
>I don't think a good argument is that it makes things a little easier
>when using standard xml transformation tools.  REASON:  None of our real
>users will use standard XML transformation tools for a whole Bible. 
>They currently can't, unless they own a Cray with 5 gigs of memory.  And
>even if they could, we're not helping simplify the problem; instead, all
>we seem to be doing is making the simple case work easier, but adding
>more complexity to the fully marked up cases.
>
None of our users write filters to transform hundreds of texts from text 
format into various presentation formats either. ;-) Well, only a small 
minority of them. ;-)

Not sure what you mean by the "simple case" in terms of markup. Some 
things that are quite easy to do are more difficult to represent in 
markup: embedding single quotes within double quotes, for example, in a 
passage that spans multiple numbered verses, to represent embedded 
quotations that cross verse boundaries. The difference is that you 
handle that with fairly complex logic that your users never see, but the 
logic is still there.

>
>When we start adding modules like translator markup, analytical markup,
>etc., we're gonna have total hacks all over to get around this "crossing
>containment" problem.
>
Actually not, at least in my opinion. First of all, the crossing 
containment situation is the minority (I would guess less than 5% of 
the cases, and probably far less than that). Second, the translator and 
analytic markup cases will probably rely more upon pointing mechanisms, 
since they layer onto the text as encoded rather than being part of the 
text layer itself. (Yes, commentary is part of the text (or tree), but as 
you and Todd have often pointed out, it is not part of the "biblical 
text" that we are concerned with encoding. I may be pointing to text 
that crosses boundaries, but that is a different issue and not one that 
requires a "hack" to solve.)
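
As a rough illustration of what I mean by a pointing mechanism, here is 
a small Python sketch in which a note lives outside the text and simply 
points at a span of it. The verse identifiers, the sample text, and the 
(verse id, character offset) pointer form are all invented for this 
example; a real pointer syntax remains to be decided.

```python
from dataclasses import dataclass

# Hypothetical base text, keyed by verse identifier, in reading order.
VERSES = {
    "v1": "He went up to the city,",
    "v2": "and stayed there three days.",
}
ORDER = ["v1", "v2"]

@dataclass
class Note:
    start: tuple  # (verse id, character offset), inclusive
    end: tuple    # (verse id, character offset), exclusive
    body: str

def resolve(note):
    """Return the base text a note points at, even across verses."""
    (start_verse, start_off), (end_verse, end_off) = note.start, note.end
    i, j = ORDER.index(start_verse), ORDER.index(end_verse)
    if i == j:
        return VERSES[start_verse][start_off:end_off]
    middle = [VERSES[v] for v in ORDER[i + 1:j]]
    return " ".join([VERSES[start_verse][start_off:], *middle,
                     VERSES[end_verse][:end_off]])

note = Note(start=("v1", VERSES["v1"].index("the city")),
            end=("v2", len("and stayed")),
            body="A translator's note on this phrase.")
print(resolve(note))
```

The note crosses the verse boundary without adding any markup to the 
text layer itself.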

>
>I don't like our new approach, currently, and want to be convinced
>otherwise, if you guys will indulge my inquiry.
>
Assuming that we are going to finish an XML schema, I am not sure what 
other approach reduces the complexity of our markup, deals with the 
minority of cases by requiring more markup only for those cases (a good 
thing in my opinion: the harder cases are the only ones that require 
more work), and represents both the artificial boundaries that are 
familiar to readers and the others that are desired by translators.

Suggestions at this point:

1. Keep the prev/next mechanism, but properly done (to correct my error).

2. Drop the cheap validation of key/keyRef, since it requires the key to 
be a non-nil value, i.e., it has to appear, which would make our users 
enter values for all cases whether crossing boundaries or not.

Sorry I had to drop off to Europe again at this point in our process, but 
I really think we need to go ahead and nail this one down and get it 
out. I will try to get as much done as I can before I return to the 
States on Saturday.

Patrick


>
>	Thanks for all the research and hard work!
>
>		-Troy.
>

-- 
Patrick Durusau
Director of Research and Development
Society of Biblical Literature
pdurusau@emory.edu