[sword-devel] XML idea: modular spec

Mike Sangrey sword-devel@crosswire.org
Fri, 31 Aug 2001 11:05:41 -0400


Dave, you've got some excellent ideas here.

Let me restate in my own words what I hear you saying and mix in some
of my own thinking.  Perhaps this will clarify (for me if no one else)
and maybe even generate some fill-in-the-blank thoughts.

I picture a set of concentric circles, the core of which is the raw
data.  Represented as XML, this core would look almost as simple
as:

<word>
  foo
</word>
...
...

Pretty simple; pretty dumb.  Your 3 dimensionality problem, I think,
clearly underscores the importance of keeping this layer quite simple.
The data could easily be represented in DB tables.  It is a known
problem among linguistic researchers that any structure applied to the
data precludes certain results and pre-determines research in certain
directions.  Your `slicing' analogy presents this picture quite well,
IMO.  Pre-determining the result of the slicing is not what we want.

Also, this information could be in an XML file, or in an Oracle DB, or
whatever, doesn't matter.  Three-by-five cards would be rough. <smile>
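
To make the "doesn't matter where it lives" point concrete, here is a
rough sketch in Python.  It assumes the flat <word> layout shown above;
the <core> wrapper element, the table name, the column names, and the
sample words are all made up for illustration.  SQLite stands in for
"an Oracle DB, or whatever."

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample data; the <core> wrapper is an assumption.
CORE_XML = "<core><word>foo</word><word>bar</word><word>baz</word></core>"


def words_from_xml(xml_text):
    """Read the flat word list from an XML string."""
    root = ET.fromstring(xml_text)
    return [w.text.strip() for w in root.iter("word")]


def load_into_db(words):
    """Store the words in an in-memory SQLite table (stand-in for any DB)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE word (position INTEGER PRIMARY KEY, text TEXT)")
    conn.executemany(
        "INSERT INTO word (position, text) VALUES (?, ?)",
        list(enumerate(words, start=1)),
    )
    conn.commit()
    return conn


def words_from_db(conn):
    """Read the same flat word list back from the table, in stored order."""
    rows = conn.execute("SELECT text FROM word ORDER BY position")
    return [text for (text,) in rows]


xml_words = words_from_xml(CORE_XML)
db_words = words_from_db(load_into_db(xml_words))
```

Either reader hands back the same plain list of words, so the outer
layers never need to know which storage was used.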

The second concentric layer gives the semantic model which defines the
structure of the data.  In XML, this may look like:

<para>
  <sentence type="question">
    foo, foo, and more foo
  </sentence>
</para>

Or, maybe each word is tagged in some complex way (`<verb
tense="present">').  Whatever!  The important thing is that something
takes the raw data of the core and transforms it into a slice of that
three dimensional reality.  This is why I liked Patrick Durusau's
<pdurusau@emory.edu> comments so well.  The model he (and Matthew
O'Donnell) presented divides the problem, IMO, in the right place
(thus making two smaller problems which can be solved more easily).

I picture this layer NOT as a function of the data as much as a
process applied TO the data.  For example, if XSLT (with XPath and
XLink and all the goodies) were the macro language then a person (a
trained one) could sit at their desktop and through this macro
language tell a server (which could be anywhere) "I want the data
delivered to me with this specific semantic structure."  In other
words, the end user could be given the power to structure the data as
he or she wants it.  IMO, the utility of this is enormous (think:
researchers collaborating by sharing HOW they structure the data).

Also, this layer would not necessarily have to be performed on the
fly.  It could be pre-generated.
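
Here is a rough sketch of "a process applied TO the data."  It is
plain Python standing in for the XSLT macro language, not XSLT itself.
The function name and the sentence_ends scheme (index of a sentence's
last word mapped to its type) are assumptions made up for illustration.

```python
import xml.etree.ElementTree as ET


def apply_structure(words, sentence_ends):
    """Group a flat word list into <para>/<sentence> elements.

    sentence_ends is a hypothetical scheme: it maps the index of the
    last word of each sentence to that sentence's type.
    """
    para = ET.Element("para")
    current = []
    for i, word in enumerate(words):
        current.append(word)
        if i in sentence_ends:
            sentence = ET.SubElement(para, "sentence", type=sentence_ends[i])
            sentence.text = " ".join(current)
            current = []
    if current:  # trailing words with no declared sentence ending
        sentence = ET.SubElement(para, "sentence", type="unknown")
        sentence.text = " ".join(current)
    return para


# The core data stays a flat list; only the slicing changes.
words = ["foo", "foo", "foo", "bar", "baz"]
one_view = apply_structure(words, {2: "question", 4: "statement"})
other_view = apply_structure(words, {4: "statement"})

result = ET.tostring(one_view, encoding="unicode")
other_result = ET.tostring(other_view, encoding="unicode")
```

Two researchers could share nothing more than their sentence_ends
choices and still reproduce each other's view of the same core.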

The next layer would be the processes which happen on the desktop.
This could also be XSLT with XPath and XLink and all the goodies, but
it leans a lot more on the FO (formatting) end of XSL than on the
transformation aspect.  Maybe this is a simple slap-it-up-on-the-screen
viewer, or maybe it is a complex piece of analytical software, or maybe
it's a translator's tool, or ...
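
As a rough sketch of the simplest case, the following Python takes
already-structured XML and turns it into plain text for the screen.
It is not XSL formatting, just a stand-in; the punctuation mapping
and the function name are assumptions made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from sentence type to closing punctuation.
PUNCTUATION = {"question": "?", "statement": "."}


def render_text(xml_text):
    """Render each <para> as one line of plain text."""
    root = ET.fromstring(xml_text)
    lines = []
    for para in root.iter("para"):  # iter() also yields the root itself
        sentences = []
        for s in para.findall("sentence"):
            # Collapse the whitespace left over from XML indentation.
            text = " ".join((s.text or "").split())
            sentences.append(text + PUNCTUATION.get(s.get("type"), "."))
        lines.append(" ".join(sentences))
    return "\n".join(lines)


SAMPLE = (
    '<para>\n'
    '  <sentence type="question">\n'
    '    foo, foo, and more foo\n'
    '  </sentence>\n'
    '</para>'
)
rendered = render_text(SAMPLE)
```

The structure arrives from the layer below untouched; this layer only
decides how it looks.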

The last layer is the GUI.

For you programmer types, this should look very much like the
three-tiered model--data, business rules, end-user application;
where data is in the DB, business rules live on the server, and
the end-user app. lives on the desktop.

O! The "business logic" (or semantic model of the structure) would not
absolutely HAVE to be on a server; it could be done on the desktop.
However, if it is on a server, then we can leverage collaboration
among people.  Think as wild as a white-board Bible study or as "low"
tech as an email one or even as input into a lithographic process for
just-in-time publication of Bible study helps.  It could be a "canned"
Bible study for a "ladies' tea" or research as sophisticated as
linguists from SIL and Roehampton discussing discourse features which
influence paragraph boundaries.  It could be...

Well, Dave, or anyone else, whaddaya think?

-- 
Mike Sangrey
msangrey@BlueFeltHat.org
Landisburg, Pa.
                        "The first one last wins."
            "A net of highly cohesive details reveals the truth."