[osis-core] OSIS editor

Patrick Durusau osis-core@bibletechnologieswg.org
Tue, 03 Dec 2002 06:22:50 -0500


Harry,

Harry Plantinga wrote:

>Some talk was batted about at the Toronto meeting
>on modifying OpenOffice.org Writer to act as an
>OSIS editor.  One issue is how you would "validate" 
>markup as it is entered with the Writer interface.
>
>I thought y'all might be interested in the following
>Microsoft tidbit as it pertains to this issue:
>
>http://www.microsoft.com/office/xdocs/default.asp
>
>Basically, it sounds to me like an XML editor that
>supports schemas and allows you to build "rich 
>forms" for data entry.  It will be a part of Microsoft 
>Office starting mid-2003.
>
Not much information is available, but I suspect from the description 
that it relies upon fairly simple schemas to build the forms. In other 
words, once inside a paragraph (our <p>), I seriously doubt that you 
will have the choice of a <list>, <lg>, or other possible container 
element. It might, but then the form would have to reconfigure itself 
based upon user choices. That is not difficult to imagine or implement, 
but I would be surprised if it has that level of customization. Most 
XML schemas for data entry are fairly straightforward, since data files 
tend to be rather simple. (At least when compared to documents.)

>
>It remains to be seen whether it will be as easy
>to use as a word processor for editing OSIS documents,
>but I'm not all that hopeful -- it'll probably be 
>a lot like XMetaL or XML Spy.  Though a debugged, 
>faster XML Spy could actually be a fairly decent
>environment for OSIS editing by casual users.
>
>But my suspicion is that to make OSIS editing truly
>easy and error-proof, it will take a custom 
>application with user-interface elements designed 
>especially for OSIS.
>
Actually I am planning on testing your proposition after I return from 
XML 2002 (too much to do between now and then) and would appreciate your 
comments (and everyone else's) on the following proposal:

I am simplifying the initial problem by assuming that all the header 
information would be developed by an XML-aware person and software, so I 
am beginning with elements inside <osisText>.

1. I will create styles in OpenOffice that map to the various OSIS 
elements. (I will post some material to their list to see if the styles 
available within another style can be restricted dynamically, but 
assuming not for the rest of these steps.)

2. I will take one of your texts and one from Sword and mark each up 
twice using the styles created in #1: once with what I think is "valid" 
OSIS markup and once with deliberately "invalid" (Todd, no comments 
please!) OSIS markup. ;-)

3. To process the markup from the saved file format, I will write XSLT 
stylesheets to key on the styles that were imposed in OpenOffice.

4. Part of the stylesheet will be a function to construct attribute 
values, such as osisIDs, from styled information in the text. For 
example,

<style="verse-cite">Gen.1.1</style><style="verse">When in the 
beginning...</style>

would result in: <verse osisID="Gen.1.1">When in the beginning...</verse>

Whereas,

Gen.1.1<style="verse">When in the beginning...</style>

would not transform because the required content for the osisID is not 
present. An error message would be returned, saying that the osisID was 
not found, along with the portion of text that needs the osisID (to help 
locate it in the text).
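
The behavior described in step 4 can be sketched roughly as follows. This 
is only an illustration in Python rather than the XSLT proposed above, and 
it assumes the styled text has already been reduced to a list of (style, 
text) runs. The function name runs_to_osis and that run format are 
assumptions made for the example, not part of OpenOffice or OSIS.

```python
from xml.sax.saxutils import escape, quoteattr


def runs_to_osis(runs):
    """Convert (style, text) runs into OSIS <verse> elements.

    Hypothetical helper: a "verse" run must be immediately preceded by a
    "verse-cite" run that supplies its osisID. Returns (elements, errors).
    """
    elements, errors = [], []
    pending_id = None
    for style, text in runs:
        if style == "verse-cite":
            # Remember the citation so the next verse can use it as osisID.
            pending_id = text.strip()
        elif style == "verse":
            if pending_id:
                elements.append(
                    "<verse osisID=%s>%s</verse>"
                    % (quoteattr(pending_id), escape(text))
                )
            else:
                # Report the start of the text to help locate the problem.
                errors.append(
                    "osisID not found for verse text: %r" % text[:40]
                )
            pending_id = None
        else:
            # Unstyled or other text never supplies an osisID.
            pending_id = None
    return elements, errors
```

Called with the first example above, this returns one <verse> element and 
no errors; called with the second (where "Gen.1.1" carries no style), it 
returns no elements and one error naming the verse text.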

Admittedly this is not dynamic validation while the user is working but 
it would allow a fairly gross level of markup to be imposed by very 
inexperienced users.

This would not help with word level annotation, but most of that should 
be done automatically in any event.

In the latter part of December I am actually going to be testing this 
technique for markup on a portion of the Chicago Assyrian Dictionary, 
which in some ways is more complex than OSIS documents to date but also 
more fixed in terms of structure.

Even if we can't limit the use of styles within styles, I suspect that, 
even before transformation into OSIS markup, we could write a script 
that enforces an order in which styles must be used. Hmmm, will have to 
think about that one. In other words, use container styles to denote 
what other styles may be found therein. Again, not dynamic error 
checking, but easier for inexperienced users and good for gross level 
markup (the bulk of markup in a document).
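
One possible shape for such a script, sketched in Python under heavy 
assumptions: the document is taken to be a flat list of (container style, 
style) pairs, and the ALLOWED table is a made-up example of which styles 
may sit inside which container. It is not derived from the OSIS schema, 
and check_nesting is a hypothetical name.

```python
# Hypothetical containment table: which styles may appear directly inside
# each container style. Illustrative only; not taken from the OSIS schema.
ALLOWED = {
    "div": {"p", "lg", "list"},
    "p": {"verse"},
    "lg": {"l"},
    "list": {"item"},
}


def check_nesting(paragraphs):
    """Check (container_style, style) pairs given in document order.

    Returns a list of error messages; the list is empty when every style
    is allowed inside its container.
    """
    errors = []
    for number, (container, style) in enumerate(paragraphs, start=1):
        allowed = ALLOWED.get(container)
        if allowed is None:
            errors.append(
                "paragraph %d: unknown container style %r" % (number, container)
            )
        elif style not in allowed:
            errors.append(
                "paragraph %d: style %r not allowed inside %r"
                % (number, style, container)
            )
    return errors
```

The table, rather than the code, carries the rules, so a different 
document type would only need a different table.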

Might buy us some acceptance to have a free, WYSIWYG encoding for OSIS 
documents while waiting on a true OSIS editor. Any thoughts on whether 
the limitation of styles by context would be a good way to provide a 
user interface for such an editor? Could have an active box with 
attributes that can take a value for that "style" (read element).

The latest round of papers and submissions and reviews should be over by 
December 18th (major presentation at the SBL) so look for postings on my 
experiments between then and the end of the year.

Would appreciate any comments or suggestions on this proposal before 
then as well. (References to other works would probably be something one 
has to add in on a per-document basis, i.e., match this string, then 
output this osisRef, assuming the document had a fairly consistent style 
of citing other works. Or perhaps they are just marked in the 
transformation for further processing in the next stage, i.e., the 
values are reported with XPath locations so you can create another 
stylesheet to go through and do all of those separately.)
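
A rough sketch of the "match this string, then output this osisRef" idea, 
in Python. The citation patterns in RULES are invented for the example and 
would have to be written per document, and the sketch reports character 
offsets instead of XPath locations purely to keep it short.

```python
import re

# Hypothetical per-document citation rules: (pattern, osisRef template).
# A real document would need rules matching its own citation style.
RULES = [
    (re.compile(r"\bGen\. (\d+):(\d+)"), r"Gen.\1.\2"),
    (re.compile(r"\bExod\. (\d+):(\d+)"), r"Exod.\1.\2"),
]


def find_references(text):
    """Return (offset, matched string, osisRef) tuples, sorted by offset."""
    found = []
    for pattern, template in RULES:
        for match in pattern.finditer(text):
            # expand() fills the template from the match's captured groups.
            found.append(
                (match.start(), match.group(0), match.expand(template))
            )
    return sorted(found)
```

The resulting list could either be used to write osisRef values directly 
or be handed to a later pass, as suggested above.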


Thoughts, comments?

Patrick

-- 
Patrick Durusau
Director of Research and Development
Society of Biblical Literature
pdurusau@emory.edu