[osis-core] OSIS editor

Harry Plantinga osis-core@bibletechnologieswg.org
Tue, 3 Dec 2002 10:03:07 -0500


Patrick,

I've done something similar with Microsoft Word for ThML, and
I've had usable results. The problem I have had is that users who
are not sufficiently constrained enter lots of bad "markup", and
hand validation (or the validation-on-save process) takes a long time.
However, I didn't make full use of Word facilities like auto Table
of Contents and embedded fields for <pb>s; these would have further 
reduced the opportunity for user error.

One point:  word processors like Word and presumably Writer 
allow three levels of markup:

<div>
  <p>
    <span>
    </span>
  </p>
</div>

So, map as much OSIS as you can onto 3 levels of hierarchy and
you'll have a good idea of the hierarchy that Writer will be 
able to handle with proper nesting guaranteed. Map the rest of 
OSIS onto milestone-like elements.

Another point: Word and presumably Writer allow you to add 
other milestone-like markup in the form of fields (or notes in
Writer), and this can be used for any OSIS markup that doesn't
fit into the 3-level hierarchy. There are user interface elements
for some of this milestone-like markup and generic facilities
for the rest. E.g. for <div> elements, use the table of contents 
facility.  For citations, use the bibliography reference facility. 
For other markup insert generic milestone-like elements.

Fields or milestone-like elements also can be used for attributes
of tags -- e.g. a milestone of type "attribute" has attributes for
the paragraph, character, or section element that contains it.

Writer allows you to create forms that restrict data entry. These
could be used for typing in basic header info.

You could handle split elements by using "continuation" styles,
e.g. a paragraph style called "verse" and another called
"verseCont".

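As a rough sketch of how a converter might undo continuation styles,
the Python below tags each paragraph with the logical element it
belongs to. The "Cont" suffix convention comes from the example above;
the function name and the (style, text) tuples are assumptions for
illustration, not anything Writer or OSIS defines.

```python
from typing import Dict, List, Tuple

CONT_SUFFIX = "Cont"  # assumed naming convention: "verse" / "verseCont"


def link_continuations(
    paragraphs: List[Tuple[str, str]]
) -> List[Tuple[str, int, str]]:
    """Tag each (style, text) paragraph with a logical element id.

    A paragraph whose style ends in "Cont" reuses the id of the most
    recent paragraph carrying the base style, so a later step can emit
    the pieces as one split element.
    """
    linked: List[Tuple[str, int, str]] = []
    latest: Dict[str, int] = {}  # base style -> id of its latest element
    next_id = 0
    for style, text in paragraphs:
        if style.endswith(CONT_SUFFIX):
            base = style[: -len(CONT_SUFFIX)]
            if base not in latest:
                raise ValueError(f"{style!r} appears before any {base!r}")
            linked.append((base, latest[base], text))
        else:
            latest[style] = next_id
            linked.append((style, next_id, text))
            next_id += 1
    return linked
```

A "verseCont" paragraph that follows an intervening heading still
shares the id of the "verse" before it, which is the information an
export step would need to reconnect the two halves.
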
You could map

<div> --> section
<p>, <v>, <l>, <list>, etc --> paragraph style
<w>, <seg>, <reference>, <a>, etc --> character style
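
A minimal sketch of that mapping as a lookup table, in Python. Only the
elements named above are listed (the "etc" is left unfilled), and
anything not in the table falls back to a milestone-like element, as
suggested earlier. The names STYLE_LEVEL and writer_level are made up
for the example.

```python
# Elements named in the mapping above; the lists are deliberately
# incomplete because the original mapping ends in "etc".
STYLE_LEVEL = {
    "div": "section",
    "p": "paragraph",
    "v": "paragraph",
    "l": "paragraph",
    "list": "paragraph",
    "w": "character",
    "seg": "character",
    "reference": "character",
    "a": "character",
}


def writer_level(element: str) -> str:
    """Return the Writer hierarchy level assumed for an OSIS element.

    Elements that do not fit the three-level hierarchy are treated as
    milestone-like markup (fields or notes).
    """
    return STYLE_LEVEL.get(element, "milestone")
```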

Finally, it will be possible to do much of the handling of 
milestones automatically using macros and buttons. I assume you
won't be writing lots of macros for your experiment in December, 
but having buttons for inserting <verse>, <osisID>, <osisRef>, and 
whatnot will eventually take away much of the opportunity for 
entry error.

Hmmmm, sounds fun...

-harry

----------------------------------------------------------

I think if I were going to work on this I would 

1.  print out the OpenOffice.org schema and Osis1.1.1 and make
a mapping for all elements, both ways.

2.  write osis-to-oo.xslt and oo-to-osis.xslt

3.  define paragraph and character styles, macros, etc. needed
to enter appropriate stuff in OpenOffice.
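
For step 1, one thing worth checking early is that the mapping really
works "both ways". The Python sketch below inverts a forward mapping
and complains if two OSIS elements land on the same OpenOffice style,
since the reverse transform could not tell them apart. The style names
in the usage example are invented; they are not taken from either
schema.

```python
from typing import Dict, List


def invert_mapping(osis_to_oo: Dict[str, str]) -> Dict[str, str]:
    """Build the OpenOffice-to-OSIS mapping from the OSIS-to-OpenOffice one.

    Raises ValueError when two OSIS names share one target style,
    because that would make the reverse direction ambiguous.
    """
    inverse: Dict[str, str] = {}
    collisions: Dict[str, List[str]] = {}
    for osis_name, oo_name in osis_to_oo.items():
        if oo_name in inverse:
            collisions.setdefault(oo_name, [inverse[oo_name]]).append(osis_name)
        else:
            inverse[oo_name] = osis_name
    if collisions:
        raise ValueError(f"ambiguous reverse mapping: {collisions}")
    return inverse


# Usage with hypothetical style names:
forward = {"p": "osisParagraph", "l": "osisLine"}
backward = invert_mapping(forward)
```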

-whp




> 
> Whereas,
> 
> Gen.1.1<style="verse">When in the beginning...</verse>
> 
> would not transform because the required content for the osisID is
> not present and an error message would be returned, saying that the
> osisID was not found, along with the portion of text that needs the
> osisID (to help locating it in the text).
> 
> Admittedly this is not dynamic validation while the user is working
> but it would allow a fairly gross level of markup to be imposed by
> very inexperienced users.
> 
> This would not help with word level annotation, but most of that
> should be done automatically in any event.
> 
> In the latter part of December I am actually going to be testing this
> technique for markup on a portion of the Chicago Assyrian Dictionary,
> which in some ways is more complex than OSIS documents to date but
> also more fixed in terms of structure.
> 
> Even if we can't limit the use of styles within styles, I suspect even
> before transformation into OSIS markup, we could actually write a
> script that enforces an order in which styles must be used. Hmmm, will
> have to think about that one. In other words, use container styles to
> denote what other styles may be found therein. Again, not dynamic
> error checking but easier for inexperienced users and good for gross
> level markup (the bulk of markup in a document).
> 
> Might buy us some acceptance to have a free, WYSIWYG encoding for OSIS
> documents while waiting on a true OSIS editor. Any thoughts on whether
> the limitation of styles by context would be a good way to provide a
> user interface for such an editor? Could have an active box with
> attributes that can take a value for that "style" (read element).
> 
> The latest round of papers and submissions and reviews should be over
> by December 18th (major presentation at the SBL) so look for postings
> on my experiments between then and the end of the year.
> 
> Would appreciate any comments or suggestions on this proposal before
> then as well. (References to other works would probably be something
> one has to add in on a per document basis, i.e., match this string,
> then output this osisRef, assuming the document had a fairly
> consistent style of citing other works. Or perhaps just so marked in
> the transformation for further processing in the next stage, i.e., the
> values are reported with XPath locations so you can create another
> stylesheet to go through and do all of those separately.)
> 
> 
> Thoughts, comments?
> 
> Patrick
> 
> -- 
> Patrick Durusau
> Director of Research and Development
> Society of Biblical Literature
> pdurusau@emory.edu
> 
> 
>