[osis-core] On using OpenOffice as an OSIS editor

Harry Plantinga osis-core@bibletechnologieswg.org
Mon, 17 Jun 2002 15:27:18 -0400


For convenience and ease of use, it would be best to be able 
to load and save documents in OSIS format from within Writer
using the normal load/save facilities. I searched the 
OpenOffice web site, and it appears possible to write load and 
save filters without modifying OpenOffice source code or linking 
against the OpenOffice Core.  Thus, you can write load and 
save filters that continue to work even as OpenOffice 
versions change. 

Presumably you'd want to read and write a collection of files: 
an OSIS XML text, an XSLT stylesheet (to convert to HTML, move
footnotes to the end of the page, etc.), a CSS stylesheet (to 
view the OSIS document in a web browser in all its formatted 
glory), and linked objects such as images. You could store all 
the files except the main XML file in a directory.

Buttons and macros that can be used inside Writer to help with 
entering and checking text could come later; if anyone really
wants to jump-start work on an OSIS editor, OSIS load and
save filters for Writer would be a great place to start.

-harry 

-----Original Message-----
From: owner-osis-core@bibletechnologieswg.org
[mailto:owner-osis-core@bibletechnologieswg.org]On Behalf Of Patrick
Durusau
Sent: Friday, June 14, 2002 7:23 AM
To: osis-core@bibletechnologieswg.org
Subject: Re: [osis-core] On using OpenOffice as an OSIS editor


Harry,

Just a brief and inadequate reply to your post on OpenOffice. ;-)

I have been using it for several weeks and, while it is sometimes slow, it 
seems fairly stable.

Not certain that we would have to use macros. The issue has arisen in 
TEI land (again!) of how to get users better tools for entering markup. 
One suggestion has been to use OpenOffice (with styles) and an XSLT 
stylesheet to convert the underlying XML from OpenOffice XML into TEI. 
(This originated in a discussion between Sebastian Rahtz and myself over 
writing an export filter for OpenOffice. Since OpenOffice has a native 
XML format, XSLT would be a simple way to test the difficulty of going 
from OpenOffice XML to TEI, without the overhead of writing the filter.)
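
A minimal sketch of the style-to-element mapping such a conversion would perform. It is written in Python rather than XSLT only so that it is self-contained; it is not the stylesheet discussed above. The namespace URI is the one the OpenOffice 1.0 text module uses. The style names ("Verse", "Note"), the TEI targets in STYLE_TO_TEI, and the function name are hypothetical, chosen for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace of the OpenOffice 1.0 text module.
TEXT_NS = "http://openoffice.org/2000/text"

# Hypothetical mapping from Writer paragraph style names to TEI element names.
STYLE_TO_TEI = {"Verse": "l", "Note": "note"}


def paragraphs_to_tei(oo_xml):
    """Turn text:h and text:p elements into a flat TEI-like <body>."""
    root = ET.fromstring(oo_xml)
    body = ET.Element("body")
    for elem in root.iter():
        if elem.tag == "{%s}h" % TEXT_NS:
            out = ET.SubElement(body, "head")
        elif elem.tag == "{%s}p" % TEXT_NS:
            style = elem.get("{%s}style-name" % TEXT_NS, "")
            # Unmapped styles fall back to an ordinary paragraph.
            out = ET.SubElement(body, STYLE_TO_TEI.get(style, "p"))
        else:
            continue
        # Inline markup is flattened to plain text in this sketch.
        out.text = "".join(elem.itertext())
    return ET.tostring(body, encoding="unicode")


# Invented sample input, not output captured from OpenOffice.
SAMPLE = (
    '<doc xmlns:text="http://openoffice.org/2000/text">'
    '<text:h text:level="1">Matthew 5</text:h>'
    '<text:p text:style-name="Verse">Blessed are the poor in spirit</text:p>'
    '<text:p text:style-name="Standard">A prose paragraph.</text:p>'
    '</doc>'
)

if __name__ == "__main__":
    print(paragraphs_to_tei(SAMPLE))
```

The sketch ignores nesting, inline spans, and footnotes, which is where most of the real difficulty of such a conversion would probably lie.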

If I get some time this weekend, I may try to input a chapter or so, 
probably the Matthew chapter that has been the subject of so much 
discussion, to see what sort of XML we would get from OpenOffice with no 
tweaking. Might be a good measure of how much trouble we would encounter 
with such an approach.

Thanks!

Patrick

Harry Plantinga wrote:

>Preface: I've thought for years how to make ThML easy
>for a non-XML-user to edit and I haven't yet come up 
>with a solution that gets the documents all the way
>to the valid XML stage. I've tried using Word as an
>editor with a custom stylesheet and macros, and that's
>about the best solution I've had, but it leaves quite
>a bit of work for an expert to correct markup, validate
>the document, convert to XML, etc. Often several hours
>per document. 
>
>I'd more or less given up on Word because I want the
>resulting documents to be valid XML, not requiring 
>additional work. (Requiring an XML expert to finish up
>documents is a major bottleneck in the pipeline, to mix
>metaphors slightly.) The obvious approach is an XML 
>editor, and this summer I'm experimenting with XMetaL.  
>
>In theory it is a very nice approach. You can edit in
>a view that looks as wysiwyg as CSS can make it. 
>You can write Word import macros and save in HTML
>or PDF as well as XML. You can preview in a browser
>with XSLT and CSS styling.  You can add macros, buttons,
>and the like to the user interface.
>
>In practice, it's working out reasonably well. The main
>gotchas are that the software is poorly documented in some
>cases, slow, buggy, and possibly in flux (Corel recently
>bought SoftQuad). Oh, and it costs hundreds of dollars
>and runs only on Windows. 
>
>Reading up on the archive for this list, I came across
>the discussion about using OpenOffice, and I thought I'd
>give it another look. (Last time I checked, it couldn't 
>print, etc.)  I expected to report that it wouldn't be
>appropriate without extensive source code hacking, for
>the same reason that Word isn't great: the content model
>is pretty flat and basic, making it hard to use to validate
>more complex content models. 
>
>============= summary ========================
>
>I came away from my exploration thinking that one could
>do a pretty decent job of an OSIS editor with fairly
>extensive macro programming but no source-code hacking. 
>Maybe a few months' effort. There are sufficient UI 
>elements to do a decent to good job, but not 
>great: I doubt it will be possible to prevent illegal 
>structure entry. It'll require a "validate" button and a 
>validation process to correct errors before the document 
>can be saved in OSIS format.
>
>=========== about OpenOffice ==========================
>
>OpenOffice has several modules: word processor, spreadsheet,
>drawing program, presentation program, etc. All use XML
>as their native file format. The suite has recently been
>released in Version 1.0.  It's quite a full-featured 
>near-clone of Microsoft Office, and it works quite well.
>There are still lots of little gotchas in reading or saving
>Microsoft Office documents though. OpenOffice is free, open,
>and available for Windows, Mac, Linux, Unix, etc.  Download
>from www.openoffice.org
>
>
>=========== about OpenOffice's text DTD module =========
>
>The OpenOffice DTD has many modules. One, called text, is
>the primary one for the word processor, though it doesn't
>contain the table elements.  It has 181 elements, 
>including 84 with content model PCDATA and 38 EMPTY. The
>main structure is that sections contain paragraph-level
>elements (p, h, lists, tables, indexes, etc.).  Paragraph-
>level elements contain inline elements (PCDATA, span, tabstop,
>bookmark, drawing, a, set-page-variable, reference-mark-start,
>footnote-ref, etc. etc.).
>
>It doesn't have a nice mapping to OSIS, but it may be possible
>to "fake it" as described below.
>
>========= Proposal for editing OSIS with OpenOffice =========
>
>osisText/header:
>  - store information in predefined OpenOffice elements or 
>    an OpenOffice field element of type user-defined.
>  - make an OpenOffice form to enter the data in the document.
>
>OSIS front, body, back
>  - use OpenOffice section elements
>
>OSIS divs
>  - use outlining facility of OpenOffice. Specifically, use the 
>    <h> element, which has a numeric level attribute.  
>  - each heading is the start of a new div, with the heading
>    level giving the nesting depth of the div
>  - each div ends at the next heading paragraph
>  - text of the heading could be used as the divTitle
>  - maybe display the heading in reverse video to show that 
>    the text of the heading is not part of the document flow
>  - bonus: OpenOffice "outline view" would show the div structure
>    of the document.
>
>OSIS linegroups
>  - the list facility appears to have sufficient structure
>    to handle lines and linegroups
>
>Verses split across paragraph boundaries
>  - select the text of the verse and click a "verse" button. A 
>    macro could prompt for verse identifier and add prev and 
>    next attributes to span any paragraph boundaries.
>
>Loading, saving
>  - a macro or plug-in could read and save OSIS documents. 
>  - bonus: importing Word documents, saving HTML and PDF.
>
>Word-level markup. 
>  - I suppose you could do it with a combination of macros, 
>    spans, and user-defined field elements.
>
>-Harry
>
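
The div rule in the proposal quoted above (each heading starts a new div, the heading level gives the nesting depth, and a div ends at the next heading of the same or a higher level) can be sketched roughly as follows. This is only an illustration in Python, not an OpenOffice macro or filter. The input tuple format and the function name are hypothetical, and treating divTitle as a child element is an assumption.

```python
import xml.etree.ElementTree as ET


def headings_to_divs(items):
    """Build nested <div> elements from a flat, ordered paragraph list.

    items: ("h", level, text) for headings (level >= 1) or
           ("p", None, text) for ordinary paragraphs.
    """
    body = ET.Element("body")
    # Stack of (heading level, open container); body sits at level 0.
    stack = [(0, body)]
    for kind, level, text in items:
        if kind == "h":
            # A heading closes every open div at the same or a deeper level.
            while stack[-1][0] >= level:
                stack.pop()
            div = ET.SubElement(stack[-1][1], "div")
            # Assumption: the heading text becomes a divTitle child element.
            ET.SubElement(div, "divTitle").text = text
            stack.append((level, div))
        else:
            # Paragraphs go into the innermost open div (or body).
            ET.SubElement(stack[-1][1], "p").text = text
    return body


# Invented sample input for illustration.
SAMPLE_ITEMS = [
    ("h", 1, "Matthew"),
    ("h", 2, "Chapter 1"),
    ("p", None, "text a"),
    ("h", 2, "Chapter 2"),
    ("p", None, "text b"),
    ("h", 1, "Mark"),
]

if __name__ == "__main__":
    print(ET.tostring(headings_to_divs(SAMPLE_ITEMS), encoding="unicode"))
```

Keeping the heading level on the stack, rather than using the stack depth, means a skipped level (a level-3 heading directly under level 1) still closes correctly at the next heading.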

-- 
Patrick Durusau
Director of Research and Development
Society of Biblical Literature
pdurusau@emory.edu