[sword-devel] Re: The death of OSIS?

Kahunapule Michael P. Johnson Kahunapule at mpj.cx
Wed Aug 11 05:09:49 MST 2004

At 19:37 11-08-04, Patrick Durusau wrote:
>As one of the principals in the OSIS project I must confess I am not 
>encouraged by statements announcing the illness or death of OSIS.

I'm sorry. My intention was not to discourage you, but to get your attention, and to let you know just how serious I think the situation is. This is NOT the first time I have brought up the one issue that I think alone could kill OSIS, at least as far as I'm concerned. However, I don't think I ever succeeded in conveying the seriousness of my concern, and I think the urgency of my request was totally lost in the shuffle.

I'll try again to make some constructive suggestions. If that works, then I can become your ally instead of your competitor. I would much prefer to work with you than against you, but I'll do whatever it takes to do my job well. Please read this message carefully, as it contains constructive criticism, and your response will affect what I do in developing software that does or does not support OSIS this month.

>Rather than simply taking shots from the cheap seats, perhaps you would 
>like to suggest markup based solutions to any problems you encounter 
>with OSIS?

I will gladly suggest a markup-based solution. I fully understand the difference between markup for meaning and presentation matters such as fonts and placement on a page. I expect markup for a Bible, such as OSIS, to specify all of the text of the Bible, including all alphabetic characters and punctuation, plus structural markers showing where books, chapters, verses, paragraphs, titles, etc. are. I expect that if I encode a Bible text using a standard like OSIS, and another process reads that same file with reference to the same standard, it will be capable of reproducing the full text of the Bible and its associated information (footnotes, subtitles, etc.) in some reasonable way. It may be displayed in HTML, converted to PDF for a pocket Bible, printed in large type for the visually impaired, or displayed on a PDA screen. Whatever the details of the formatting, I insist that the text and punctuation are all there. I also have a strong preference that certain style elements are preserved, such as poetry and prose formatting. Some Bible study programs (the better ones) preserve poetry and prose formatting. Some do not. However, all Bible study software that I'm aware of at least tries to get the alphabetic and punctuation characters right.

Are you still with me?

The problem I have with OSIS (at least the version of documentation that I have) is that it does not encode enough information to reliably reconstitute quotation mark punctuation for the range of languages and Bible translations that I work with. It doesn't even cover English properly. The reason is that you state in the documentation that quotations should be marked with <q who="Nameofspeaker" sID="someuniquething">....<q who="Nameofspeaker" eID="someuniquething"> and NOT with the quotation marks. This is OK for SOME situations; to wit: standard English texts using the same quotation punctuation rules as the NIV, and Bible texts in languages that happen to use the same characters and rules for quotation marks. This is NOT OK for other situations; to wit: English texts using different quotation mark styles (like the NASB) or no quotation marks at all (like the KJV). It occurs to me that by just ignoring <q> and <speech> altogether, I could put in the normal quotation punctuation for the given language as Unicode characters in the right places and be happy-- except for two things.

One is that I want to encode some (but not all) of the Bible texts for "red letter" editions. Actually, I don't really mean to specify that the words of Jesus have to be in red. I just want to mark the direct quotes of Jesus in a way that makes it easy for those who wish to present the Bible text to display the direct quotes of Jesus in red (or some other distinctive way) if they want to. I don't even care if people display Jesus' direct quotes in red or not, but I do care that if they do, the markers are in the right places so that the correct words are marked. I can use <q who="Jesus" sID="book.chapter.verse.0">...<q who="Jesus" eID="book.chapter.verse.0"> for that, but then if I do that for the KJV, will the application reading the OSIS file add quotation marks? If I use OSIS for a language that uses different quotation marks, what will happen? What about open quote reminders at new paragraphs and stanzas? Will they be inserted when they aren't supposed to be?
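A minimal sketch of how a reading application might pick out the words of one speaker for "red letter" display without touching the punctuation at all. It assumes the q milestones are written as self-closing XML elements inside a flat container, and the function name, the `verse` wrapper and the sample text are invented for illustration; none of this is taken from the OSIS documentation.

```python
import xml.etree.ElementTree as ET


def speaker_spans(xml_text, speaker="Jesus"):
    """Split a flat fragment into (text, spoken_by_speaker) pairs.

    Only the who/sID/eID attributes are consulted; no quotation
    marks are added or removed.
    """
    root = ET.fromstring(xml_text)
    spans = []
    active = False  # True while we are between the speaker's sID and eID
    if root.text:
        spans.append((root.text, active))
    for child in root:
        if child.tag == "q" and child.get("who") == speaker:
            if child.get("sID") is not None:
                active = True
            elif child.get("eID") is not None:
                active = False
        # Text following a milestone belongs to the state it just set.
        if child.tail:
            spans.append((child.tail, active))
    return spans


sample = (
    '<verse>He said, <q who="Jesus" sID="x"/>Follow me.'
    '<q who="Jesus" eID="x"/> They went.</verse>'
)
for text, is_speaker in speaker_spans(sample):
    print(repr(text), "RED" if is_speaker else "plain")
```

A display layer could then color the marked spans or ignore the flag entirely, which is the separation between marking and presentation described above.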

The other problem with controlling quotation punctuation with OSIS and always using markup (i.e., q or speech elements) is that there are not just start and end locations. There are also open quote reminder locations. This gets confusing. Can I specify that a quotation starts at a given location with one character, continues at a paragraph boundary with a different character, then ends with still another character? Would it be OK to use a duplicated sID in a q milestone element to indicate that this is a part of the same quotation, but more punctuation is needed here?
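To make the duplicated-sID question concrete, here is a rough sketch of how a reader could tell the three kinds of location apart if a repeated sID were allowed to mean "open quote reminder". That interpretation is hypothetical (it is the question being asked, not documented OSIS behavior), and the function name and sample markup are invented for the example.

```python
import xml.etree.ElementTree as ET


def classify_milestones(xml_text):
    """Label each q milestone as 'open', 'reminder' or 'close'.

    Hypothetical rule: an sID seen for the first time opens a
    quotation; the same sID seen again before its eID is an
    open quote reminder; an eID closes the quotation.
    """
    root = ET.fromstring(xml_text)
    open_ids = set()
    roles = []
    for q in root.iter("q"):  # document order
        start, end = q.get("sID"), q.get("eID")
        if start is not None:
            role = "reminder" if start in open_ids else "open"
            open_ids.add(start)
            roles.append((role, start))
        elif end is not None:
            open_ids.discard(end)
            roles.append(("close", end))
    return roles


sample = (
    '<div><p><q sID="a"/>First paragraph.</p>'
    '<p><q sID="a"/>Second paragraph.<q eID="a"/></p></div>'
)
print(classify_milestones(sample))
```

With the three roles distinguished, each could in principle carry its own character, which is what a language with distinct opening, continuing and closing marks would need.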

In short, I consider the placement of quotation punctuation and the selection of characters to be used for quotation punctuation to be a part of the Bible translation text itself, and if any encoding, like OSIS, cannot guarantee that these characters are maintained in their original locations, then that encoding is defective.

Do you see the problem?

Now, let me suggest at least two possible solutions that are easy to incorporate into the OSIS standard. First, let me explicitly state what I'm trying to accomplish:

1. Preserve the current OPTION in OSIS to generate quotation punctuation with markup.

2. Preserve the OPTION in OSIS to mark quotations by speaker for specialized searches or, in the case of Jesus' direct quotes, to color or present them in some different way.

3. Add the OPTION to control quotation punctuation precisely for languages and styles that differ from the "usual" in the type and placement locations of quotation punctuation.

Suggested solution number 1 (recommended):

Document that any <q> or <speech> element marked with an attribute of n=" " (a blank space) should not be taken as an instruction to insert any quotation mark. Rather, in this case, it should be assumed that the correct punctuation is already in the text as a Unicode character (just like other kinds of punctuation). <q> or <speech> elements not so marked would be taken as an instruction to insert quotation punctuation in the manner that the NIV English Bible does, including open quote reminders, and alternating double and single typographic quotes for nested quotes.
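A minimal sketch of how a reading application might apply this rule. It assumes self-closing q milestones in a flat container, handles only the start and end marks (open quote reminders are left out), and uses invented sample text; the function name and the `verse` wrapper are illustrative, not part of OSIS.

```python
import xml.etree.ElementTree as ET

# NIV-like typographic quotes: double outside, single inside, alternating.
OPEN_MARKS = ["\u201c", "\u2018"]
CLOSE_MARKS = ["\u201d", "\u2019"]


def render_fragment(xml_text):
    """Render text, generating quote marks only for q milestones
    that do NOT carry the proposed n=" " attribute."""
    root = ET.fromstring(xml_text)
    out = [root.text or ""]
    depth = 0  # nesting depth of quotations whose marks are generated
    for child in root:
        if child.tag == "q" and child.get("n") != " ":
            if child.get("sID") is not None:
                out.append(OPEN_MARKS[depth % 2])
                depth += 1
            elif child.get("eID") is not None:
                depth -= 1
                out.append(CLOSE_MARKS[depth % 2])
        # With n=" " nothing is emitted: the punctuation, if any,
        # is already present in the text as ordinary characters.
        out.append(child.tail or "")
    return "".join(out)


generated = '<verse>He said, <q who="Jesus" sID="q1"/>Follow me.<q who="Jesus" eID="q1"/></verse>'
literal = '<verse>He said, <q who="Jesus" sID="q2" n=" "/>Follow me.<q who="Jesus" eID="q2" n=" "/></verse>'
print(render_fragment(generated))
print(render_fragment(literal))
```

In both cases the who attribute remains available for speaker searches or red-letter display; only the punctuation behavior changes.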

Suggested solution number 2:

If for some bizarre reason you are opposed to letting quotation punctuation exist as a normal Unicode character in the text, you could (1) allow the exact character to be used to be specified with its hexadecimal code position in the n attribute of the q or speech element, (2) define two other elements to specify if open quote reminders are appropriate at new paragraphs and stanzas, and (3) specify what the open quote reminder character should be.
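Part (1) of this suggestion could be read as follows. This is only a sketch under the assumption that the n attribute holds a bare hexadecimal code position such as 00AB; the function name and the fallback behavior are invented for illustration.

```python
def quote_char_from_n(n_value, default):
    """Return the quotation character named by a hexadecimal code
    position in the n attribute, or the default when n is absent
    or blank."""
    if n_value is None or n_value.strip() == "":
        return default
    return chr(int(n_value, 16))


# U+00AB and U+00BB are the left and right double angle quotation marks.
print(quote_char_from_n("00AB", "\u201c"))
print(quote_char_from_n("00BB", "\u201d"))
print(quote_char_from_n(None, "\u201c"))
```

A malformed value would raise ValueError here; a real reader would have to decide whether to reject the file or fall back to the default.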

Suggested solution number 3:

Make something up-- anything that solves the problem above, and ask me if I think it would work or not.

By the way, I would be happy to help you proofread and review the next release of OSIS documentation and schema.

>Hope you are having a great day!

I am. It is about my bedtime now...
