[sword-devel] Sword OSIS quotation mark handling question

DM Smith dmsmith555 at yahoo.com
Tue May 1 06:09:19 MST 2007

Kahunapule Michael Johnson wrote:
> So... it sounds like I could simply convert USFM to OSIS with the
> obvious conversions (like \p ... -> <p>...</p>) plus

Remember to have the <p> surround the entire paragraph.

In various sources I have seen the equivalent of \p used as nothing more 
than a paragraph separator, with the ambiguity that the first verse of a 
chapter does not have a \p. There may also be paragraphs that don't begin 
or end on a chapter boundary.
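
As a rough illustration of treating \p as a separator and still emitting 
container paragraphs, here is a minimal Python sketch. The function name 
wrap_paragraphs is hypothetical, and the sketch assumes the whole book is 
passed in as one string, so that a paragraph may run across a chapter 
marker and any text before the first \p becomes its own paragraph. Other 
USFM markers are left untouched.

```python
import re


def wrap_paragraphs(usfm_text):
    """Wrap each run of text between \\p markers in <p>...</p>.

    Hypothetical helper: only \\p itself is handled; \\c, \\v and other
    markers are passed through unchanged.
    """
    # Require whitespace (or end of input) after \p so that longer
    # markers such as \pi are not mistaken for \p.
    chunks = re.split(r"\\p(?:\s+|$)", usfm_text)
    # Text before the first \p has no marker of its own, but it is still
    # a paragraph, so it is wrapped like every other chunk.
    return "\n".join(f"<p>{chunk.strip()}</p>" for chunk in chunks if chunk.strip())


if __name__ == "__main__":
    sample = "\\c 1 \\v 1 First verse. \\p \\v 2 Second verse. \\p\n\\v 3 Third verse."
    print(wrap_paragraphs(sample))
```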

osis2mod will convert the open and close tags to <lb 
type="x-begin-paragraph"/> and <lb type="x-end-paragraph"/>, 
respectively. These x- types are non-standard, but they allow a lossless 
reconstruction of the original.
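
To show why that mapping is lossless, here is a small Python sketch of 
the substitution and its inverse. This is not osis2mod's code; the 
function names are made up for illustration, and the sketch assumes <p> 
tags that carry no attributes.

```python
BEGIN_LB = '<lb type="x-begin-paragraph"/>'
END_LB = '<lb type="x-end-paragraph"/>'


def p_to_lb(osis):
    """Replace attribute-less <p> and </p> tags with the x- typed <lb/> markers."""
    return osis.replace("<p>", BEGIN_LB).replace("</p>", END_LB)


def lb_to_p(osis):
    """Inverse of p_to_lb: restore the original <p> container tags."""
    return osis.replace(BEGIN_LB, "<p>").replace(END_LB, "</p>")


if __name__ == "__main__":
    original = "<p>In the beginning</p><p>And the earth</p>"
    converted = p_to_lb(original)
    print(converted)
    print(lb_to_p(converted) == original)
```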

> \qt -> <seg type="otPassage" sID="someid"/>
> \qt* -> <seg type="otPassage" eID="someid"/>

When you use sID/eID, OSIS "requires" that they be paired and that each 
pair have a unique value. At this point in time Sword does not care 
about this.

I have found it constructive to keep a stack and a counter for each 
distinct milestone usage (e.g. <q>, <seg>, <div>). When an open element 
is found, its counter is pushed onto the stack and incremented. When a 
close element is found, the top value is popped off the stack. For 
quotes, I find the depth of the stack useful for populating the level 
attribute. If the stack is non-empty when the document is finished, 
then I have a bug somewhere.
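
A minimal Python sketch of that stack-and-counter bookkeeping follows. 
The class and function names, and the "q0", "q1" ID format, are 
assumptions made for illustration; they are not taken from Sword or any 
existing converter.

```python
from xml.sax.saxutils import quoteattr


class MilestoneTracker:
    """One stack plus one counter for a single kind of milestoned element."""

    def __init__(self, prefix):
        self.prefix = prefix  # hypothetical ID prefix, e.g. "q" or "seg"
        self.counter = 0      # next unused ID number
        self.stack = []       # ID numbers of elements that are still open

    def open(self):
        """Push the current counter, increment it, return (id, depth)."""
        self.stack.append(self.counter)
        self.counter += 1
        return f"{self.prefix}{self.stack[-1]}", len(self.stack)

    def close(self):
        """Pop the most recently opened ID and return (id, depth)."""
        if not self.stack:
            raise ValueError(f"close without matching open for '{self.prefix}'")
        depth = len(self.stack)  # depth before popping equals the opener's depth
        return f"{self.prefix}{self.stack.pop()}", depth

    def finished(self):
        """True when every opened element has been closed."""
        return not self.stack


def start_quote(tracker, marker):
    sid, level = tracker.open()
    return f'<q marker={quoteattr(marker)} level="{level}" sID="{sid}"/>'


def end_quote(tracker, marker):
    eid, level = tracker.close()
    return f'<q marker={quoteattr(marker)} level="{level}" eID="{eid}"/>'


if __name__ == "__main__":
    quotes = MilestoneTracker("q")
    print(start_quote(quotes, "\u201c"))
    print(start_quote(quotes, "\u2018"))
    print(end_quote(quotes, "\u2019"))
    print(end_quote(quotes, "\u201d"))
    # A non-empty stack at the end of the document indicates a bug.
    assert quotes.finished()
```

Because the counter only ever increases, each sID/eID pair gets a value 
that is never reused, and the stack depth gives the nesting level.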

> \wj -> <q who="Jesus" marker="" sID="someid2"/>
> \wj* -> <q who="Jesus" marker="" eID="someid2"/>

With the WoC, I would ask, selfishly, that you use the container form of 
<q>, that is
<q who="Jesus">...</q>

\wj -> <q who="Jesus">
\wj* -> </q>

JSword cannot handle the milestoned form at this time.
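
For illustration, the container mapping can be sketched in Python as 
below. The function name convert_wj is hypothetical, and the sketch 
assumes every \wj has a matching \wj* within the same verse, so that the 
resulting <q> elements nest properly.

```python
import re


def convert_wj(usfm_text):
    """Turn \\wj ... \\wj* spans into container <q who="Jesus">...</q> elements."""
    # Handle the closing marker first so that the opening pattern below
    # cannot consume part of a "\wj*".
    text = re.sub(r"\\wj\*", "</q>", usfm_text)
    # The opening marker is followed by whitespace, which belongs to the
    # marker and is dropped along with it.
    return re.sub(r"\\wj(?:\s+|$)", '<q who="Jesus">', text)


if __name__ == "__main__":
    print(convert_wj("\\v 19 He said, \\wj Follow me.\\wj*"))
```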

> << -> <q marker="“" sID="x"/> (unless at the beginning of a paragraph
> with an unended quotation in progress, then <milestone type="cQuote"
> marker="“"/>
> >> -> <q marker="”" eID="x"/>
> < -> <q marker="‘" sID="y"/> (unless at the beginning of a paragraph
> with an unended quotation in progress, then <milestone type="cQuote"
> marker="‘"/>
> > -> <q marker="’" eID="y"/>
> I think at this point, my best option to convert quotation punctuation
> found in the text (not as << type markup) is to just leave it in the
> text and not try to disambiguate apostrophes. It should display properly
> anyway. 

For a general-purpose converter, which must deal with apostrophes whose 
meanings differ according to the language and the text, it is 
reasonable not to disambiguate them.

> Since the q elements generated from WoC (\wj) markup will never
> span verses in this implementation, but the actual quote often does, it
> is probably better to not combine the two resulting q elements at the
> beginning and end of the quotation into one q element, because then the
> start/end points wouldn't line up properly for one or the other of the
> meanings of the element (quotation start/end marking vs. text coloration).

Right. The two cannot be combined, precisely because they carry two 
different semantics, differing in markup and in meaning.

There are some instances where an "island" of text (a gloss or a 
parenthetical statement by the book's author) sits inside the WoC. It 
does not force the quote to end and begin again around it, but it does 
need to be distinguished from the quote.

> Does that make sense?

It makes sense.

> Chris Little wrote:
>>>> There's always <hi type="small-caps">. :) (Um... not that I would ever 
>>>> recommend such heresy as taking semantic markup from USFM and turning it 
>>>> into presentation in OSIS.
> Of course not. ;-)
