[sword-devel] Sword OSIS quotation mark handling question
dmsmith555 at yahoo.com
Mon Apr 30 10:09:36 MST 2007
Kahunapule Michael Johnson wrote:
> How does the Sword project handle display of OSIS text quotations when:
> 1. the <q> or <speech> element is used without a marker attribute,
The speech element is not handled, except to process its content. It is
as if the element were not in the text at all. I think the speech
element is to indicate the speaker, not that what's said is a quote. I
won't mention the element <speech> below.
Assuming that the module's conf does not have osisQToTick=false (i.e. it
defaults to true when not present), then the level attribute determines
the quotation mark that will be used, alternating double quote and then
single quote. If no level attribute is present, then it uses a double quote.
It will use the same mark when it gets to </q>.
The same holds true when milestoned versions of <q> are used, except
that <q eID="xxx"/> elements will not cause the code to look at the
opening <q sID="xxx"/> for a marker attribute. Instead, it will use its own
marker attribute, or the lack of one, to determine what to output.
However, if osisQToTick=false, no quotation mark is used.
> 2. the <q> or <speech> element is used with a marker attribute,
When the marker attribute is present, it is used.
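
As a rough illustration of the rules described above (this is not the Sword filter code itself), the choice of mark can be sketched in Python. The function name quote_mark and the dict of attributes are made up for the example. It assumes that odd levels map to the double quote and even levels to the single quote, and that an explicit marker is still honored when osisQToTick=false; neither detail is spelled out above.

```python
def quote_mark(attrs, osis_q_to_tick=True):
    """Return the text emitted for one <q> start or end tag (hypothetical helper)."""
    # An explicit marker attribute is used as is, even when empty (marker="").
    # Assumption: this also holds when osisQToTick=false.
    if "marker" in attrs:
        return attrs["marker"]
    # No marker and osisQToTick=false: no quotation mark is produced.
    if not osis_q_to_tick:
        return ""
    # A missing level attribute behaves like the outermost level.
    level = int(attrs.get("level", 1))
    # Assumption: odd levels give a double quote, even levels a single quote.
    return '"' if level % 2 == 1 else "'"


# Milestoned pair: the end tag is looked at on its own, so a marker given
# only on the start tag is not repeated at the end.
start = quote_mark({"sID": "q1", "marker": "\u00ab"})
end = quote_mark({"eID": "q1"})
```

With these inputs, start is the « character and end falls back to the default double quote, which is why a milestoned end tag should carry its own marker attribute when a specific closing mark is wanted.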
> 3. no <q> or <speech> elements appear, or
Then, as far as Sword is concerned, the text is not in a quote.
> 4. quotation punctuation (“, ‘, ’, ”, «, », —, newline, etc.) appears
> outside of <q> or <speech> elements (i. e., not in a marker attribute)?
Any punctuation in the text is produced as is.
Another feature of OSIS is <milestone type="cQuote" marker="xxxx"/>,
which is used for a continuation quote (replace xxxx with the
appropriate quote mark).
Words of Christ (WoC) can be indicated by adding who="Jesus" to the <q>
container element or to both milestone elements.
In the KJV, ESV and upcoming NASB modules, the WoC are marked on a per
verse basis, using the container form of <q>, with marker="".
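
To make the per-verse WoC markup concrete, here is a small Python sketch that picks out the words of Christ from a fragment. The fragment is an illustration written for this example, not copied from any module, and it omits the OSIS namespace that a real file would declare. The helper name words_of_christ is also made up.

```python
import xml.etree.ElementTree as ET

# Illustrative fragment only; a real OSIS document declares the OSIS
# namespace, which is left out here to keep the sketch short.
VERSE = (
    '<verse osisID="Matt.4.19">'
    'And he saith unto them, '
    '<q who="Jesus" marker="">Follow me, and I will make you fishers of men.</q>'
    '</verse>'
)


def words_of_christ(fragment):
    """Return the text of every <q who="Jesus"> element in the fragment."""
    root = ET.fromstring(fragment)
    return ["".join(q.itertext()) for q in root.iter("q") if q.get("who") == "Jesus"]


spoken = words_of_christ(VERSE)
```

Because marker="" is present but empty, no quotation mark is generated for the element; the who attribute only identifies the speaker.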
> I want to (1) ensure that Bible texts are displayed correctly, and (2)
> minimize the amount of manual labor necessary to make #1 happen.
> It should not be necessary to do any manual editing of Bible source
> texts in well-formed Unicode USFM to create a valid Sword module. (USFM
> or something close to it is the format in which a very large number of
> minority-language Bibles exist.) In USFM, quotation punctuation, if any,
> is in the text of the document, with no special markup. In an informal
> extension to USFM, sometimes << is used for “, < for ‘, etc. (A space is
> required to disambiguate “‘ and ‘“.) Speaking of ambiguity, apostrophe,
> closing single quote, and (in some languages) glottal stop all use the
> same character. This ambiguity, coupled with language and style
> considerations, seems to be a serious problem in automatically
> converting from either GBF or USFM to OSIS, in general.
I have recently written a quote recognizer in C++. I did find that an
apostrophe is potentially ambiguous, but in the source I was working with, it
was not an issue.
Fortunately, my input uses ` for a single quote start and ' for an end
quote. This made disambiguation significantly easier.
If you wish, I can send you the routine.
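
For illustration only, and not the C++ routine mentioned above, the disambiguation idea can be sketched in Python. It assumes input where ` always opens a single quote, and it uses a simple heuristic of my own choosing: a ' with a letter on both sides is an apostrophe, anything else is a closing quote. A word-final apostrophe (as in a plural possessive) would be misread as a closing quote, so this is only a starting point.

```python
def classify_ticks(text):
    """Label each ` and ' in text as 'open', 'close', or 'apostrophe'."""
    labels = []
    for i, ch in enumerate(text):
        if ch == "`":
            # A backtick is unambiguous in this input convention.
            labels.append((i, "open"))
        elif ch == "'":
            before = text[i - 1] if i > 0 else ""
            after = text[i + 1] if i + 1 < len(text) else ""
            # Heuristic (assumption): letters on both sides mean an apostrophe.
            if before.isalpha() and after.isalpha():
                labels.append((i, "apostrophe"))
            else:
                labels.append((i, "close"))
    return labels


kinds = [kind for _, kind in classify_ticks("He said, `Don't go.'")]
```

Here kinds comes out as open, apostrophe, close, which is the information a converter would need before emitting <q> elements.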
> I'm wondering if I should target OSIS or GBF as a target format for a
> converter I'm writing, and also working on updating the dialect of OSIS
> that the World English Bible and HNV are distributed in. While I'm not
> in favor of dropping support for GBF, yet, I'm not very thrilled about
> the idea of putting any new work into supporting it, either. However, if
> I can't make an OSIS module without a lot of manual labor, any
> reasonable alternative is worth considering.
Remembering your earlier posts about OSIS's lack of quotation support, I
think I can now say that it provides you the level of control that you
wish. Having done three modules myself, I think that OSIS 2.1.1 is
sufficient for Bible texts.
So, I'd suggest OSIS.