[sword-devel] XML attribute delimiters in OSIS files?

DM Smith dmsmith at crosswire.org
Wed Oct 26 09:38:35 MST 2011

On 10/26/2011 09:47 AM, Peter von Kaehne wrote:
> Is there any actual credible reason for having quotation marks in attributes? I agree that it may be grammatically correct for XML as such, but OSIS's attributes are defined and do not contain quotation marks. And x-marked attributes are largely thrown out during the osis2mod run, no? Or at least ignored - apart from our own - like x-preverse.
> Peter

I had never spent the time to look at the allowable attribute values in 
an OSIS document. Having now looked at the schema, I see that it does 
allow quotation marks inside attribute values. See below for details.

I think there are many good reasons that a single quote will be found in 
an attribute value. Many languages use it for other things than quoting.

I can think of only a few, probably obscure, reasons for a double quote 
to be there, e.g. chapterTitle='xxx aka "yyy"', who='James "Jimmy" 
Smith', ...

Osis2mod *should* accept all well-formed, valid (both syntactically 
and semantically) OSIS documents. Regarding quoting attribute values, 
the recommendation still stands: use double quotes if at all possible, 
and avoid &quot; and &apos; too. (Note that these entities are only 
needed within attribute values and never elsewhere in the text.)
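
To illustrate the point about delimiters, here is a minimal sketch in 
Python. It is not the SWORD parser and not osis2mod; it uses the standard 
library's xml.etree.ElementTree as a stand-in for a conforming XML parser. 
The element and attribute names come from the schema, but the sample 
values and the helper name attr_values are made up for illustration. It 
shows that the delimiter is not part of the value, so an sID in double 
quotes and an eID in single quotes should compare equal, and that the 
entities are only needed when a value contains its own delimiter.

```python
import xml.etree.ElementTree as ET

# Hypothetical OSIS-like fragment: names are from the schema,
# values are invented for this illustration.
SAMPLE = """<div>
  <chapter chapterTitle='xxx aka "yyy"'/>
  <speaker who="James &quot;Jimmy&quot; Smith"/>
  <name regular="O'Brien"/>
  <verse sID="Gen.1.1"/>
  <verse eID='Gen.1.1'/>
</div>"""


def attr_values(xml_text):
    """Return (tag, attribute name, value) triples in document order."""
    root = ET.fromstring(xml_text)
    return [(el.tag, name, value)
            for el in root.iter()
            for name, value in el.attrib.items()]


# The parser strips the delimiters and resolves the entities.
for tag, name, value in attr_values(SAMPLE):
    print(tag, name, repr(value))
```

A parser that hands back the raw text including the delimiters would see 
the last two values as different strings, which would fit the sID/eID 
mismatch described in the thread.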

(Below I'm using x at y to mean element x with attribute y.)

In looking at this, I think there are some bugs in the definition of 
l at type, lg at type, and rdg at type.

In Him,

Here are the attributes that allow for arbitrary text:
actor at who
<xs:attribute name="who" type="xs:string" use="optional"/>
contributor at file-as
<xs:attribute name="file-as" type="xs:string" use="optional"/>
a at href
<xs:attribute name="href" type="xs:string" use="required"/>
abbr at expansion
<xs:attribute name="expansion" type="xs:string" use="optional"/>
chapter at chapterTitle
<xs:attribute name="chapterTitle" type="xs:string" use="optional"/>
figure at alt, @catalog, @location, @rights, @size, @src
<xs:attribute name="alt" type="xs:string" use="optional"/>
<xs:attribute name="catalog" type="xs:string" use="optional"/>
<xs:attribute name="location" type="xs:string" use="optional"/>
<xs:attribute name="rights" type="xs:string" use="optional"/>
<xs:attribute name="size" type="xs:string" use="optional"/>
<xs:attribute name="src" type="xs:string"/>
index at index, @level1, @level2, @level3, @level4, @see
<xs:attribute name="index" type="xs:string" use="required"/>
<xs:attribute name="level1" type="xs:string" use="required"/>
<xs:attribute name="level2" type="xs:string" use="optional"/>
<xs:attribute name="level3" type="xs:string" use="optional"/>
<xs:attribute name="level4" type="xs:string" use="optional"/>
<xs:attribute name="see" type="xs:string" use="optional"/>
item at role
<xs:attribute name="role" type="xs:string" use="optional"/>
label at role
<xs:attribute name="role" type="xs:string" use="optional"/>
milestone at marker
<xs:attribute name="marker" type="xs:string" default="DEFAULT" 
milestoneEnd at start
<xs:attribute name="start" type="xs:string" use="required"/>
milestoneStart at end
<xs:attribute name="end" type="xs:string" use="required"/>
name at regular
<xs:attribute name="regular" type="xs:string" use="optional"/>
q at level, @marker, @who
<xs:attribute name="level" type="xs:string" use="optional"/>
<xs:attribute name="marker" type="xs:string" default="DEFAULT" 
<xs:attribute name="who" type="xs:string" use="optional"/>
speaker at who
<xs:attribute name="who" type="xs:string" use="optional"/>
speech at marker
<xs:attribute name="marker" type="xs:string" default="DEFAULT" 
title at short
<xs:attribute name="short" type="xs:string" use="optional"/>
w at gloss, @src, @xlit
<xs:attribute name="gloss" type="xs:string" use="optional"/>
<xs:attribute name="src" type="xs:string" use="optional"/>
<xs:attribute name="xlit" type="xs:string" use="optional"/>
Globally (globalWithType, globalWithoutType)
@annotateWork, @resp, @n
<xs:attribute name="annotateWork" type="xs:string" use="optional"/>
<xs:attribute name="resp" type="xs:string" use="optional"/>
<xs:attribute name="n" type="xs:string" use="optional"/>
Milestone attributes
@sID, @eID
<xs:attribute name="sID" type="xs:string" use="optional"/>
<xs:attribute name="eID" type="xs:string" use="optional"/>
osisID, osisRef, osisAnnotateType regexes allowing quotation marks: 
(look for [^...] constructs)
<xs:pattern value="((((\p{L}|\p{N}|_)+)(\.(\p{L}|\p{N}|_))*:)?([^:\s])+)"/>
Attribute extension regex:
<xs:pattern value="x-([^\s])+"/>
l at type
<xs:union memberTypes="osisLine attributeExtension xs:string"/>
lg at type
<xs:union memberTypes="osisLineGroup attributeExtension xs:string"/>
<xs:simpleType name="osisLineGroup">
<xs:restriction base="xs:string">
<!-- <xs:enumeration value="doxology"/> -->
rdg at type
<xs:union memberTypes="osisRdg attributeExtension xs:string"/>
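
As a rough check of the attribute extension pattern listed above, here is 
a small Python sketch. It relies on the XML Schema rule that a pattern 
must match the whole value, which re.fullmatch approximates; Python's \s 
covers a few more whitespace characters than the schema's, so treat it as 
an approximation. The sample values and the helper name is_extension are 
made up. The osisID pattern is not reproduced here because Python's re 
module does not support \p{L}.

```python
import re

# Attribute extension pattern copied from the schema. XSD patterns are
# implicitly anchored, so fullmatch is used rather than search.
EXTENSION = re.compile(r"x-([^\s])+")


def is_extension(value):
    """True if the whole value matches the extension pattern."""
    return EXTENSION.fullmatch(value) is not None


# Invented sample values: quotation marks pass, whitespace does not.
samples = ["x-preverse", "x-it's", 'x-"quoted"', "x-two words", "preverse"]
for s in samples:
    print(repr(s), is_extension(s))
```

Since [^\s] excludes only whitespace, both kinds of quotation mark are 
accepted after the x- prefix.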

> -------- Original Message --------
>> Date: Wed, 26 Oct 2011 08:59:14 -0400
>> From: DM Smith<dmsmith at crosswire.org>
>> To: SWORD Developers' Collaboration Forum<sword-devel at crosswire.org>
>> Subject: Re: [sword-devel] XML attribute delimiters in OSIS files?
>> Ah, now I understand. This is a bug. And should be fixed. (BTW, not having
>> the entire thread reproduced in each email makes it harder to understand
>> the context of the email. I don't like having to go digging for the context.
>> Having looked, I see that the first email in the thread defines
>> delimiters.)
>> But I'm not sure where it should be fixed. I haven't looked at the code,
>> but as I recall, we use the SWORD parser to obtain the attribute value. My
>> guess is that it is returning it with the quotes. If the problem is there
>> and we fix it there, it may break a whole host of other things. (This parser
>> is not a true XML parser, but one that is highly optimized for speed and
>> thus we work with its definition.)
>> It should be easy to change osis2mod to work. I'll look into doing this
>> soon.
>> That said, it is and has been the recommendation that double quotes be
>> used to wrap attribute values. It is valid to use single quotes, but it may
>> (does) expose bugs. Fixing this bug does not change this recommendation.
>> Until osis2mod has been changed and the fix is available, it is advisable
>> to change the input so that the quoting of sID/eID pairs is identical.
>> In Him,
>> 	DM
>> On Oct 26, 2011, at 6:38 AM, David Haslam wrote:
>>> Mixing double and single quotes, as per earlier messages in this thread.
>>> Example (minus the chaff):
>>> sID="reference"
>>> .....
>>> eID='reference'
>>> But this time for the same verse, just as Chris replied, rather than in
>>> completely separate OSIS elements.
>>> As this is just an observation, I see no immediate need to give a
>>> detailed example of what happens to the module.
>>> To locate the places where I spotted it yesterday would take some time.
>>> Perhaps the most interesting thing is that there was no error message
>>> from osis2mod.
>>> And I agree with Chris, the OSIS needs fixing first, before using as
>>> input for osis2mod.
>>> David
>>> --
>>> View this message in context:
>>> http://sword-dev.350566.n4.nabble.com/XML-attribute-delimiters-in-OSIS-files-tp3907261p3940110.html
>>> Sent from the SWORD Dev mailing list archive at Nabble.com.
>>> _______________________________________________
>>> sword-devel mailing list: sword-devel at crosswire.org
>>> http://www.crosswire.org/mailman/listinfo/sword-devel
>>> Instructions to unsubscribe/change your settings at above page
