[osis-core] Lists in Attribute values: final call

Todd Tillinghast osis-core@bibletechnologieswg.org
Mon, 20 Oct 2003 10:47:06 -0600


Patrick,

Reasons to use xsd:list:
1) We already use an xsd:list for osisID, osisRef, and annotateRef, and a
"|" separator would introduce a second mechanism.
2) If we use "|" as a list separator in order to accommodate identifier
values that include whitespace, we will not be able to use those
identifiers in annotateRef, which remains a whitespace-delimited list.
3) We should anticipate general support for xsd:list by parsers in the
near future, but should never expect parsers to support a "|"-separated
list of identifiers.
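
To make point 2 concrete, here is a minimal Python sketch of the two
splitting rules. The function names are hypothetical, and the "pld:"
values are the examples from Patrick's message below; this is an
illustration, not a proposed implementation.

```python
def split_xsd_list(value: str) -> list[str]:
    """Tokenize the way an xsd:list is read: on runs of whitespace."""
    return value.split()


def split_pipe_list(value: str) -> list[str]:
    """Split on "|" only, so an item may itself contain spaces."""
    return [item.strip() for item in value.split("|")]


# An identifier containing a space survives a "|" split ...
print(split_pipe_list("pld:123|pld:123 567"))  # ['pld:123', 'pld:123 567']

# ... but the same identifier placed in a whitespace-delimited list
# (as annotateRef is) falls apart into two tokens.
print(split_xsd_list("pld:123 567"))  # ['pld:123', '567']
```

The second call shows why an identifier with whitespace could not be
carried over into annotateRef unchanged.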

Todd


> -----Original Message-----
> From: osis-core-admin@bibletechnologieswg.org [mailto:osis-core-admin@bibletechnologieswg.org] On Behalf Of Patrick Durusau
> Sent: Monday, October 20, 2003 8:34 AM
> To: osis-core@bibletechnologieswg.org
> Subject: [osis-core] Lists in Attribute values: final call
> 
> Greetings!
> 
> Well we have spilled a lot of ink, errr, electrons on this one!
> 
> At the heart of the dispute seems to me to be how one declares and
> treats lists in XML attribute values.
> 
> From an XML standpoint, it is really quite simple: if you want a list
> in an attribute value, it is a space-delimited list, and that excludes
> any values in the list that have spaces. End of discussion.
> 
> On the other hand, the ban on white space in the values is an arbitrary
> limitation of XML lists, which may not conform to the data that we wish
> to store in such lists.
> 
> Now the argument can be made (and has been made) that we can reform the
> values that are to be placed in such lists (substitute underscores,
> etc.) for the values as seen by a user entering the text.
> 
> The major problem with the reformation argument is that I tend to type
> what I am familiar with more accurately and consistently than I do if I
> try to conform to an unfamiliar practice. Even when I know I should be
> using an underscore or some other character, I will slip, and if the
> prefix is optional, there is no XML error to alert me to the error.
> (That is, if pld:123 is valid and pld:123_567 is valid, pld:123 567
> should not be: I don't have a prefix on 567, and actually there should
> not be one because I really meant pld:123_567.)
> 
> Now, using that same example, I can also write a list as
> "pld:123|pld:123 567" because I am not using the XML list mechanism and
> can have spaces, so long as the separator does not otherwise appear in
> the string.
> 
> I can even validate that expression by requiring the "|" symbol between
> the parts of the list, thus:
> 
> <xs:pattern value="(((\p{L}|\p{N}|_)+)(\.(\p{L}|\p{N}|_))?):((\p{L}|\p{N}|\p{Zs}|_|\.|\-)*)?(\|(((\p{L}|\p{N}|_)+)(\.(\p{L}|\p{N}|_))?):((\p{L}|\p{N}|\p{Zs}|_|\.|\-)*)?)?"/>
> 
> Yeah, ugly isn't it?
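
For anyone who wants to try the pattern quoted above outside a schema
validator, here is a rough Python sketch. It is an approximation, not a
faithful port: Python's standard re module has no \p{...} classes, so
\w stands in for \p{L}|\p{N}|_ and a plain space stands in for \p{Zs}.
The names ITEM, PIPE_LIST, and is_valid are hypothetical. XSD patterns
match the whole value, hence fullmatch.

```python
import re

# One list item: prefix, optional "." plus a single character, ":", body.
# ASSUMPTION: \w approximates \p{L}|\p{N}|_ and " " approximates \p{Zs}.
ITEM = r"\w+(?:\.\w)?:[\w .\-]*"

# As in the quoted pattern, the second item is optional (a trailing "?").
PIPE_LIST = re.compile(rf"{ITEM}(?:\|{ITEM})?")


def is_valid(value: str) -> bool:
    """Return True if the whole attribute value matches the pattern."""
    return PIPE_LIST.fullmatch(value) is not None


print(is_valid("pld:123|pld:123 567"))  # True
print(is_valid("pld:123|567"))          # False: second item has no prefix
print(is_valid("pld:1|pld:2|pld:3"))    # False: at most two items match
```

The last line reflects the trailing "?" in the pattern as posted, which
admits only one "|"; a longer list would presumably need "*" there.
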
> 
> The point of all this being that we are faced with two ways to handle
> lists in attribute values:
> 
> 1. XML list (white space delimited)
> 
> 2. Delimited by some other separator (in the example, the pipe "|" sign)
> 
> Either way, the list must be processed by software to do more than find
> that something is in the list. So the question is: Does it really make
> any difference to an application whether it splits on the "|" or on
> white space?
> 
> My sympathies are with the XML method but I do now know that there are
> POS values (in modern Hebrew) that do have spaces.
> 
> Could take the path of saying that data has to be reformed to meet our
> specifications but that introduces user error.
> 
> Where I am coming out on this is that I don't see the benefit of
> following the whitespace protocol of the XML standard. The list won't be
> processed meaningfully by an XML parser anyway, so I am not sure what
> that gets us for these cases.
> 
> Note that I am aware of the uses of list where you have an enumerated
> set of values to validate against an attribute value restriction, but so
> far as I know, no one has proposed such a set for any of these
> attributes. That would be a case for making it a list, but I would be
> real leery of saying that everyone had to use our names for their
> linguistic categories.
> 
> Got to run, have to eat my snack and jump into a conference call on
> OpenOffice.
> 
> Will try to make the rounds this afternoon so we can get back on
> schedule.
> 
> Hope everyone is in good health and spirits!
> 
> Patrick
> 
> --
> Patrick Durusau
> Director of Research and Development
> Society of Biblical Literature
> Patrick.Durusau@sbl-site.org
> Chair, V1 - Text Processing: Office and Publishing Systems Interface
> Co-Editor, ISO 13250, Topic Maps -- Reference Model
> 
> Topic Maps: Human, not artificial, intelligence at work!
> 