[osis-core] OSIS work regex

Patrick Durusau osis-core@bibletechnologieswg.org
Wed, 14 Aug 2002 09:30:00 -0400


Harry,

Harry Plantinga wrote:

>If schema RegExps behave as they do in Perl, the ? is superfluous.
>Perhaps
>
>  [\L\N][\.\L\N]* 
>
Unfortunately, XML Schema regular expressions are deliberately 
inconsistent with customary regex syntax. The "?" operator is at least 
familiar from SGML DTD syntax, but the use of "^" as negation, for 
example, is contrary to its usual role as an anchor (the reasoning being 
that in XML Schema, matches are always anchored at the start of the 
line. Not sure why that justifies being inconsistent, but there you 
have it).
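
A small sketch of the two roles of "^", using Python's re module as a 
stand-in. This is an assumption for illustration only: Python is not an 
XML Schema processor, and re.fullmatch is used here merely to imitate 
the implicit anchoring of schema patterns. The helper name 
schema_like_match is hypothetical.

```python
import re

def schema_like_match(pattern: str, value: str) -> bool:
    """Imitate implicit anchoring: the whole value must match the pattern."""
    return re.fullmatch(pattern, value) is not None

# Outside a character class, Perl-style "^" anchors the match at the start,
# so searching for "^abc" in "xabc" finds nothing.
anchor_rejects = re.search(r"^abc", "xabc") is None

# Inside a character class, "^" negates the class: [^.] is "anything but a period".
no_period = schema_like_match(r"[^.]+", "Bible")
with_period = schema_like_match(r"[^.]+", "Bible.KJV")

# With whole-value matching, no explicit anchor is needed to reject extra text.
partial = schema_like_match(r"abc", "xabc")
```

With whole-value matching, "^" has no work left to do as an anchor, 
which is the reasoning given for using it only as negation.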

>
>The underscore character (_) is pretty commonly used in names and may be
>present in documents converted to OSIS. I can't see that it would do any
>harm. Could it be included?  Perhaps 
>
> [\L\N_][\.\L\N_]*
>
Actually I think:

[\p{L}\p{N}_][\.\p{L}\p{N}_]*

(untested)
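
One reading of the proposal, in which a name starts with a letter, 
number, or underscore and continues with any mix of those characters 
and periods, can be tried out roughly as follows. This is only a sketch 
under stated assumptions: Python's re module does not support \p{L} or 
\p{N}, so \w (Unicode letters, digits, and underscore) is used as an 
approximate stand-in, and the name is_work_name is hypothetical.

```python
import re

# Approximate stand-in for [\p{L}\p{N}_][\.\p{L}\p{N}_]*
# Assumption: \w is close to, but not identical with, [\p{L}\p{N}_].
WORK_NAME = re.compile(r"\w[.\w]*")

def is_work_name(value: str) -> bool:
    """Return True if the whole value matches, as a schema pattern would require."""
    return WORK_NAME.fullmatch(value) is not None

samples = ["Bible.KJV", "my_work", "Bible.", ".KJV", "Bible-KJV", ""]
results = {s: is_work_name(s) for s in samples}
```

Under this reading a leading period, a hyphen, and the empty string are 
rejected, while a trailing period is still accepted. Whether that last 
case is wanted is a separate question.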

Steve: comments on adding the underscore?

Patrick

>
>-Harry
>
>----------------------------------
>For the work portion:
>
><xs:pattern value = "([\L\N\.]([\L\N\.]*)?)" />
>
>By which I am trying to say, any letter or number combination, followed 
>by a period is compulsory, followed by any number of optional 
>letter/number combinations that also end in a period (periods, hyphens, 
>etc., being excluded from the work name).
>

-- 
Patrick Durusau
Director of Research and Development
Society of Biblical Literature
pdurusau@emory.edu