[osis-core] Regex syntax proposal and question

Kirk Lowery osis-core@bibletechnologieswg.org
Tue, 02 Jul 2002 10:13:29 -0400


A comment from a lurker about the regexes:

Patrick, does XML Schema say anything about the flavor of regex one must 
use? Can you use the Perl construct "/regex/x", which allows whitespace 
and comments inside the regex itself, or else the (?#...) notation?

The reason I ask is that the complexity of the regexes I'm reading makes 
documentation and interpretation difficult for the implementer, as well 
as debugging of the schema itself.

I guess this is a plea that for the eventual documentation at least, a 
Perl-style presentation be given with comments so that a newcomer (like 
myself :-) can read and eventually follow the regexes...
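
To illustrate the kind of presentation I mean, here is a rough sketch. It is 
in Python rather than Perl (Python's re.VERBOSE flag plays the role of /x), 
and it covers only the start of the osisRef pattern quoted below: the 
leading reference and the optional "@cp:" part. The "x-" extension and the 
"-" range are left out. The names OSIS_REF_START and matches are made up for 
this example, and the comments are only my reading of the pattern, not an 
agreed description.

```python
import re

# Partial, commented rendering of the osisRef pattern quoted in this thread.
# Hypothetical name; covers only the leading reference and the "@cp:" part.
OSIS_REF_START = re.compile(
    r"""
    (                       # group 1: the reference itself
        ([^\s]*\.){0,6}     #   up to six dot-terminated tokens
        ([^\s]*)            #   final token
    )
    (                       # group 4: optional "@cp:" part
        @cp:
        (\d*)               #   digits following "cp:"
        (\+(\d*))?          #   optional "+digits"
        \((.*)\)            #   arbitrary text inside parentheses
    )?
    """,
    re.VERBOSE,
)


def matches(ref: str) -> bool:
    # XML Schema patterns must match the whole value, so use fullmatch
    # rather than search.
    return OSIS_REF_START.fullmatch(ref) is not None
```

With this form, each piece of the pattern can be read and tested on its own, 
for example by calling matches("Matt.1.1") or matches("Matt 1").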

Blessings,

Kirk

Patrick Durusau wrote:
> Harry,
> 
> Harry Plantinga wrote:
> 
>>> <xs:simpleType name="osisRef">
>>>     <xs:restriction base="xs:string">
>>>         <xs:pattern
>>> value="(([^\s]*\.){0,6}([^\s]*))(@((cp:(\d*)(\+(\d*))?\((.*)\))|((
>>> x-(\c*):)(.*)\((.*)\))))?((-(([^\s]*\.){0,6}([^\s]*)))?|(-(([^\s]*
>>> \.){0,6}([^\s]*)))(@((char:(\d*)\+(\d*)\((.*)\))|((x-(\c*):)(.*)\(
>>> (.*)\))))?)?"/>
>>>     </xs:restriction>
>>> </xs:simpleType>
>>>
>>
>> You probably don't want to use [^\s]* for a token.  That would allow
>> characters like @, (, . in tokens.
>>
> So you would propose [\c]* (removing the negation from ^\s)? (Allows all 
> valid XML name characters)
> 
> 
>>
>>> As modified per Steve's last post on its syntax (char changed to cp,
>>> range made optional). Warning: on-the-fly editing.
>>>
>>
>> Also, I think Steve said he liked the idea of searching forward from
>> the cp location to the start of the string match.
>>
> Not sure what you are saying here in terms of changing the regex syntax. 
> The proposed syntax follows; can you edit it to illustrate?
> 
> <xs:pattern 
> value="(([\c]*\.){0,6}([\c]*))(@((cp:(\d*)(\+(\d*))?\((.*)\))|((x-(\c*):)(.*)\((.*)\))))?((-(([\c]*\.){0,6}([\c]*)))?|(-(([\c]*\.){0,6}([\c]*)))(@((char:(\d*)\+(\d*)\((.*)\))|((x-(\c*):)(.*)\((.*)\))))?)?"/> 
> 
> 
> Patrick
> 


-- 
Kirk E. Lowery, Ph.D.
Director, Westminster Hebrew Institute
Adjunct Professor of Old Testament
Westminster Theological Seminary, Philadelphia