[osis-core] Regex Solved! ????

Patrick Durusau osis-core@bibletechnologieswg.org
Fri, 31 May 2002 07:15:00 -0400


--------------050105050004010706010803
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit

Guys,

Have attached a tiny schema and test file for the regular expression. 
Notes are in XML comments (Steve, should we still call <!-- --> SGML 
comments? ;-) Just something to annoy the W3C fanatics.)

Looks like it validates the ref/grain syntax that Steve originated. Note 
that it allows either a character offset or whatever other grain-pointing 
mechanism (read XPath/XPointer) you wish to use.

(Hate to mention this, but should we also give the ref part the same 
ability? Don't think it would be that hard to add.)
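
For anyone who wants to poke at the pattern without a schema validator, here is a rough Python sketch. It copies the pattern from regex.xsd and the seven values from regex.xml. Two assumptions: Python's re module has no \c escape, so \c is approximated here by [\w.:-], and re.fullmatch stands in for the implicit anchoring of XSD patterns. The helper names (validates, is_grain) are made up for illustration and are not part of the schema.

```python
import re

# Pattern copied from the osisRef simpleType in regex.xsd.
XSD_PATTERN = (
    r"(([^\s]*\.){0,6}([^\s]*))"
    r"(@((char:(\d*)\+(\d*)\((.*)\))|((x-(\c*):)(.*)\((.*)\))))?"
    r"((-(([^\s]*\.){0,6}([^\s]*)))?"
    r"|(-(([^\s]*\.){0,6}([^\s]*)))"
    r"(@((char:(\d*)\+(\d*)\((.*)\))|((x-(\c*):)(.*)\((.*)\))))?)?"
)

# The grain-only pattern from the "second part" comment in regex.xsd.
XSD_GRAIN = r"@((char:(\d*)\+(\d*)\((.*)\))|((x-(\c*):)(.*)\((.*)\)))"

# Assumption: XSD's \c (XML name characters) is approximated by [\w.:-],
# because Python's re module does not support \c.
PY_PATTERN = XSD_PATTERN.replace(r"\c", r"[\w.:-]")
PY_GRAIN = XSD_GRAIN.replace(r"\c", r"[\w.:-]")


def validates(value: str) -> bool:
    """True if the whole value matches the combined ref/grain pattern."""
    return re.fullmatch(PY_PATTERN, value) is not None


def is_grain(value: str) -> bool:
    """True if the whole value matches the grain-only pattern."""
    return re.fullmatch(PY_GRAIN, value) is not None


# The seven <p> values from regex.xml.
SAMPLES = [
    "Matt.1.1-Matt.1.3",
    "Matt.1.2@123+134(logos)",
    "Matt.1.5@123+134(logos)-Matt.1.6",
    "Matt.1.5@123+134(logos)-Matt.1.6@234+236(Uriah)",
    r"Matt.1.5@x-xpath:\text\div\p\line(Asaph)",
    r"Matt.1.5@x-xpath:\text\div\p\line(Asaph)-Matt.1.6",
    r"Matt.1.5@x-xpath:\text\div\p\line(Uriah)-Matt.1.6@x-xpath:\text\div\p\line(Asaph)",
]

for sample in SAMPLES:
    print(sample, validates(sample))

# Probes that are not in the test file.
print(validates("Matt 1.1"))               # whitespace, no grain
print(is_grain("@123+134(logos)"))         # grain without the "char:" prefix
print(is_grain("@char:123+134(logos)"))    # grain with the "char:" prefix
```

Under those assumptions all seven test values match, but so does any whitespace-free string, because [^\s]* also accepts "@", "-", and parentheses and so ref1 can swallow the whole value. That appears to be why the test values written as @123+134(...) validate even though the grain alternative expects a "char:" prefix. Narrowing the character class used for the ref parts might be one way to make the grain checks bite.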

Todd: Going through the latest version of the schema, which will include 
the new regex and (sigh) <seg>. I hate it when people convince me to 
change my opinion! ;-) D**ned use cases! Can't see what anybody sees in 
them but trouble. ;-)

Seriously, you guys are the finest! Will work on Todd's problem cases as 
soon as I finish the latest revision of the schema and compare it 
against the Rome issues list. Translation: schema before lunch, comments 
on Todd's cases probably late this afternoon. (Sorry for the delay!)

Patrick

-- 
Patrick Durusau
Director of Research and Development
Society of Biblical Literature
pdurusau@emory.edu


--------------050105050004010706010803
Content-Type: text/plain;
 name="regex.xsd"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
 filename="regex.xsd"

<?xml version="1.0" encoding="UTF-8"?>

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"

	elementFormDefault="unqualified">



<xs:element name="text">

	<xs:complexType>

		<xs:sequence>

			<xs:element ref="body" minOccurs="1" maxOccurs="1"/>

		</xs:sequence>

	</xs:complexType>

</xs:element>



<xs:element name="body">

	<xs:complexType>

		<xs:sequence>

			<xs:element ref="p" minOccurs="1" maxOccurs="unbounded"/>

		</xs:sequence>

	</xs:complexType>

</xs:element>



<xs:element name="p" type="osisRef"/>



<xs:simpleType name="osisRef">

	       <xs:restriction base="xs:string">

			   <!--    <xs:pattern value="(([^\s]*\.){0,6}([^\s]*))"/> first part valid -->

			   <!-- <xs:pattern value="@((char:(\d*)\+(\d*)\((.*)\))|((x-(\c*):)(.*)\((.*)\)))"/> second part valid, remember to make optional with ? -->

			   <!-- <xs:pattern value="(-(([^\s]*\.){0,6}([^\s]*)))(@((char:(\d*)\+(\d*)\((.*)\))|((x-(\c*):)(.*)\((.*)\))))?"/> third part valid, combines second ref and grain -->



<!-- Now combine 1, 2, and 3: ref1 is required and its grain is optional; ref2 is optional (so you can have ref1-ref2), and the ref2 grain is optional. Note that I have repeated the expression for the second ref so that you can have the second ref alone or the second ref plus its grain, but not ref1 (or ref1 plus grain1) followed directly by grain2: to list grain2, you must have ref2. This makes the regex a little harder to read but adds a little validation. -->



			     <xs:pattern value="(([^\s]*\.){0,6}([^\s]*))(@((char:(\d*)\+(\d*)\((.*)\))|((x-(\c*):)(.*)\((.*)\))))?((-(([^\s]*\.){0,6}([^\s]*)))?|(-(([^\s]*\.){0,6}([^\s]*)))(@((char:(\d*)\+(\d*)\((.*)\))|((x-(\c*):)(.*)\((.*)\))))?)?"/>



		</xs:restriction>

</xs:simpleType>



</xs:schema>
--------------050105050004010706010803
Content-Type: text/xml;
 name="regex.xml"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
 filename="regex.xml"

<?xml version="1.0" encoding="UTF-8"?>

<text xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="C:\downloads\osis-current\regex\regex.xsd">

  <body>

	<p>Matt.1.1-Matt.1.3</p>

        <p>Matt.1.2@123+134(logos)</p>

        <p>Matt.1.5@123+134(logos)-Matt.1.6</p>

        <p>Matt.1.5@123+134(logos)-Matt.1.6@234+236(Uriah)</p>

        <p>Matt.1.5@x-xpath:\text\div\p\line(Asaph)</p>

        <p>Matt.1.5@x-xpath:\text\div\p\line(Asaph)-Matt.1.6</p>

        <p>Matt.1.5@x-xpath:\text\div\p\line(Uriah)-Matt.1.6@x-xpath:\text\div\p\line(Asaph)</p>

  </body>

</text>


--------------050105050004010706010803--