[osis-core] Grain regex

Patrick Durusau osis-core@bibletechnologieswg.org
Tue, 13 Aug 2002 14:27:07 -0400


Todd,

Todd Tillinghast wrote:

>>Todd,
>>
>>cp = character position
>>
>>str = string
>>
>>Lack of delimiter on cp was due to lack of one on the regex Steve and I
>>wrote. ;-) Inserted ":" on the expressions I am about to send. Not
>>wedded to it, suggest [ and ] for consistency with str?
>>
>>I am trying with \L and \Nd to allow all letters and digits. (Note that
>>character position can only be \Nd, if that is the question? May be
>>missing the real question so if I am, please repeat.)
>>
>
>That was the question and a clear answer!
>
>It would be nice if str and cp were consistent in their delimiters.
>
Hmm, I did it differently because cp does not need a container around 
it... well, actually, that might be nice.

Votes on adding square brackets to cp? In other words, cp\[\p{Nd}\]  (in 
the crystal clear schema syntax).
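
A minimal sketch of what the bracketed forms would accept, written in 
Python rather than schema syntax. Python's standard re module has no 
\p{Nd} or \p{L}, so the Unicode categories are checked with unicodedata 
instead. The name parse_grain is hypothetical, and requiring at least one 
character inside the brackets is my assumption, not something settled 
here.

```python
import re
import unicodedata

# Hypothetical bracketed grain shape: "@cp[...]" or "@str[...]".
GRAIN = re.compile(r"@(cp|str)\[([^\[\]]*)\]")

def parse_grain(text):
    """Return (kind, value) for a well-formed grain, else None."""
    m = GRAIN.fullmatch(text)
    if m is None:
        return None
    kind, value = m.group(1), m.group(2)
    if not value:
        # Assumption: an empty container is rejected.
        return None
    if kind == "cp":
        # Character position: decimal digits only (Unicode category Nd).
        ok = all(unicodedata.category(c) == "Nd" for c in value)
    else:
        # String: letters only (any Unicode category starting with "L").
        ok = all(unicodedata.category(c).startswith("L") for c in value)
    return (kind, value) if ok else None
```

With this, "@cp[45]" and "@str[abc]" parse, while "@cp45" (no brackets) 
and "@cp[4a]" (a letter in a character position) are rejected.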

>
>Why string and not word?  Is word too Western-oriented?
>
Not really, although treating word tokens as white space delimited is 
Western-oriented. The more pragmatic reason is that you may want to 
address an arbitrary string, which may or may not contain whole words at 
each end. It doesn't cost any more and you get more robust addressing. 
It would be more difficult to specify and implement white space 
delimited tokens only, and the resulting ability is less useful. (I know, 
an argument for MS to do it that way but I prefer the other result. ;-)
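
To illustrate the difference, a rough Python sketch follows. The helper 
names find_string and find_word and the sample text are made up for the 
example; this is not a proposal for how str should be resolved.

```python
def find_string(text, needle):
    """Return the 0-based character offset of needle in text, or -1."""
    return text.find(needle)

def find_word(text, needle):
    """Return the 0-based index of needle among white space delimited
    tokens, or -1 if it is not a whole token."""
    tokens = text.split()
    return tokens.index(needle) if needle in tokens else -1

verse = "In the beginning God created"

# A partial word is addressable as a string...
string_hit = find_string(verse, "begin")
# ...but not as a white space delimited word.
word_miss = find_word(verse, "begin")
# Only the whole token is addressable as a word.
word_hit = find_word(verse, "beginning")
```

String addressing reaches "begin" inside "beginning"; word addressing 
only ever reaches the whole token.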

Note that unless a very good reason not to do the square brackets lands 
in my inbox by tomorrow morning, I will issue 1.1 (as a beta for building 
texts) by tomorrow at NOON, with the final to appear by, say, 
1 September(?).

Patrick

>
>Todd
>
>>Patrick
>>
>>Todd Tillinghast wrote:
>>
>>>What meaning do "cp" and "str" carry?
>>>In the past we had a
>>>@char:45
>>>@word:12
>>>@enum:a
>>>
>>>also there have been examples of simple forms like
>>>@a
>>>@1
>>>
>>>This looks like
>>>@ch1
>>>@ch111
>>>@str[X]
>>>
>>>Can you say in words what you are trying to do with \Nd and \L.  Why a
>>>difference between ch and str?
>>>
>>>Why no [] or some other delimiter after ch?
>>>
>>>>-----Original Message-----
>>>>From: owner-osis-core@bibletechnologieswg.org [mailto:owner-osis-
>>>>core@bibletechnologieswg.org] On Behalf Of Patrick Durusau
>>>>Sent: Tuesday, August 13, 2002 6:02 AM
>>>>To: osis-core
>>>>Subject: [osis-core] Grain regex
>>>>
>>>>Guys,
>>>>
>>>>Thanks to Todd's timely help (must be up early (or late)) I have been
>>>>able to avoid a rather serious error in the regex expressions. Not that
>>>>some won't remain but at least one avoided.
>>>>
>>>>Suggested grain syntax:
>>>>
>>>><xs:pattern value="(@(cp([\Nd])* | str\[([\L])*\]))?" />
>>>>
>>>>I went ahead and put the optional parens around grain.
>>>>
>>>>Note that I am escaping square brackets that will act as a container for
>>>>the value that follows str.
>>>>
>>>>OK, onward to putting some of it together into a fuller regex!
>>>>
>>>>Patrick
>>>>
>>>>--
>>>>Patrick Durusau
>>>>Director of Research and Development
>>>>Society of Biblical Literature
>>>>pdurusau@emory.edu
>>>>
>>--
>>Patrick Durusau
>>Director of Research and Development
>>Society of Biblical Literature
>>pdurusau@emory.edu
>>
>>
>

-- 
Patrick Durusau
Director of Research and Development
Society of Biblical Literature
pdurusau@emory.edu