Re: SV: CCL Regex proposed change

On Wed, Apr 24, 2002 at 08:20:21PM +0200, Henrik Dahl wrote:
> Alan Kent,
> 
> Can't you try to state your motivation for the idea?
> 
> Henrik Dahl

Sorry, I thought I had. The current CCL pattern match expression
attribute in Bib-1 says that ? and # have the same meanings as in CCL
patterns: # matches any single char, ? matches zero or more chars, and
?n matches zero to n chars. Someone pointed out that for 12?34 it is
not really useful to read this as 'up to 34 chars', and that it might
be better to allow at most one digit after the ? char. If you want up
to 15 chars, you can say ?9?6 (total 15). I then said that we solved
the problem in our CCL parser (not in the CCL pattern match expression
attribute) using quotes (as per the CCL spec). So we would write
it as 12?3"4" to make it clear that the '4' has no special meaning.
Quoting also means you can search for '#' and '?' characters
in patterns, which is currently not supported.
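
To make the two readings of 12?34 concrete, here is a minimal Python
sketch that maps an unquoted pattern onto a regular expression. It is
only an illustration, not text from Bib-1 or the CCL spec; the function
name ccl_to_regex and the single_digit flag are invented for this
example.

```python
import re

DIGITS = "0123456789"

def ccl_to_regex(pattern, single_digit=True):
    """Translate an unquoted CCL-style pattern into a Python regex.

    Hypothetical helper: '#' is any one char, '?' is zero or more chars,
    '?n' is zero to n chars. With single_digit=True at most one digit
    after '?' is taken as the count; otherwise all following digits are.
    """
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "#":
            out.append(".")
            i += 1
        elif ch == "?":
            i += 1
            j = i
            # How far the count may extend depends on the chosen reading.
            limit = min(len(pattern), i + 1) if single_digit else len(pattern)
            while j < limit and pattern[j] in DIGITS:
                j += 1
            count = pattern[i:j]
            out.append(".{0,%s}" % count if count else ".*")
            i = j
        else:
            out.append(re.escape(ch))  # ordinary char, matched literally
            i += 1
    return "".join(out)

print(ccl_to_regex("12?34", single_digit=False))  # 12.{0,34}
print(ccl_to_regex("12?34"))                      # 12.{0,3}4
print(ccl_to_regex("?9?6"))                       # .{0,9}.{0,6}
```

Under the single-digit reading, ?9?6 still allows up to 15 chars, and
12?34 means "12, up to three chars, then a literal 4".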

So I proposed to extend the definition of the CCL pattern attribute
value in Bib-1 to follow the CCL convention of using double quotes
to release any special meaning of characters (two double quotes in
a row stand for a literal double quote char). Using the CCL pattern
attribute as currently defined, it is not possible to ask for all
words starting with the character '#', or all words ending with the
character '?', etc., because there is no release mechanism for these
chars.
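
A rough Python sketch of how such a release mechanism could be parsed
follows. It is not a definitive implementation: the function name
ccl_quoted_to_regex is invented, the sketch assumes the one-digit ?n
form, and it assumes that two double quotes in a row mean a literal
double quote wherever they appear (one possible reading of the
convention).

```python
import re

def ccl_quoted_to_regex(pattern):
    """Translate a CCL-style pattern with double-quote release to a regex.

    Hypothetical helper. Outside quotes: '#' is any one char, '?' is zero
    or more chars, '?n' (one digit) is zero to n chars. Inside quotes every
    char is literal. Assumption: '""' anywhere is a literal double quote.
    """
    out = []
    quoted = False
    i = 0
    n = len(pattern)
    while i < n:
        ch = pattern[i]
        if ch == '"':
            if i + 1 < n and pattern[i + 1] == '"':
                out.append(re.escape('"'))  # doubled quote: literal "
                i += 2
            else:
                quoted = not quoted         # open or close a quoted run
                i += 1
        elif quoted:
            out.append(re.escape(ch))       # special meaning released
            i += 1
        elif ch == "#":
            out.append(".")
            i += 1
        elif ch == "?":
            i += 1
            if i < n and pattern[i] in "0123456789":
                out.append(".{0,%s}" % pattern[i])
                i += 1
            else:
                out.append(".*")
        else:
            out.append(re.escape(ch))
            i += 1
    if quoted:
        raise ValueError("unterminated double quote in pattern")
    return "".join(out)

# Words starting with '#', and words ending with '?':
starts_with_hash = re.compile(ccl_quoted_to_regex('"#"?'))
ends_with_qmark = re.compile(ccl_quoted_to_regex('?"?"'))
```

With this reading, "#"? matches any word beginning with '#', ?"?"
matches any word ending with '?', and 12?3"4" is unambiguous.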

While this may not be a common problem (most people probably discard
punctuation from indexes - but this is certainly not mandated by
Z39.50), the proposal seemed to solve both the problem originally
raised and other problems too.

We ended up, for example, inventing our own private attribute for
defining another form of CCL pattern match operator to allow us
to search for '#' and '?' etc. in patterns. There are things the
current CCL regexp attribute cannot support, which could be fixed
by introducing CCL-compatible use of double quotes to release the
special meanings of chars in patterns.

Is this clearer?

Alan

Received on Thursday, 25 April 2002 00:13:13 UTC