Re: CCL proposal (quotes)

Not necessarily in order...

"LeVan,Ralph" wrote:
> We said it was z39.58 regular expressions, for whatever reason, and we
> should make it so.  

Alan Kent wrote:
> To me ... the documentation for the Z39.58 regexp attribute
> already in Bib-1 is incomplete - the textual description of the pattern
> has omitted the Z39.58 documented support for quotes for releasing.

Ray Denenberg wrote:
> Yes, but that was the intent when we defined it, to specify a subset of
> Z39.58 which included the features that people thought they wanted,
> keeping in mind that there already was a regExp-1 and 2....
> (Well, it was a long discussion, and it made sense at the time, as I recall.)

"LeVan,Ralph" wrote:
> I just spent a week with the e-learning community.  I complained to them
> that making more options did not lead to interoperability.  More options
> just make more work for profiling groups.

Ray Denenberg wrote:
> Before we attempt to "fix" the Z39.50 definition of Z39.58 (a standard that
> doesn't exist anymore), can we please re-examine why regExp-1 (IEEE 1003.2)
> isn't sufficient?

Responding in no particular order, the reason why we personally have
not implemented regExp-1 is that it is hard to optimize when hitting
indexes. We could do a full index scan and check every term, but
it's a bit of work. Also the complexity of IEEE 1003.2 is somewhat
daunting. Taking an existing implementation is not necessarily going
to be good enough - we would want to integrate it with the indexing
code.
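
A rough sketch of the index concern, assuming a sorted in-memory term
list stands in for the index (the names literal_prefix and
candidate_terms are hypothetical, not taken from any real
implementation). With the simple '#' / '?' pattern model, the literal
text before the first wildcard is trivial to extract and can narrow the
scan; with no literal prefix, every term has to be checked:

```python
import bisect


def literal_prefix(pattern: str) -> str:
    """Return the literal text before the first '#' or '?' wildcard."""
    for i, ch in enumerate(pattern):
        if ch in "#?":
            return pattern[:i]
    return pattern


def candidate_terms(sorted_terms, pattern):
    """Return the index terms that still need checking against pattern.

    sorted_terms is assumed to be a sorted list of strings.
    """
    prefix = literal_prefix(pattern)
    if not prefix:
        # Pattern starts with a wildcard: fall back to a full index scan.
        return list(sorted_terms)
    # Jump to the first term >= prefix, then walk while the prefix matches.
    lo = bisect.bisect_left(sorted_terms, prefix)
    hi = lo
    while hi < len(sorted_terms) and sorted_terms[hi].startswith(prefix):
        hi += 1
    return sorted_terms[lo:hi]


terms = ["abacus", "able", "about", "above", "acorn", "zebra"]
narrowed = candidate_terms(terms, "abo#t")   # only terms starting "abo"
everything = candidate_terms(terms, "?corn")  # leading wildcard: all terms
```

The candidates would still need a real pattern check afterwards; the
sketch only shows how a literal prefix limits the number of terms
visited.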

But it's also possible we are just plain lazy!

I can understand if people don't want to change the existing pattern
match operator. I can understand that any change to it will make it
more complex (except Ralph's change of allowing only one digit after
the ?). My feeling (as I said) is that I would prefer changes to be
made in line with a standard (even if now obsolete).

By the way, why is Z39.58 obsolete? Is it because the idea is to use
ISO8777 instead? If so, then the regexp stuff and quoting rules etc
are not obsolete - just deferred to a different standard.

Personally, I am not really stressed whichever way things go.
This does not mean I won't put up arguments - but they are often to
force people to rethink and better justify their own positions.

Regarding what was discussed previously, I do not recall if introducing
quotes into the Z39.58 regexp attribute was discussed or not. From a
technical perspective, I like the quotes-for-restoration approach because
it means the pattern can specify *any* literal text, it can specify
zero-to-n (where any 'n' can be specified), and you can do one-to-n
by combining '#' with '?' in the same pattern (ab#?). As such, it's
a very easy regexp engine to implement. You can specify any literal
text intermixed with any sequence of 'from n to m' unspecified characters.
Simple, easy to implement, semantically logical, easy to understand.
I also like the quotes restoration approach as it is the documented
Z39.58 / ISO8777 way of doing it.
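
As an illustration only, here is a minimal sketch of how small such an
engine can be, written by translating a pattern into a Python regular
expression. The semantics are assumptions based on the description
above: '#' is exactly one character, '?' is zero or more characters,
'?n' is zero to n characters, and text inside double quotes is taken
literally. The function name z3958_to_regex is hypothetical.

```python
import re


def z3958_to_regex(pattern: str) -> str:
    """Translate a Z39.58-style pattern (as assumed above) to a Python regex."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == '"':
            # Quoted run: everything up to the closing quote is literal text.
            end = pattern.find('"', i + 1)
            if end == -1:
                raise ValueError("unterminated quote in pattern")
            out.append(re.escape(pattern[i + 1:end]))
            i = end + 1
        elif ch == "#":
            out.append(".")  # exactly one unspecified character
            i += 1
        elif ch == "?":
            # Any digits directly after '?' give the upper bound n.
            j = i + 1
            while j < len(pattern) and pattern[j] in "0123456789":
                j += 1
            digits = pattern[i + 1:j]
            out.append(".{0,%d}" % int(digits) if digits else ".*")
            i = j
        else:
            out.append(re.escape(ch))  # ordinary literal character
            i += 1
    return "".join(out)


def matches(pattern: str, term: str) -> bool:
    """True if the whole term matches the pattern."""
    return re.fullmatch(z3958_to_regex(pattern), term) is not None
```

With these assumptions, matches('ab#?', 'abc') is true and
matches('ab#?', 'ab') is false (one-to-n), while matches('a?1"9"', 'ax9')
is true because the quotes stop the 9 being read as part of the length.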

Without the restoration marks you cannot specify all literal text
(there are gaps - you cannot search for '#' or '?' as literal text).
And you cannot, as Ralph pointed out, follow a ? length with literal
digits. (Any following digits will be consumed as part of the length
for the ?).
So technically, the current definition has holes in it.
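
The digit problem can be shown with a small tokenizer sketch. This is
an illustration under assumptions, not the Bib-1 text: '#' is one
character, '?' is zero or more, and digits after '?' give the upper
bound. The tokenize function and its max_digits parameter are
hypothetical; max_digits=1 models Ralph's one-digit suggestion.

```python
def tokenize(pattern, max_digits=None):
    """Split an unquoted pattern into literal and gap tokens.

    Gap tokens are ("gap", min, max), where max is None for unbounded.
    max_digits limits how many digits after '?' count as the length;
    None means all following digits are consumed.
    """
    tokens = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "?":
            j = i + 1
            while (j < len(pattern)
                   and pattern[j] in "0123456789"
                   and (max_digits is None or j - i - 1 < max_digits)):
                j += 1
            digits = pattern[i + 1:j]
            tokens.append(("gap", 0, int(digits) if digits else None))
            i = j
        elif ch == "#":
            tokens.append(("gap", 1, 1))
            i += 1
        else:
            tokens.append(("lit", ch))
            i += 1
    return tokens


# All digits are consumed: there is no way to say "0-1 characters, then 9".
greedy = tokenize("a?19")
# With a one-digit limit the trailing 9 becomes literal text.
one_digit = tokenize("a?19", max_digits=1)
```

Here greedy is [("lit", "a"), ("gap", 0, 19)] and one_digit is
[("lit", "a"), ("gap", 0, 1), ("lit", "9")]. Neither variant can express
a literal '#' or '?', which is the gap the quotes would close.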

So quotes are standards-based and give theoretical completeness to
a very simple regexp model.

Regarding whether to change an existing attribute or introduce a new
one, I agree with Ralph's comment that lots of attributes *decreases*
interoperability. So there is benefit in changing the existing
definition. The new definition will only invalidate patterns that
contained double quote characters (which I think is rare) - all other
patterns and semantics remain unchanged. So I think it has pretty
good backwards compatibility.

I agree though that it is a change. If more people supported and used
it, I would be completely against changing the definition. But as I
suspect that no-one currently uses it between different systems, and
because the change is pretty backwards compatible, and because the
old definition is arguably incomplete, I am in favour of introducing
the change to the existing attribute rather than introducing a new
attribute.

Alan

Received on Thursday, 9 May 2002 00:01:14 UTC