- From: Alan Kent <ajk@mds.rmit.edu.au>
- Date: Thu, 9 May 2002 14:00:39 +1000
- To: zig <www-zig@w3.org>
Not necessarily in order...

"LeVan,Ralph" wrote:
> We said it was z39.58 regular expressions, for whatever reason, and we
> should make it so.

Alan Kent wrote:
> To me ... the documentation for the Z39.58 regexp attribute
> already in Bib-1 is incomplete - the textual description of the pattern
> has omitted the Z39.58 documented support for quotes for releasing.

Ray Denenberg wrote:
> Yes, but that was the intent when we defined it, to specify a subset of
> Z39.58 which included the features that people thought they wanted,
> keeping in mind that there already was a regExp-1 and 2....
> (Well, it was a long discussion, and it made sense at the time, as I recall.)

"LeVan,Ralph" wrote:
> I just spent a week with the e-learning community. I complained to them
> that making more options did not lead to interoperability. More options
> just make more work for profiling groups.

Ray Denenberg wrote:
> Before we attempt to "fix" the Z39.50 definition of Z39.58 (a standard that
> doesn't exist anymore), can we please re-examine why regExp-1 (IEEE 1003.2)
> isn't sufficient?

Responding in no particular order, the reason we personally have not implemented regExp-1 is that it is hard to optimize when hitting indexes. We could do a full index scan and check every term, but it's a bit of work. Also, the complexity of IEEE 1003.2 is somewhat daunting. Taking an existing implementation is not necessarily going to be good enough - we would want to integrate it with the indexing code. But it's also possible we are just plain lazy!

I can understand if people don't want to change the existing pattern match operator. I can understand that any change to it is still a change and will make it more complex (except Ralph's change of allowing only one digit after the ?). My feeling (as I said) is that I would prefer changes to be made in line with a standard (even if now obsolete).

By the way, why is Z39.58 obsolete? Is it because the idea is to use ISO8777 instead?
If so, then the regexp stuff, quoting rules etc. are not obsolete - just deferred to a different standard.

Personally, I am not really stressed whichever way things go. This does not mean I won't put up arguments - but they are often meant to force people to rethink and better justify their own positions.

Regarding what was discussed previously, I do not recall whether introducing quotes into the Z39.58 regexp attribute was discussed or not.

From a technical perspective, I like the quotes for restoration approach because it means the pattern can specify *any* literal text, it can specify zero-to-n (where any 'n' can be specified), and you can do one-to-n by combining '#' with '?' in the same pattern (ab#?). As such, it's a very easy regexp engine to implement. You can specify any literal text intermixed with any sequence of 'from n to m' unspecified characters. Simple, easy to implement, semantically logical, easy to understand.

I also like the quotes restoration approach as it is the documented Z39.58 / ISO8777 way of doing it. Without the restoration marks you cannot specify all literal text (there are gaps - you cannot search for '#' or '?' as literal text). And you cannot, as Ralph pointed out, follow a ? length with literal digits. (Following digits will be consumed as part of the length for the ?). So technically, the current definition has holes in it. So quotes are standards based and give theoretical completeness to a very simple regexp model.

Regarding whether to change an existing attribute or introduce a new one, I agree with Ralph's comment that having lots of attributes *decreases* interoperability. So there is benefit in changing the existing definition. The new definition will only invalidate patterns that contained double quote characters (which I think is rare) - all other patterns and semantics remain unchanged. So I think it has pretty good backwards compatibility. I agree though that it is a change.
If more people supported and used it, I would be completely against changing the definition. But as I suspect that no-one currently uses it between different systems, and because the change is pretty backwards compatible, and because the old definition is arguably incomplete, I am in favour of introducing the change to the existing attribute rather than introducing a new attribute.

Alan
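
To illustrate the quoting model argued for in this message, here is a minimal Python sketch. It is not taken from the text of Z39.58, ISO8777 or Bib-1, and it rests on assumptions: '#' matches exactly one character; '?' followed by digits n matches zero to n characters; a bare '?' is assumed to mean "any number of characters"; and text between double quotes is taken literally. The names `compile_pattern` and `matches` are hypothetical, and the sketch does not attempt to handle a literal double quote inside a pattern.

```python
import re

ASCII_DIGITS = "0123456789"


def compile_pattern(pattern: str) -> "re.Pattern[str]":
    """Translate a '#'/'?'/quote pattern into a Python regex (assumed semantics)."""
    out = []
    i = 0
    n = len(pattern)
    while i < n:
        ch = pattern[i]
        if ch == '"':
            # Quoted run: everything up to the closing quote is literal text,
            # so '#', '?' and digits lose their special meaning here.
            end = pattern.find('"', i + 1)
            if end == -1:
                raise ValueError("unterminated quote in pattern")
            out.append(re.escape(pattern[i + 1:end]))
            i = end + 1
        elif ch == "#":
            # Exactly one unspecified character.
            out.append(".")
            i += 1
        elif ch == "?":
            # Zero to n unspecified characters; ALL following digits are
            # consumed as the length, which is the hole noted above.
            i += 1
            start = i
            while i < n and pattern[i] in ASCII_DIGITS:
                i += 1
            digits = pattern[start:i]
            # Assumption: a bare '?' means any number of characters.
            out.append(".{0,%d}" % int(digits) if digits else ".*")
        else:
            # Any other character is ordinary literal text.
            out.append(re.escape(ch))
            i += 1
    return re.compile("".join(out), re.DOTALL)


def matches(pattern: str, term: str) -> bool:
    """Return True if the whole term matches the pattern."""
    return compile_pattern(pattern).fullmatch(term) is not None
```

Under these assumptions, `matches('ab#?', 'abc')` succeeds while `matches('ab#?', 'ab')` does not, which is the one-to-n behaviour mentioned above. Likewise `a?12` is read as "zero to twelve characters after a", whereas `a?1"2"` means "zero or one character, then a literal 2", and `"#"5` matches only the literal text `#5`.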
Received on Thursday, 9 May 2002 00:01:14 UTC