- From: Mark Reichert <markr@sirs.com>
- Date: Tue, 7 May 2002 07:55:41 -0400
- To: "Z39.50 LISTSERV" <www-zig@w3.org>
I knew I had it around somewhere. Z39.58-1992 is/was clear on this matter.

7.7.2.1 ... When ? is immediately followed by a positive integer, it shall be used to indicate a limited range of characters to be masked, from zero up to and including the specified integer.... To search embedded numbers, restoration marks are required. See Section 7.7.7.

7.7.7 In order to use a reserved command word, abbreviation, symbol, or operator as a search word, double quotation marks, " ", shall be used to restore its literal meaning....

FIND 0?10"5"    // ten zeroes followed by a five (my example)
FIND C?"14"     // word beginning with C, ending in 14 (from Z39.58 appendix)

# has no interaction with digits: Multiple #s shall be used to indicate that a precise number of characters greater than one and equal to the number of # symbols are to be masked (7.7.2.2).

The standard never offered an explicit explanation/example of restoring ", but presumably by 7.7.2.1...

FIND """Some text in quotes"""

There is no mention of the more typical "" escaping.

The portion of a <search_term> that corresponds to restoration is:

{<restoration><word>[<space><word>]...<restoration>}

<restoration> ::= [<space>]<">[<space>]
<space> ::= < >[< >]...
<word> ::= {<char>|<var_mask>|<exact_mask>}...
<var_mask> ::= <?>[<positive_integer>]
<exact_mask> ::= <#>[<#>]...
<char> ::= <any_searchable_char>
<any_searchable_char> ::= any character locally defined as searchable

----- Original Message -----
> Not making it to the ZIG, someone sent me some private mail indicating
> that Ralph's proposed single digit after '?' change got accepted
> and possibly no-one mentioned my counter double quotes suggestion.
> Fair enough, if you don't turn up you have less influence.
>
> Just thought I would have a last bash at a compromise with the idea
> that if the CCL regexp is changing, may as well try and get as many
> changes in as possible in one hit rather than change it again later.
>
> To repeat the problem I currently have with the CCL regexp is that
> you cannot specify '?' or '#' as literal text (ie, release their
> special meaning). So even if there is now allowed only to be a
> single digit after '?', while the spec is being changed is it worth
> allowing double quotes ('"') to be used to release special chars
> anyway? This would allow 'find all terms starting with "#"'.
> At present, you cannot do this with the CCL regexp. Normally
> regexp's have release mechanism ( \ for regexp-1 I believe).
> CCL uses " as a release mechanism so seemed the natural thing
> to use in the CCL regexp (rather than \ which in CCL has no
> special meaning).
>
> It seems an oversight not to allow searching for serial numbers etc
> using patterns.
>
> #41434
> #53423
>
> If people have to change their CCL regexp implementation anyway,
> I would rather do both changes at the same time and make it possible
> to search for all possible characters.
>
> I wonder also if the Z39.58/CCL regexp attribute needs to be renamed
> to indicate that it no longer conforms to CCL. I don't actually have
> a copy of Z39.58, but if its anything like the ISO version of CCL
> the spec is so woolly that it isn't funny! The formal grammar is
> given by examples only, and the examples contradict themselves
> in places! (Mind you, the copy I have of ISO8777 is pretty old now
> so maybe its been improved.) Not stressed, just thought it was the
> correct time to at least ask the question.
>
> Alan
> --
> Alan Kent (mailto:ajk@mds.rmit.edu.au, http://www.mds.rmit.edu.au/~ajk/)
> Project: TeraText Technical Director, InQuirion Pty Ltd (www.inquirion.com)
> Postal: Multimedia Database Systems, RMIT, GPO Box 2476V, Melbourne 3001.
> Where: RMIT MDS, Bld 91, Level 3, 110 Victoria St, Carlton 3053, VIC Australia.
> Phone: +61 3 9925 4114 Reception: +61 3 9925 4099 Fax: +61 3 9925 4098
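
For anyone wanting to experiment, here is a rough sketch of how the masking and restoration rules quoted from 7.7.2.1, 7.7.2.2 and 7.7.7 might be mapped onto an ordinary regular expression. It is only an illustration, not anything taken from Z39.58 itself: the function names term_to_regex and matches are made up, a bare ? is assumed to mask any number of characters, each # is assumed to mask exactly one character, and no attempt is made to restore a literal " (the standard gives no example of that).

```python
import re


def term_to_regex(term: str) -> str:
    """Translate a Z39.58-style masked search word into a Python regex.

    Hypothetical helper; the mapping below is an assumption based on
    the passages quoted in this message, not a definitive reading.
    """
    out = []
    i = 0
    n = len(term)
    while i < n:
        ch = term[i]
        if ch == '"':
            # Restoration marks: text up to the next " is taken literally.
            end = term.find('"', i + 1)
            if end == -1:
                raise ValueError("unterminated restoration mark")
            out.append(re.escape(term[i + 1:end]))
            i = end + 1
        elif ch == "?":
            # ? optionally followed by an integer: mask zero up to N chars.
            j = i + 1
            while j < n and term[j] in "0123456789":
                j += 1
            digits = term[i + 1:j]
            # Assumption: ? with no integer masks any number of characters.
            out.append(".{0,%d}" % int(digits) if digits else ".*")
            i = j
        elif ch == "#":
            # Assumption: each # masks exactly one character.
            out.append(".")
            i += 1
        else:
            out.append(re.escape(ch))
            i += 1
    return "".join(out)


def matches(term: str, word: str) -> bool:
    """True if the whole of word matches the masked search term."""
    return re.fullmatch(term_to_regex(term), word) is not None
```

With this reading, matches('C?"14"', "CAT14") is true (the appendix example: begins with C, ends in 14), and matches('"#"?', "#41434") is true, which is the "all terms starting with #" search that restoration marks make possible.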
Received on Tuesday, 7 May 2002 07:56:29 UTC