- From: Mike Taylor <mike@indexdata.com>
- Date: Tue, 26 Aug 2003 11:22:47 +0100
- To: ajk@mds.rmit.edu.au
- Cc: www-zig@w3.org
> Date: Tue, 19 Aug 2003 09:15:11 +1000
> From: Alan Kent <ajk@mds.rmit.edu.au>
>
> > > access=title
> > > comparision=any
> > > format=string
> > > Term=child's book-case
> > > Does the second mean the title must equal 'child's' or 'book-case'?
> >
> > Exactly my point! :) We don't know if the client meant two strings
> > or one. Thus it needs to say what it meant somehow.
>
> Yes, I thought that is the problem we were trying to come up with a
> precise spec to address. We know there is a problem - do you have a
> concrete proposed solution we can put into the spec?

It is none of the client's damned business whether the server treats
this as two, three or four words -- or indeed seven or twenty-nine.

The _sole_ purpose of the anyWords/allWords attributes is so that the
client can remain in this state of blissful ignorance -- so it can say
to the server, "Here's a bunch of words, pick them apart exactly as
you would do if they were part of a record contributing to the index
I'm searching".

That's a valuable thing to be able to do: it means that the client can
submit "child's book-case" without knowing or caring what the server
will do with it, beyond that it will Do The Right Thing.

If the client already _knows_ how it wants the string split up, it can
do so itself and submit an AND or OR search. Again, the ONLY reason
anyWords and allWords are useful is that the client can't, in
general, know how to do this splitting.

(Rob, I don't remember whether your CQL compiler has a back-end that
renders out to a Type-1 equivalent format such as PQF, but if it does,
you must have run into exactly this problem when trying to generate
Type-1 queries using BIB-1, which doesn't have allWords/anyWords. The
CQL parser can't know how to break up the multi-word search term -- it
needs to pass it to the server, which does know.)
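To make the point concrete, here is a minimal sketch in Python of why
the splitting has to live on the server. Everything in it is
hypothetical: the three tokenizers, the expand_words() helper and the
dictionary it returns are illustrations only, not part of Z39.50, the
attribute architecture, or any real server. The only assumption is
that a server expands an anyWords/allWords term using the same
word-breaking rules it applied when it built the index.

```python
import re

# Three hypothetical server configurations, each breaking words
# differently at indexing time. None is "the" correct one.

def tokenize_whitespace(text):
    # Break on whitespace only.
    return text.split()

def tokenize_keep_apostrophe(text):
    # Apostrophes are part of a word; hyphens separate words.
    return re.findall(r"[A-Za-z0-9']+", text)

def tokenize_keep_hyphen(text):
    # Hyphens are part of a word; apostrophes separate words.
    return re.findall(r"[A-Za-z0-9-]+", text)

def expand_words(access_point, term, mode, tokenize):
    """Server-side expansion of an anyWords/allWords term.

    The client sends the raw string; the server applies its own
    tokenizer and combines the pieces with OR (anyWords) or
    AND (allWords).
    """
    operator = {"anyWords": "or", "allWords": "and"}[mode]
    return {"op": operator, "access": access_point, "terms": tokenize(term)}

if __name__ == "__main__":
    term = "child's book-case"
    for tokenize in (tokenize_whitespace,
                     tokenize_keep_apostrophe,
                     tokenize_keep_hyphen):
        print(tokenize.__name__, expand_words("title", term, "anyWords", tokenize))
```

The client code is identical in all three cases: it passes the string
"child's book-case" untouched, and only the server's choice of
tokenizer decides whether that becomes two words or three.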

> * If you specify multiple terms for format string, then the system
>   should "work out" what the terms are (Does this mean child's
>   book-case is 2 terms ("child's" and "book-case") if '2' is
>   specified and 3 terms ("child's" "book" "case") if '3' is
>   specified?)

Yeuch, no. This is not only undesirable, it also doesn't work. There
are at least two perfectly good ways to parse "child's book-case" into
three words: "child's", "book", "case" and "child", "s", "book-case".
(Yes, I have worked with servers configured to work the second way.)

> I like trying to keep things orthogonal in the attribute types as
> much as possible.

Exactly!

> But I am strongly of the opinion that the rules for breaking the
> query string into multiple search terms should be clear in the spec.

Nope. It's no-one's business but the server's how it does this.

 _/|_    _______________________________________________________________
/o ) \/  Mike Taylor  <mike@indexdata.com>  http://www.miketaylor.org.uk
)_v__/\  "Football is a simple game complicated by fools" -- Kevin
         Keegan, quoting Bill Shankly.

--
Listen to my wife's new CD of kids' music, _Child's Play_, at
http://www.pipedreaming.org.uk/childsplay/
Received on Tuesday, 26 August 2003 06:23:19 UTC