- From: Mike Taylor <mike@indexdata.com>
- Date: Tue, 26 Aug 2003 16:13:48 +0100
- To: azaroth@liverpool.ac.uk
- Cc: www-zig@w3.org
> Date: Tue, 26 Aug 2003 15:32:45 +0100 (BST)
> From: Robert Sanderson <azaroth@liverpool.ac.uk>
>
> > The _sole_ purpose of the anyWords/allWords attributes is so that
> > the client [...]
>
> Which is great for Words, but potentially less great for other
> formats, in particular 'string' or 'exact' or whatever you want to
> call it.

Are you referring to Alan's concept of completeTerm?  allWords and
anyWords (or allTerms and anyTerm if you prefer) are meaningless when
used against a term of this type.

> > (Rob, I don't remember whether your CQL compiler has a back-end
> > that renders out to a Type-1 equivalent format such as PQF, but if
> > it does, you must have run into exactly this problem when trying
> > to generate Type-1 queries using BIB-1, which doesn't have
> > allWords/anyWords.  The CQL parser can't know how to break up the
> > multi-word search term -- it needs to pass it to the server, which
> > does know.)
>
> Yep. It splits it up with a convenient space separation.

... which is wrong (although an acceptable hack in the current state
of things).  If the server indexes "yellow book-case" as two words and
I search for allWords "yellow book-case", then your CQL parser will
submit an AND search that CAN NOT succeed against that index (because
the server treats hyphens as word breaks).  That's why you need
allWords/anyWords attributes that leave the server to do the parsing.

> The current definition of 'string' is insufficient for use with any
> as there's no way to distinguish individual strings within the Term.

Quite!  Because only the server knows or _can_ know!

> I think it would be solved by one of the following:
>
> * A numberOfTerms attribute with values of null/single/multiple/unknown.
>   Then at least the server will know that the term should be split up
>   somehow or not, and do what it thinks is appropriate.  (Forget the exact
>   number of terms suggestion)
>
> * Change the definition of string to something that can be embedded within
>   a Term.  Probably by saying the ""s are used to delineate a single
>   string term within the query term and that non special " characters must
>   be escaped with a \

These don't help at all.  They don't solve the problem, and the
"problem" doesn't need solving in the first place.

 _/|_    _______________________________________________________________
/o ) \/  Mike Taylor  <mike@indexdata.com>  http://www.miketaylor.org.uk
)_v__/\  "White: a blank page or canvas.  The challenge: bring order to
         the whole, through design, composition, tension, balance, light
         and harmony" -- Steven Sondheim, "Sunday in the Park with George"

--
Listen to my wife's new CD of kids' music, _Child's Play_, at
http://www.pipedreaming.org.uk/childsplay/
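The "yellow book-case" failure described in the message can be sketched
in a few lines of Python.  This is only an illustration under stated
assumptions: the function names, the toy inverted index, and the rule
that the server breaks words on whitespace and hyphens are all
hypothetical, and nothing here is taken from a real CQL parser or
Z39.50 server.

```python
import re

def client_split(term):
    """Hypothetical client-side hack: split the term on whitespace only."""
    return term.split()

def server_tokenize(text):
    """Hypothetical server rule: whitespace and hyphens are word breaks."""
    return [w for w in re.split(r"[\s-]+", text.lower()) if w]

def build_index(records):
    """Toy inverted index mapping each server-side word to record ids."""
    index = {}
    for rec_id, text in records.items():
        for word in server_tokenize(text):
            index.setdefault(word, set()).add(rec_id)
    return index

def and_search(index, words):
    """AND search: every word must match an index entry exactly."""
    result = None
    for w in words:
        hits = index.get(w.lower(), set())
        result = hits if result is None else result & hits
    return result or set()

records = {1: "A yellow book-case"}
index = build_index(records)

# The client guesses the word boundaries: "book-case" is not an index entry.
client_hits = and_search(index, client_split("yellow book-case"))

# The server parses the term with the same rules it used for indexing.
server_hits = and_search(index, server_tokenize("yellow book-case"))

print(client_hits)
print(server_hits)
```

Under these assumptions the client-split AND search finds nothing,
because no index entry is spelled "book-case", while the search that
lets the server break up the term finds the record.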
Received on Tuesday, 26 August 2003 11:14:48 UTC