- From: Mike Taylor <mike@indexdata.com>
- Date: Tue, 15 Jul 2003 22:07:46 +0100
- To: rden@loc.gov
- CC: www-zig@w3.org
> Date: Tue, 15 Jul 2003 12:19:02 -0400
> From: Ray Denenberg <rden@loc.gov>
>
> But why do we need to introduce 'words' and 'complete value' as new
> Format/structure values? Is it because moving these three out
> leaves Format/structure empty and we feel compelled to have a
> non-empty set?

No, Ray, it's exactly the other way round.

The fundamental weakness that Alan spotted in the AA is that there's
no way to say of a term either "this is a word" or "this is the whole
value I want to search for".  So you can't communicate the difference
between a search for the book whose title is _Jaws_ and one for all
books with the word "jaws" somewhere in the title.  (You'll remember
that Ralph discussed this distinction very clearly on the ZNG list a
while back, when we were trying to figure out CQL.)

The motivation of Alan's proposal is to fix that bug by adding the new
"word" and "complete value" attributes to the utility set.  It seemed
to him (and no-one seems to have disagreed) that where they belong is
in the Format/Structure type.

Then the question arises of what to do with the existing F/S
attributes, "all words", "any words" and "adjacent words".  Alan's
suggestion was that they fit nicely into the Comparison type -- and,
again, no-one's disagreed.

> If the comparison is 'all words', 'any words', or 'adjacent words'
> then a structure of 'words' is implied, and the Format/structure
> value is redundant. Is there another case where we need
> Format/structure?

It's silly to indicate that you want to find all books with "jaws" in
the title by saying that you want to find titles containing _all_ the
words "jaws".  That's not an example of "Say what you mean, simply and
directly" (Rule Number One in Kernighan & Plauger's _Elements of
Programming Style_), and surely such a failure of transparency utterly
compromises all that we set out to achieve in the AA.  It needs
fixing.

Likewise, defaulting to "complete value" is extremely
counter-intuitive.  We certainly never nailed down that this is what
absence of a Format/Structure attribute means; and I would have argued
against that meaning if it had ever been articulated, conflicting as
it does with many (maybe most?) implementations' behaviour.  These
things should be explicit.

Alan also points out that when SCANning a title index, you need a way
to specify whether you're scanning for whole titles or for words
included in titles.

 _/|_	 _______________________________________________________________
/o ) \/  Mike Taylor  <mike@indexdata.com>  http://www.miketaylor.org.uk
)_v__/\  If two decades of commercial programming have taught me
	 anything, it's NEVER to trust dual CPUs, "uninterruptible"
	 power supplies or RAID disks.

--
Listen to my wife's new CD of kids' music, _Child's Play_, at
http://www.pipedreaming.org.uk/childsplay/
Received on Tuesday, 15 July 2003 17:08:23 UTC