Re: Attribute Architecture proposal

> Date: Tue, 15 Jul 2003 12:19:02 -0400
> From: Ray Denenberg <rden@loc.gov>
> 
> But why do we need to introduce 'words' and 'complete value' as new
> Format/structure values?  Is it because moving these three out
> leaves Format/structure empty and we feel compelled to have a
> non-empty set?

No, Ray, it's exactly the other way round.  The fundamental weakness
that Alan spotted in the AA is that there's no way to say of a term
either "this is a word" or "this is the whole value I want to search
for".  So you can't communicate the difference between a search for
the book whose title is _Jaws_ and one for all books with the word
"jaws" somewhere in the title.  (You'll remember that Ralph discussed
this distinction very clearly on the ZNG list a while back, when we
were trying to figure out CQL.)
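
To make that distinction concrete, here is a minimal sketch in Python.
It is only an illustration of the two matching semantics, not anything
from the AA or from a Z39.50 toolkit: the function names and the tiny
title list are invented for the example, and real servers will
normalise and tokenise values in their own ways.

```python
# Hypothetical illustration only: none of these names come from the
# Attribute Architecture or from any Z39.50 implementation.
TITLES = ["Jaws", "Jaws 2", "The Jaws of Life"]

def words(value):
    """Split a field value into lower-cased, whitespace-separated words."""
    return value.lower().split()

def matches_complete_value(term, title):
    """'Complete value': the term must be the whole of the title."""
    return words(term) == words(title)

def matches_word(term, title):
    """'Word': the term must occur as one word somewhere in the title."""
    return term.lower() in words(title)

# Same term, same access point, two different structures.
complete_hits = [t for t in TITLES if matches_complete_value("jaws", t)]
word_hits = [t for t in TITLES if matches_word("jaws", t)]
```

With these made-up titles, the "complete value" search finds only
_Jaws_, while the "word" search finds all three.  Without a structure
attribute, the server has no way to know which of the two was meant.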

The motivation of Alan's proposal is to fix that bug by adding the new
"word" and "complete value" attributes to the utility set.  It seemed
to him (and no-one seems to have disagreed) that where they belong is
in the Format/Structure type.

Then the question arises of what to do with the existing F/S
attributes, "all words", "any words" and "adjacent words".  Alan's
suggestion was that they fit nicely into the Comparison type -- and,
again, no-one's disagreed.

> If the comparison is 'all words', 'any words', or 'adjacent words'
> then a structure of 'words' is implied, and the Format/structure
> value is redundant.  Is there another case where we need
> Format/structure?

It's silly to indicate that you want to find all books with "jaws" in
the title by saying that you want to find titles containing _all_ the
words "jaws".  That's not an example of "Say what you mean, simply and
directly" (Rule Number One in Kernighan & Plauger's _Elements of
Programming Style_), and surely such a failure of transparency utterly
compromises all that we set out to achieve in the AA.  It needs
fixing.

Likewise, defaulting to "complete value" is extremely
counter-intuitive.  We certainly never nailed down that this is what
absence of a Format/Structure attribute means; and I would have argued
against that meaning if it had even been articulated, conflicting as
it does with many (maybe most?) implementations' behaviour.  These
things should be explicit.

Alan also points out that when SCANning a title index, you need a way
to specify whether you're scanning for whole titles or for words
included in titles.
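
A rough sketch of why SCAN needs this, again in Python and again with
invented names (build_index, scan) and a made-up title list: the same
start term lands in quite different term lists depending on whether the
index holds whole titles or individual words.

```python
from bisect import bisect_left

# Hypothetical illustration only: build_index() and scan() are made up
# for this example and do not model any real server's index layout.
TITLES = ["Jaws", "Jaws 2", "The Jaws of Life"]

def build_index(titles, structure):
    """Return the sorted list of index terms for the given structure."""
    if structure == "complete value":
        terms = {t.lower() for t in titles}
    elif structure == "word":
        terms = {w for t in titles for w in t.lower().split()}
    else:
        raise ValueError(f"unknown structure: {structure}")
    return sorted(terms)

def scan(titles, start_term, structure, count=3):
    """Return up to `count` index terms starting at or after start_term."""
    index = build_index(titles, structure)
    pos = bisect_left(index, start_term.lower())
    return index[pos:pos + count]

whole_titles = scan(TITLES, "jaws", "complete value")
title_words = scan(TITLES, "jaws", "word")
```

Here the "complete value" scan walks through whole titles beginning at
"jaws", whereas the "word" scan walks through single words beginning at
"jaws".  The request has to say which of the two lists it wants.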

 _/|_	 _______________________________________________________________
/o ) \/  Mike Taylor  <mike@indexdata.com>  http://www.miketaylor.org.uk
)_v__/\  If two decades of commercial programming have taught me
	 anything, it's NEVER to trust dual CPUs, "uninterruptible"
	 power supplies or RAID disks.

--
Listen to my wife's new CD of kids' music, _Child's Play_, at
	http://www.pipedreaming.org.uk/childsplay/

Received on Tuesday, 15 July 2003 17:08:23 UTC