Re: Qualifier Combinations in CQL (Was: Bib-2 and the DC-Lib) from Mike Taylor on 2002-04-25 (www-zig@w3.org from April 2002)

From: Mike Taylor <mike@tecc.co.uk>
Date: Thu, 25 Apr 2002 13:28:00 +0100 (BST)
To: levan@oclc.org
CC: barbara.shuh@nlc-bnc.ca, azaroth@liverpool.ac.uk, Theo.vanVeen@kb.nl, Kevin.Gladwell@bl.uk, www-zig@w3.org
Message-Id: <200204251228.NAA14137@-f>

> Date: Wed, 24 Apr 2002 09:45:02 -0400
> From: "LeVan,Ralph" <levan@oclc.org>
> 
> I think we have a fundamental difference in our understanding of the
> lessons to be learned from attribute architectures.
> 
> Claim #1.  Users think about indexes and databases think about
> indexes.  So why are we putting attribute combinations on the wire?

OK, last time into the breach.  I promise to shut up after this
contribution, as the whole subject is just too depressing for words
anyway.

First of all, even if users and databases both think about "indexes"
(which by the way is not always true), they do not in general think
about the _same_ indexes, so the "straight through" mapping you prefer
only works in the subset of cases where you have control over both
ends of the connection.  As soon as you want to search across
multiple servers -- especially those from multiple application
domains, where things are indexed differently -- you need to talk in
more abstract terms.

Suppose your client sends me a search for "kernighan" in the "foo"
index, which is defined as: access point "name", SQ "personal", FQ
"author".  If my server doesn't know about "foo", it can't do anything
but fail your search.  But if you'd sent the search in terms of its
attributes and I'd not known about (say) SQ "personal", I could still
have done your search for access point "name", FQ "author".  Which
would get you all the records you wanted plus a small (probably zero)
amount of noise from records in which a corporate author of
"kernighan" is listed.
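
To make the contrast concrete, here is a minimal sketch in Python.
Everything in it is hypothetical: the function names, the dictionary
layout and the attribute names are invented for illustration and are
not taken from Z39.50, CQL or SRW.  It assumes a server that knows
access points and functional qualifiers (FQ) but has never heard of
semantic qualifiers (SQ).

```python
# Hypothetical sketch: none of these names come from Z39.50, CQL or SRW.

# What this imaginary server understands.
KNOWN_INDEXES = {"title", "subject"}  # opaque, server-local index names
KNOWN_ATTRIBUTES = {
    "access_point": {"name", "title"},
    "functional_qualifier": {"author", "editor"},
    # No "semantic_qualifier" entry: this server cannot tell
    # personal names from corporate ones.
}


def search_by_index(index_name, term):
    """Opaque index name: the server either knows it or must fail."""
    if index_name not in KNOWN_INDEXES:
        raise LookupError(f"unsupported index: {index_name}")
    return {"index": index_name, "term": term}


def search_by_attributes(attributes, term):
    """Attribute combination: drop what is not understood, keep the rest."""
    used, dropped = {}, {}
    for kind, value in attributes.items():
        if value in KNOWN_ATTRIBUTES.get(kind, set()):
            used[kind] = value
        else:
            dropped[kind] = value
    if "access_point" not in used:
        # Without an access point there is nothing sensible to search.
        raise LookupError("unsupported access point")
    return {"attributes": used, "dropped": dropped, "term": term}


# The "foo" index from the example above, spelled out as attributes.
foo = {
    "access_point": "name",
    "semantic_qualifier": "personal",
    "functional_qualifier": "author",
}
result = search_by_attributes(foo, "kernighan")
```

Here search_by_index("foo", "kernighan") can only raise an error,
whereas the attribute form degrades to a search on access point
"name" with FQ "author" and reports SQ "personal" as dropped.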

What happened here?  We introduced complexity, and in return we got
flexibility.  Take away the complexity, as SRW does -- rightly in some
cases -- and you also lose the flexibility.  That may be an
appropriate trade-off to make; I'd argue that for something trivial
like a web search-engine, it _is_ appropriate: people need to come
across it, use it, then go on with their lives.  But for serious IR
applications that people invest their time in learning, I don't
think it _is_ appropriate.

So what's SRW's target audience?  Casual users of the first type or
professionals of the second?  Not the first, surely -- they already
have plenty of good tools.  If you want a SOAP-based protocol for that sort
of IR Lite, then you may as well just use the Google interface, which
is bound to become some kind of de facto standard in no time.

To my mind, if SRW has a future at all, it's as a mechanism for doing
serious IR.  To do serious IR (this should come as no surprise) you
need complexity.  That's because IR is complex stuff.  No-one would
argue that _all_ of the complexity in Z39.50 Classic is truly
necessary -- some of it, with the benefit of hindsight, was a
mistake.  But plenty of it is Real Complexity that's needed to address
Really Complex Problems.  SRW needs to avoid throwing out the baby
with the bathwater.

And with that benediction, I now bow out as gracefully as I can from
this discussion.  I don't for a moment expect that anything I've said
will make the slightest bit of difference; but I do feel very very
slightly less bad for having written it.  I hope everyone else doesn't
feel worse for having read it.

	-- Bear with Sore Head.

 _/|_	 _______________________________________________________________
/o ) \/  Mike Taylor   <mike@miketaylor.org.uk>   www.miketaylor.org.uk
)_v__/\  "You're not explaining it, you're just saying it" --
	 Harvey Thompson.

Received on Thursday, 25 April 2002 08:28:02 UTC