RE: "Where" condition too general vis a vis "contains"

Jim:

I've read an e-mail from you that said you're strongly in 
favor of supporting current practice (as am I). I believe you 
also want to keep 1.0 as simple as possible and maximize its
chances for success by limiting functionality (as do I). I 
believe that is somewhat at odds with your position on this 
issue for two reasons.

First, my belief is that implementations with dual engines 
that don't mix the conditions are a lot more common than 
implementations which allow arbitrary mixtures. 
The companies involved in the AIIM demo didn't think it was 
all that easy to arbitrarily mix the conditions. It's not just a 
question of getting it done. Performance is an issue as well, 
even for the simple case of the top level operator being AND 
with exactly two operands, one for hard property conditions, 
and one for CBR conditions. Which operand you drive the query 
from can drastically affect the performance. That is why none 
of them went beyond that for the demo.  Supporting the
mainstream of current practice means supporting restrictions 
on the use of "contains". 

Second, in 1.0 we are definitely not going to unleash all the 
functionality of all the existing and near-future implementations. 
We're not even trying to do that. The plan is to enhance the 
functionality in later releases of DASL, so any limitations
are not necessarily permanent. Supporting restrictions on 
"contains" is aligned with limiting DASL functionality for 1.0.

But if you really want to accelerate the advanced feature of 
arbitrarily mixing conditions into 1.0, then it seems to me the 
best thing to do is to let implementations do pretty much 
whatever they can do. (I'm pretty sure you'll agree with that.)

In order to do that in a thoroughly satisfactory manner, I
believe all we need to do is to slightly decorate the "contains"
operator in the query schema discovery response to advertise
the restriction on its use, if any. I propose the obvious three 
cases: (1) no restrictions, (2) top level query condition 
operator must be AND or OR with two operands, the first of 
which is a hard property condition, and the second of which 
is "contains", (3) the "contains" operator must be the only 
operator in the query condition. It is near zero cost to
advertise the entire operators subsection of the query schema, 
since that could be just a string constant in the search arbiter 
code of a collection, which is simply included in the response.
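
To make the three cases concrete, here is a rough sketch in Python of 
how a client (or the arbiter itself) might check a "where" condition 
against the advertised restriction. Everything in it is hypothetical: 
the names ContainsRestriction, Cond, and is_allowed, and the tree 
representation of a condition, are mine and not part of any DASL 
draft. It reads case (2) literally, so a bare "contains" with no 
property condition is rejected under case (2); whether that should be 
allowed is an open question.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class ContainsRestriction(Enum):
    """Hypothetical labels for the three proposed cases."""
    NONE = 1            # (1) "contains" may appear anywhere
    TOP_LEVEL_PAIR = 2  # (2) top level is AND/OR of (property cond, contains)
    SOLE_OPERATOR = 3   # (3) "contains" is the entire condition


@dataclass
class Cond:
    """A node in a 'where' condition tree (illustrative model only)."""
    op: str                                   # "and", "or", "eq", "contains", ...
    operands: List["Cond"] = field(default_factory=list)


def uses_contains(cond: Cond) -> bool:
    """True if "contains" appears anywhere in the condition tree."""
    return cond.op == "contains" or any(uses_contains(o) for o in cond.operands)


def is_allowed(where: Cond, restriction: ContainsRestriction) -> bool:
    """Check a condition against the advertised "contains" restriction."""
    # Conditions that never use "contains" are unaffected by the restriction.
    if not uses_contains(where):
        return True
    if restriction is ContainsRestriction.NONE:
        return True
    if restriction is ContainsRestriction.SOLE_OPERATOR:
        return where.op == "contains"
    # TOP_LEVEL_PAIR: AND/OR of exactly two operands, the first a pure
    # property condition and the second the "contains" operator itself.
    return (
        where.op in ("and", "or")
        and len(where.operands) == 2
        and not uses_contains(where.operands[0])
        and where.operands[1].op == "contains"
    )
```

For example, with prop = Cond("eq") and text = Cond("contains"), the 
condition Cond("and", [prop, text]) passes under cases (1) and (2) 
but not (3), and swapping the two operands fails case (2) because the 
property condition must come first.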

Alan Babich

-----Original Message-----
From: Jim Davis [mailto:jdavis@parc.xerox.com]
Sent: July 14, 1998 7:46 PM
To: Babich, Alan; 'DASL'
Subject: Re: "Where" condition too general vis a vis "contains"


At 03:57 PM 7/12/98 PDT, Babich, Alan wrote:
>However, instead of even that much generality, I would
>propose the following for simplesearch: If the "contains"
>operator is used, it is the entire "where" condition.

I strongly disagree with this proposal. First of all, while I concede that
some current servers may benefit from this, this limitation would apply
even to those servers that are not split in the manner Alan describes, and
so it penalizes the strong implementations for the sake of the weak.
Second, even for servers that do maintain the split, it is not hard for
them to parse the query and route the parts to the RDBMS and full text
engines.

Received on Wednesday, 15 July 1998 00:54:42 UTC