RE: Comments on Requirements from Saveen Reddy (Exchange) on 1998-01-18 (www-webdav-dasl@w3.org from January to March 1998)

From: Saveen Reddy (Exchange) <saveenr@Exchange.Microsoft.com>
Date: Sat, 17 Jan 1998 19:54:17 -0800
To: www-webdav-dasl@w3.org, "'Judith Slein'" <slein@wrc.xerox.com>
Message-ID: <2FBF98FC7852CF11912A00000000000107E780E1@DINO>
> ----------
> From: 	Judith Slein[SMTP:slein@wrc.xerox.com]
> Sent: 	Monday, January 12, 1998 1:14 PM
> To: 	www-webdav-dasl@w3.org
> Subject: 	Comments on Requirements
> 
> Comments on requirements:
> 
> Introduction
> 
> Introduction needs to be revised to reflect the current state of DAV
> (INDEX is gone, PROPFIND can be executed with a Depth header so that
> repeated invocations are not needed to traverse a namespace).
> 
Will do.

> What the introduction basically says (and this might be made explicit)
> is that DAV gives clients tools for retrieving enough data to perform a
> client-side search (do all the filtering, constructing result records,
> ordering results on the client side).  What DASL wants to do is provide
> additional protocol elements in support of server-side search.
> 
> --------------
> 
> Search Criteria
> 
> We need more terminology really to discuss this.
> 
> Term
> Comparison Operator
> Attribute / Modifier
> 
> We want to be able to search based on properties or based on content
> or
> based on both together.
> 
> The simplest search criterion is a single term, which states that the
> value
> of a certain property has a certain relationship to a certain value.
> "The
> value of the Size property is less than 10K"  Or that the content has
> a
> certain relationship to a certain expression.  "The content contains
> 'Bill
> Smith'"
> 
> The requirements need to say something about what comparison operators
> we
> care about.  I think that 3.1.2 takes too narrow a view.  It should
> list
> all the operators we think are necessary for a simple search protocol.
> 
> In addition, we may have things to say about what sorts of expressions
> must
> be supported as arguments to the comparison operators.  Is it enough
> to
> compare to some constant, or do we have to support complex
> expressions?
> 
> In addition, we might want to support attributes modifying the
> comparison.
> "Content contains 'index'", where I want to include only exact matches
> to
> the expression, where I want a case-insensitive comparison, where I
> want to
> include grammatical variants of 'index', where I want to include
> synonyms
> of 'index', where I want to include expressions in any language that
> mean
> 'index' . . .
> 
> Once having settled what needs to be supported for individual terms,
> we can
> go on to talk about booleans for constructing complex search criteria.
> 
> 3.1.4 Whatever we say about variants, we need to say the same things
> about
> versions.
> 
> 3.1.5 - 3.1.7 I would contend that these sections apply equally well
> to
> property-based searches as to content-based searches.
> 
> 3.1.5 belongs in a more general discussion of modifiers to
> comparisons.
> 3.1.6 belongs in a more general discussion of what the arguments to
> comparison operators can be like.
> 3.1.7 belongs in a more general discussion of what comparison
> operators
> must be supported.
> 
> --------------
> 
> Sort Order
> 
> We may also want to require the protocol to let clients specify a sort
> order -- by relevance ranking, by increasing / decreasing value of
> some
> property, etc.
> 
> We might want clients to be able to specify how to compute relevance,
> or we
> might want to leave this entirely to the server.
> 
> -------------
> 
> Results
> 
> We might also want to let clients specify the maximum number of
> records to
> return, or the lowest relevance ranking to return.
> 
> 
> -------------
> 
> Narrowing a Search
> 
> We might want to allow clients to request a search on a previous
> result
> set.  This would require the server to keep state for a search
> session,
> though, so maybe it's out of scope.
> 
> "More Like This" is another useful request that requires state to be
> saved.
> 
I would intuitively agree that narrowing a search result is out of scope.
At the same time, I can see that in the future this capability may
derive from whatever mechanism is used to persist the search results for
paged results. Both require state to be kept, but narrowing an existing
search just extends what you can do with the state you maintain.
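
To make the relationship between paging and narrowing concrete, here is a
minimal sketch, assuming the server persists each result set under an opaque
id. The class name `ResultSetStore`, its methods, and the record layout are
all hypothetical; nothing here is part of any DASL proposal.

```python
import itertools
from typing import Callable, Dict, List


class ResultSetStore:
    """Hypothetical in-memory store of persisted search results."""

    def __init__(self) -> None:
        self._sets: Dict[str, List[dict]] = {}
        self._ids = itertools.count(1)

    def save(self, records: List[dict]) -> str:
        # Persist a result set and hand back an opaque id for later requests.
        result_id = f"rs{next(self._ids)}"
        self._sets[result_id] = list(records)
        return result_id

    def page(self, result_id: str, offset: int, limit: int) -> List[dict]:
        # Paged results: return one slice of the saved set.
        return self._sets[result_id][offset:offset + limit]

    def narrow(self, result_id: str, predicate: Callable[[dict], bool]) -> str:
        # Narrowing: filter the saved set and persist the outcome as a new set.
        return self.save([r for r in self._sets[result_id] if predicate(r)])


store = ResultSetStore()
first = store.save([
    {"href": "/docs/a.txt", "size": 2048},
    {"href": "/docs/b.txt", "size": 8192},
    {"href": "/docs/c.txt", "size": 16384},
])
narrowed = store.narrow(first, lambda r: r["size"] > 5 * 1024)
```

Once the state for `page` exists, `narrow` is only a filter over that same
saved state, which is the sense in which one capability could derive from the
other.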

> -------------
> 
> Search Query Syntax
> 
> I'm not sure what is intended by 3.4.2.  "the extensible use of
> alternate
> query syntax" seems confused.  Either you extend the DASL query syntax
> somehow, or you use an alternate syntax.  Using an alternate syntax is
> not
> extending the DASL syntax.
> 
> I can see designing the syntax so that the standard can be extended in
> the
> future.  I'm not so sure I would like to see server vendors extending
> it in
> proprietary ways, though that can't be prevented.  I don't want to see
> us
> adding complicated discovery mechanisms for finding out what
> extensions
> server vendors provide.
> 
I confused the issue with the way I described query syntax. Let me try
to re-formulate it here ... there should be a basic query syntax, that
should as a general goal be extensible. At the same time, I suspect
those with legacy systems who want to expose their data and their
querying capabilities over DAV will want the capability to use a query
language optimized for their system. I think the ability to support an
alternate query language is useful for legacy systems and could be
supported without complicated discovery mechanisms. 
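
As a rough illustration of a basic syntax coexisting with an alternate one, the
sketch below assumes each request names its query syntax and the server simply
dispatches on that name. `SearchServer`, the syntax names "basic" and
"legacy-sql", and the stand-in handlers are all invented for this example.

```python
from typing import Callable, Dict, List

# A handler takes the raw query text and returns matching resource hrefs.
QueryHandler = Callable[[str], List[str]]

DOCS = ["/docs/report.txt", "/docs/notes.txt"]  # toy data, purely illustrative


class SearchServer:
    """Hypothetical server that dispatches on a named query syntax."""

    def __init__(self) -> None:
        self._handlers: Dict[str, QueryHandler] = {}

    def register(self, syntax: str, handler: QueryHandler) -> None:
        self._handlers[syntax] = handler

    def supported_syntaxes(self) -> List[str]:
        # The only discovery offered: which syntaxes this server accepts.
        return sorted(self._handlers)

    def search(self, syntax: str, query: str) -> List[str]:
        handler = self._handlers.get(syntax)
        if handler is None:
            raise ValueError(f"unsupported query syntax: {syntax}")
        return handler(query)


server = SearchServer()
# Stand-in for the basic syntax: substring match on the href.
server.register("basic", lambda q: [h for h in DOCS if q in h])
# Stand-in for a legacy system's own query language, passed through untouched.
server.register("legacy-sql", lambda q: ["/docs/legacy-result"])
```

A client would check `supported_syntaxes()` once and then send queries in
whichever listed syntax it understands; no further capability negotiation is
involved in this sketch.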

> ---------------
> 
> When we state requirements we need to be clear about when we mean the
> protocol must be capable of expressing something vs. a compliant
> server
> must be able to process a request of a certain sort.  We might think
> the
> protocol must be able to express a request for the results to be
> ordered in
> a certain way, but not think that all compliant servers must be able
> to
> sort results.
> 
I agree. Not every system that can be queried against is going to
support the *ideal* set of DASL searching features.

> 3.4.1 makes it sound as if you think that DASL servers must be able to
> satisfy any request that can be expressed in the DASL query syntax,
> but I
> doubt if we will realistically be able to require this.
> 
Yes, the phrasing here does leave that impression.  But, to be more
clear, I think it's safe for clients to *expect* that certain things
will work basically all of the time. For example, the ability to do a
relative comparison of a property value against a constant ("find all
documents of size greater than 5K"). But, for some more advanced
capabilities (especially in relation to content-based searching) I agree
that it's probably not realistic to expect that every server will be able
to satisfy every search request expressible in the DASL syntax.
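
The "size greater than 5K" case can be sketched as a single term: a property
name, a comparison operator, and a constant. This is only an illustration
under assumed names; `Term`, the operator keys, and the resource layout are
hypothetical and do not come from the requirements draft.

```python
import operator
from dataclasses import dataclass
from typing import Any, Dict, List

# Relative comparison operators a basic server might be expected to support.
OPERATORS = {
    "lt": operator.lt,
    "le": operator.le,
    "eq": operator.eq,
    "ge": operator.ge,
    "gt": operator.gt,
}


@dataclass(frozen=True)
class Term:
    """One search term: compare a property value against a constant."""
    prop: str
    op: str
    value: Any

    def matches(self, properties: Dict[str, Any]) -> bool:
        # A resource that lacks the property never matches.
        if self.prop not in properties:
            return False
        return OPERATORS[self.op](properties[self.prop], self.value)


def search(resources: Dict[str, Dict[str, Any]], term: Term) -> List[str]:
    # Return the hrefs of all resources whose properties satisfy the term.
    return [href for href, props in resources.items() if term.matches(props)]


resources = {
    "/docs/a.txt": {"size": 2048},
    "/docs/b.txt": {"size": 6000},
    "/docs/c.txt": {},  # no size property recorded
}
big = search(resources, Term(prop="size", op="gt", value=5 * 1024))
```

Here `big` contains only "/docs/b.txt". Content-based terms would need far
more machinery than this, which is where server capabilities are likely to
differ.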

> ---------------
> 
> We might want to say something about other related standards work that
> we
> want to interoperate with.
> 
I will add a section here.

> ----------------
> 
> Broad issues that need to be decided:
> 
> Is distributed search in scope (that is, a search that requires
> responses
> from multiple servers)?  I assume from 3.3.3 that your inclination is
> to
> answer "No" to this question, which would be in line with WebDAV's
> policy.
> 
That is correct. 

> To what extent is discovery of server search capabilities in scope?
> In the
> interest of keeping things simple, I would be inclined to say that the
> only
> thing you can query a server about is what query syntaxes it supports.
> 
So far that's all I've come up with as well; it's a good starting point
at least.

-Saveen

> ------------
> 
> Some useful references:
> 
> The Harvest User Manual, especially the section on querying a broker:
> 
> http://harvest.transarc.com/afs/transarc.com/public/trg/Harvest/user-manual/user-manual.html
> 
> The STARTS protocol proposal:
> 
> http://www-db.stanford.edu/~gravano/starts.html
> 
> The help pages for any of the popular Web search engines
> 
> User manuals for any of the commercially available document management
> systems
> 
Received on Saturday, 17 January 1998 22:52:28 UTC