RE: Comments on requirements from Saveen Reddy (Exchange) on 1998-01-28 (www-webdav-dasl@w3.org from January to March 1998)

From: Saveen Reddy (Exchange) <saveenr@Exchange.Microsoft.com>
Date: Wed, 28 Jan 1998 10:25:42 -0800
To: "'www-webdav-dasl@w3.org'" <www-webdav-dasl@w3.org>
Cc: "Babich, Alan" <ABabich@felix.filenet.com>
Message-ID: <2FBF98FC7852CF11912A00000000000107E78193@DINO>
> -----Original Message-----
> From: Babich, Alan [mailto:ABabich@filenet.com]
> Sent: Tuesday, January 27, 1998 5:50 PM
> To: 'www-webdav-dasl@w3.org'
> Cc: Babich, Alan
> Subject: Comments on requirements
> 
> 
> (1) "3.2.3 Paged Search Results"
> 
> My intuition on this topic is that this is probably not a good
> idea. Because there are typically dozens to thousands of clients
> per server, the bulk of the CPU power and disk space is on the
> client side of the network. That is where scrolling around in the
> answer set should happen. Putting it on the server makes
> the overall architecture inefficient by wasting precious
> server resources. I don't like inefficient architectures,
> especially in large distributed systems, because then
> lots of people suffer all at once. 
> 
> The client must be able to ask for the "next N results"
> to make one sequential pass through the answer set. The
> client side can cache the whole thing and allow the user
> to scroll around in it at will. 
> 
> What first occurs to me is to make
> the protocol as stateless on the server side as possible.
> If the server is stateless, server crashes can be made
> transparent to the client. NFS is that way, for example.
> When they take my server down and reboot it for backup,
> my diskless workstation hangs. When they are all done,
> my diskless workstation simply resumes without
> missing a beat. 
> 
> Another advantage of the server being stateless is that
> it simplifies things.
>

As you point out, paged search results have their own set of
complications. I do believe there are situations in which they are
desirable (but this leads to the eternal question of where to place the
"smarts": client or server?). Paged search results are not, in my opinion,
of enough value to be in a basic search protocol because of some of the
complications -- but in anticipation of them, I think the basic protocol
should at least make them possible to do.
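
To make the "next N results" idea concrete, here is a minimal sketch of
a stateless exchange. It is not DASL syntax; the names search_page and
fetch_all, and the offset/limit parameters, are hypothetical. The
assumption is that the client sends its position with every request, so
the server keeps no cursor between calls and a restart between two
requests goes unnoticed by the client.

```python
from typing import Callable, List, Sequence


def search_page(resources: Sequence[str],
                predicate: Callable[[str], bool],
                offset: int,
                limit: int) -> List[str]:
    """Hypothetical server call: re-evaluates the query and returns one slice.

    Nothing is remembered between calls; the position comes from the client.
    """
    matches = [r for r in resources if predicate(r)]
    return matches[offset:offset + limit]


def fetch_all(resources: Sequence[str],
              predicate: Callable[[str], bool],
              page_size: int) -> List[str]:
    """Hypothetical client loop: one sequential pass, cached locally."""
    cache: List[str] = []
    offset = 0
    while True:
        page = search_page(resources, predicate, offset, page_size)
        cache.extend(page)
        if len(page) < page_size:
            # A short (or empty) page means the answer set is exhausted.
            return cache
        offset += len(page)
```

Once fetch_all returns, scrolling happens entirely in the client's
cache. The cost of this sketch is that the server re-runs the query for
every page, which is the price of holding no state.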
 
> (2) "3.3.1 Search Scope
>      It must be possible for the client to specify a number of
>      different, unrelated URIs over which the search is to range."
> 
> I agree with subsequent comments in the e-mail thread
> that these "unrelated URIs" should actually be related
> by being on the same server. The client side could use
> multi-threading to query multiple different servers
> in parallel.
> 
> (3) "3.1.1 Boolean Expressions
>      It must be possible to use Boolean operators
>      (AND, OR, NOT) in the search criteria."
> 
> Yes. Boolean expressions are a good thing. Personally, I like
> them a lot. However, traditional Boolean logic has only two
> truth values. I like the simplicity of that, but when it comes to
> queries, that probably isn't enough truth values. You probably
> need three: TRUE, FALSE, and UNKNOWN. Furthermore, if you allow
> arithmetic expressions or string expressions, you will probably
> need to add another value for numbers, UNDEFINED, and another
> value for strings, UNDEFINED. Let me try to motivate this.
> 
> Suppose you allowed arithmetic expressions in your query conditions.
> Suppose division was one of the arithmetic operators you
> defined. Then you might have some condition like "X/Y > 3".
> Where X and Y are some properties. That's all well and good
> until you encounter a resource where Y = 0. The way out of this
> problem is to (1) let X/Y equal the UNDEFINED number you added
> to the set of all numbers. (2) Define any arithmetic expression
> with a variable or constant equal to UNDEFINED to evaluate
> to UNDEFINED. (3) Define the relational operators
> (>, >=, <, <=, =, !=) to evaluate to UNKNOWN if either operand
> is UNDEFINED. And (4) define the truth tables
> for AND, OR, and NOT such that the resulting truth
> value is UNKNOWN if one of the operands is not well defined
> AND it matters to the final outcome. Otherwise, the operand
> that is not well defined doesn't matter to the final outcome,
> and the truth value is what you would expect. For example,
> for OR (logical inclusive OR), in "X OR Y", if X is TRUE,
> you don't care whether Y is well defined or not, you 
> consider the value of "X OR Y" to be TRUE. Similarly, if
> X were FALSE, the value of "X OR Y" is the value of Y,
> be it TRUE, FALSE, or UNKNOWN. Similarly, for "A AND B",
> if A is FALSE, the expression evaluates to FALSE regardless
> of the value of B. If A is TRUE, the expression has the
> value of B. For the NOT operator, NOT TRUE is FALSE,
> NOT FALSE is TRUE, and NOT UNKNOWN is UNKNOWN.
> 
> You might think you avoid this problem if you don't allow
> arithmetic or string expressions, but that turns out not
> to be the case. You cannot escape the null value
> problem. Suppose you are querying a set of resources
> with the condition that "Author=Joe OR Size>10".
> What about the resources in the set of resources that 
> have no value for the property "Author"? Are they
> included in the result set or not? With the scheme
> I have given above, the answer is clear.
> 
> The above scheme is not new to anyone familiar with
> SQL or DMA. It is simply ANSI standard SQL three valued
> logic. I suggest that you probably want to use it. 
> It solves the problems, it is standard, and it is widely
> adopted.
> 
> I also suggest that you probably want operators like
> "A = NULL" or "A IS NULL" to provide clients with
> a tool to help deal with the null value problem more
> effectively.
>

Good point -- I'll add text about these scenarios.
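
For reference, the four rules above can be sketched in a few lines.
This is only an illustration, not proposed DASL text. It assumes that
Python's None stands in for both UNKNOWN (truth values) and UNDEFINED
(numbers and strings), and, following the SQL convention, that a
resource is selected only when the condition is TRUE. All function
names are hypothetical.

```python
from typing import Optional

Truth = Optional[bool]    # None stands for UNKNOWN
Number = Optional[float]  # None stands for UNDEFINED


def divide(x: Number, y: Number) -> Number:
    """Rules (1) and (2): division by zero or an UNDEFINED input is UNDEFINED."""
    if x is None or y is None or y == 0:
        return None
    return x / y


def greater(a, b) -> Truth:
    """Rule (3): a relational operator on an UNDEFINED operand is UNKNOWN."""
    if a is None or b is None:
        return None
    return a > b


def equals(a, b) -> Truth:
    if a is None or b is None:
        return None
    return a == b


def not3(a: Truth) -> Truth:
    return None if a is None else not a


def and3(a: Truth, b: Truth) -> Truth:
    """Rule (4): FALSE decides the result; otherwise UNKNOWN propagates."""
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True


def or3(a: Truth, b: Truth) -> Truth:
    """Rule (4): TRUE decides the result; otherwise UNKNOWN propagates."""
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None
    return False


def matches(resource: dict) -> bool:
    """The condition "Author=Joe OR Size>10"; a missing property is UNDEFINED."""
    condition = or3(equals(resource.get("Author"), "Joe"),
                    greater(resource.get("Size"), 10))
    return condition is True  # assumption: UNKNOWN is not selected
```

Under these assumptions a resource with no Author is still returned
when its Size is greater than 10, and is left out when its Size is 10 or
less, because the condition is then UNKNOWN rather than TRUE.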

> (4) "3.4.2 Extensible Query Syntax
>      DASL extensions must support the extensible use of
>      alternate query syntax."
> 
> This is a really interesting topic. I don't have time to
> go into any depth here, but let me at least say this.
> It seems to me that any software related thing that
> is not dead gets enhanced over time, so you had better
> plan for it. Also, I believe you need to discover what
> capabilities are provided by the servers, and you need
> to deal gracefully with capabilities you don't understand.
> I would observe that three valued logic may be of use
> in this regard, since we found it to be useful in DMA.
> For example, if you design it right, you might be able to
> execute a query and return a meaningful result even if
> there is an operator the server doesn't understand.
> Of course, you need an invariant way to distinguish
> operators and group the operands that they apply to
> in a recursive, i.e., hierarchical, manner. The use
> of infix, postfix, or prefix notation and some type
> of bracketing or parentheses is one way to approach
> this. This overall organization of the syntax that
> allows parsing out operators and their operands
> can never be allowed to change.
> 
> (5) My intuition is that you probably want a simple
> but somewhat general capabilities mechanism to describe
> the query capabilities of a server.
> 
> Alan Babich
> 
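One way to picture the point in (4) about operators the server doesn't
understand: if every operator and its operands are bracketed the same
way, an unrecognized operator can simply evaluate to UNKNOWN. The sketch
below is an illustration under that assumption, not a proposed syntax.
It uses prefix notation with nested tuples, None as UNKNOWN, and a
made-up operator name "soundslike" that the evaluator does not
implement.

```python
UNKNOWN = None  # third truth value


def evaluate(expr, props):
    """Evaluate a prefix-notation query of the form (operator, operand, ...)."""
    op, *args = expr
    if op == "eq":
        name, value = args
        if name not in props:
            return UNKNOWN  # no value for the property
        return props[name] == value
    if op == "not":
        value = evaluate(args[0], props)
        return UNKNOWN if value is UNKNOWN else not value
    if op == "and":
        values = [evaluate(a, props) for a in args]
        if any(v is False for v in values):
            return False
        if any(v is UNKNOWN for v in values):
            return UNKNOWN
        return True
    if op == "or":
        values = [evaluate(a, props) for a in args]
        if any(v is True for v in values):
            return True
        if any(v is UNKNOWN for v in values):
            return UNKNOWN
        return False
    # Operator not understood. Its operands are still delimited by the
    # tuple, so it can be skipped as a unit and treated as UNKNOWN.
    return UNKNOWN


query = ("or",
         ("eq", "Author", "Joe"),
         ("soundslike", "Title", "webdav"))  # hypothetical extension operator
```

With this query, a resource whose Author is Joe still evaluates to
TRUE, while any other resource evaluates to UNKNOWN instead of failing
the whole request.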

Thanks,
Saveen
Received on Wednesday, 28 January 1998 13:25:59 UTC