- From: Saveen Reddy (Exchange) <saveenr@Exchange.Microsoft.com>
- Date: Wed, 28 Jan 1998 10:25:42 -0800
- To: "'www-webdav-dasl@w3.org'" <www-webdav-dasl@w3.org>
- Cc: "Babich, Alan" <ABabich@felix.filenet.com>
> -----Original Message-----
> From: Babich, Alan [mailto:ABabich@filenet.com]
> Sent: Tuesday, January 27, 1998 5:50 PM
> To: 'www-webdav-dasl@w3.org'
> Cc: Babich, Alan
> Subject: Comments on requirements
>
>
> (1) "3.2.3 Paged Search Results"
>
> My intuition on this topic is that this is probably not a good
> idea. Because there are typically dozens to thousands of clients
> per server, the bulk of the CPU power and disk space is on the
> client side of the network. That is where scrolling around in the
> answer set should happen. Putting it on the server makes
> the overall architecture inefficient by wasting precious
> server resources. I don't like inefficient architectures,
> especially in large distributed systems, because then
> lots of people suffer all at once.
>
> The client must be able to ask for the "next N results"
> to make one sequential pass through the answer set. The
> client side can cache the whole thing and allow the user
> to scroll around in it at will.
>
> What first occurs to me is to make
> the protocol as stateless on the server side as possible.
> If the server is stateless, server crashes can be made
> transparent to the client. NFS is that way, for example.
> When they take my server down and reboot it for backup,
> my diskless workstation hangs. When they are all done,
> my diskless workstation simply resumes without
> missing a beat.
>
> Another advantage of the server being stateless is that
> it simplifies things.
>

As you point out, paged search results have their own set of
complications. I do believe there are situations in which they are
desirable (but this leads to the eternal question of where to place the
"smarts": client or server?). Paged search results are not, in my opinion,
of enough value to be in a basic search protocol because of some of the
complications -- but in anticipation of them, I think the basic protocol
should merely make it possible to support them.
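
To make the "next N results" idea in point (1) concrete, here is a minimal Python sketch of one way a stateless server and a caching client could interact. Everything in it is an assumption for illustration: the names `search` and `fetch_all`, the `start`/`count` parameters, and the in-memory list standing in for the server's resources are hypothetical and do not come from any DASL draft. It also assumes the result order is stable between requests.

```python
def search(resources, predicate, start, count):
    """Hypothetical stateless server call.

    The server keeps no cursor between requests: it re-evaluates the
    query and returns the slice the client asked for.
    """
    hits = [r for r in resources if predicate(r)]
    return hits[start:start + count]


def fetch_all(resources, predicate, page_size):
    """Client side: one sequential pass, caching every result locally."""
    cache = []
    while True:
        # The client, not the server, remembers how far it has read.
        page = search(resources, predicate, len(cache), page_size)
        cache.extend(page)
        if len(page) < page_size:
            # A short (or empty) page means the answer set is exhausted.
            return cache
```

Because the position travels with each request, a server restart between two calls would be invisible to the client; scrolling then happens entirely in the client's cache.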

> (2) "3.3.1 Search Scope
> It must be possible for the client to specify a number of
> different, unrelated URIs over which the search is to range."
>
> I agree with subsequent comments in the e-mail thread
> that these "unrelated URIs" should actually be related
> by being on the same server. The client side could use
> multi-threading to query multiple different servers
> in parallel.
>
> (3) "3.1.1 Boolean Expressions
> It must be possible to use Boolean operators
> (AND, OR, NOT) in the search criteria."
>
> Yes. Boolean expressions are a good thing. Personally, I like
> them a lot. However, traditional Boolean logic has only two
> truth values. I like the simplicity of that, but when it comes to
> queries, that probably isn't enough truth values. You probably
> need three: TRUE, FALSE, and UNKNOWN. Furthermore, if you allow
> arithmetic expressions or string expressions, you will probably
> need to add another value for numbers, UNDEFINED, and another
> value for strings, UNDEFINED. Let me try to motivate this.
>
> Suppose you allowed arithmetic expressions in your query conditions.
> Suppose division was one of the arithmetic operators you
> defined. Then you might have some condition like "X/Y > 3",
> where X and Y are some properties. That's all well and good
> until you encounter a resource where Y = 0. The way out of this
> problem is to (1) let X/Y equal the UNDEFINED number you added
> to the set of all numbers, (2) define any arithmetic expression
> with a variable or constant equal to UNDEFINED to evaluate
> to UNDEFINED, (3) define the relational operators
> (>, >=, <, <=, =, !=) to evaluate to UNKNOWN if either operand
> is UNDEFINED, and (4) define the truth tables
> for AND, OR, and NOT such that the resulting truth
> value is UNKNOWN if one of the operands is not well defined
> AND it matters to the final outcome. Otherwise, the operand
> that is not well defined doesn't matter to the final outcome,
> and the truth value is what you would expect. For example,
> for OR (logical inclusive OR), in "X OR Y", if X is TRUE,
> you don't care whether Y is well defined or not; you
> consider the value of "X OR Y" to be TRUE. Similarly, if
> X were FALSE, the value of "X OR Y" is the value of Y,
> be it TRUE, FALSE, or UNKNOWN. Similarly, for "A AND B",
> if A is FALSE, the expression evaluates to FALSE regardless
> of the value of B. If A is TRUE, the expression has the
> value of B. For the NOT operator, NOT TRUE is FALSE,
> NOT FALSE is TRUE, and NOT UNKNOWN is UNKNOWN.
>
> You might think you avoid this problem if you don't allow
> arithmetic or string expressions, but that turns out not
> to be the case. You cannot escape the null value
> problem. Suppose you are querying a set of resources
> with the condition that "Author=Joe OR Size>10".
> What about the resources in the set of resources that
> have no value for the property "Author"? Are they
> included in the result set or not? With the scheme
> I have given above, the answer is clear.
>
> The above scheme is not new to anyone familiar with
> SQL or DMA. It is simply ANSI standard SQL three valued
> logic. I suggest that you probably want to use it.
> It solves the problems, it is standard, and it is widely
> adopted.
>
> I also suggest that you probably want operators like
> "A = NULL" or "A IS NULL" to provide clients with
> a tool to help deal with the null value problem more
> effectively.
>

Good point -- I'll add text about these scenarios.

> (4) "3.4.2 Extensible Query Syntax
> DASL extensions must support the extensible use of
> alternate query syntax."
>
> This is a really interesting topic. I don't have time to
> go into any depth here, but let me at least say this.
> It seems to me that any software related thing that
> is not dead gets enhanced over time, so you had better
> plan for it.
> Also, I believe you need to discover what
> capabilities are provided by the servers, and you need
> to deal gracefully with capabilities you don't understand.
> I would observe that three valued logic may be of use
> in this regard, since we found it to be useful in DMA.
> For example, if you design it right, you might be able to
> execute a query and return a meaningful result even if
> there is an operator the server doesn't understand.
> Of course, you need an invariant way to distinguish
> operators and group the operands that they apply to
> in a recursive, i.e., hierarchical, manner. The use
> of infix, postfix, or prefix notation and some type
> of bracketing or parentheses is one way to approach
> this. This overall organization of the syntax that
> allows parsing out operators and their operands
> can never be allowed to change.
>
> (5) My intuition is that you probably want a simple
> but somewhat general capabilities mechanism to describe
> the query capabilities of a server.
>
> Alan Babich
>

Thanks,
Saveen
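
As an illustration of the three valued logic described in point (3), the rules can be sketched in a few lines of Python. This is only a sketch under stated assumptions: the names `Truth`, `UNDEFINED`, `t_and`, `t_or`, `t_not`, `divide`, `compare`, and `author_or_size` are hypothetical, and `None` is assumed to stand in both for an undefined number and for a property that has no value. None of it is taken from SQL, DMA, or a DASL draft.

```python
import operator
from enum import Enum


class Truth(Enum):
    TRUE = "TRUE"
    FALSE = "FALSE"
    UNKNOWN = "UNKNOWN"


# Assumption: None represents UNDEFINED numbers/strings and missing properties.
UNDEFINED = None


def t_not(a):
    if a is Truth.UNKNOWN:
        return Truth.UNKNOWN
    return Truth.FALSE if a is Truth.TRUE else Truth.TRUE


def t_and(a, b):
    # FALSE decides the outcome no matter what the other operand is.
    if a is Truth.FALSE or b is Truth.FALSE:
        return Truth.FALSE
    if a is Truth.TRUE and b is Truth.TRUE:
        return Truth.TRUE
    return Truth.UNKNOWN


def t_or(a, b):
    # TRUE decides the outcome no matter what the other operand is.
    if a is Truth.TRUE or b is Truth.TRUE:
        return Truth.TRUE
    if a is Truth.FALSE and b is Truth.FALSE:
        return Truth.FALSE
    return Truth.UNKNOWN


def divide(x, y):
    # Division by zero, or any UNDEFINED operand, yields UNDEFINED.
    if x is UNDEFINED or y is UNDEFINED or y == 0:
        return UNDEFINED
    return x / y


def compare(op, x, y):
    # Relational operators yield UNKNOWN if either operand is UNDEFINED.
    if x is UNDEFINED or y is UNDEFINED:
        return Truth.UNKNOWN
    return Truth.TRUE if op(x, y) else Truth.FALSE


def author_or_size(resource):
    """Evaluate the example condition "Author=Joe OR Size>10"."""
    return t_or(
        compare(operator.eq, resource.get("Author"), "Joe"),
        compare(operator.gt, resource.get("Size"), 10),
    )
```

With these definitions, a resource with no Author and Size 20 evaluates to TRUE, because the second operand decides the OR. A resource with no Author and Size 5 evaluates to UNKNOWN. In SQL, a WHERE clause keeps only the rows for which the condition is TRUE, so the second resource would be left out of the result set.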
Received on Wednesday, 28 January 1998 13:25:59 UTC