- From: Babich, Alan <ABabich@filenet.com>
- Date: Fri, 13 Mar 1998 18:18:19 -0800
- To: "'Saveen Reddy (Exchange)'" <saveenr@Exchange.Microsoft.com>, "'www-webdav-dasl@w3.org'" <www-webdav-dasl@w3.org>
- Cc: "Babich, Alan" <ABabich@felix.filenet.com>
In this e-mail, I'd like to comment on just one part of the spec. I'll make other comments in other e-mails. I'm new to the notation and terminology, so please bear with me for a while until I get up to speed. I want to talk about the "where" component of a query. It seems to me that it could be a bit simpler and become more general in the process. First, if I understand what you've done, you've chosen prefix notation. I think that's great. There are only three approaches: prefix, infix, and postfix. I've put some thought into it, and I have convinced myself that prefix notation is the correct choice for DASL. Prefix notation allows operators that are associative and commutative (e.g., AND, addition, multiplication, etc.) to become n-ary operators, instead of being stuck as binary operators. And, you've taken advantage of this in the spec. But, more importantly, prefix notation enables the scope of an operator and its operands in the parse tree to be processed programmatically without knowing anything about the operator. (You only have to know about XML's recursive begin and end bracketing scheme.) This will be critical for applying three valued elimination when querying across multiple servers with different capabilities. You can "prune" a subtree of the "parse tree", because you know the exact extent of the parse tree that's affected by an operator. Since you've chosen prefix notation, we can forget about operator precedence (which is a great benefit). We can just define well known operators indefinitely. Here is the "where" part of the example in section 5.3: <d:where> <d:expr> <d:term> <d:gt> <d:value> <d:prop><d:getcontentlength/></d:prop> </d:value> <d:value> <d:literal>10000</d:literal> </d:value> </d:gt> </d:term> </d:expr> </d:where> This is what I propose we simplify it to: <d:where> <d:gt_integer> <d:prop_integer> <d:getcontentlength/> </d:prop_integer> <d:literal_integer>10000</d:literal_integer> </d:gt_integer> </d:where> All I've done is to eliminate the syntactic metavariables that we don't need, and introduced a bit stronger typing to simplify life for the programmer who has to write the program that digests this stuff. For example, I propose we would have gt_integer, gt_float, and gt_string relational operators that have integer, floating point, and string operands, respectively, instead of a single polymorphic gt operator that works with all types, and, worse, with mixtures of types (e.g., gt(7, "cow") ). That way, you don't have to define the promotions have to do, e.g., promoting integers to floating point, converting strings to integers, etc. The promotion stuff complicates life for the programmer writing the XML parsing stuff, and there is no corresponding benefit.) Each well known operator must be defined syntatically and semantically in the spec. Syntatically, we must specify the return value type. (First, we must specify the set of base types, of course. I'm talking about strings, integers, floating point, Booleans, and maybe datetimes, etc.) Then, we must specify the number of operands, and the type of each operand. We must also specify if the operator can safely be stntatically replaced by the "undefined" integer or the "undefined" string, or the UNKNOWN Boolean value using the rules of three valued elimination without knowing anything about what the operator is supposed to do. If the thing is not an operator, then, it is a literal or a property. I propose there be a pair of beginning and ending brackets for literals and properties of each base type, so you can tell from the syntax that whether the types are correct. Then, with my proposal, you can parse the "where" expression for correct syntax, check that the types of all operands are correct, and check that each operator has the appropriate number of operands, all without knowing anything about what the operator does. You just need the operator descriptions with the information I called out above. You can save the operator descriptions in an array of C struct's compiled into your program from just the information in the spec. There is no need for separators between the operands, and you have not proposed any, which I think is good. Because XML has begin and end brackets for everything, you know where each operand starts and ends, and when you have seen the last operator. In general, you have to do this recursively, but that's not a problem. Then, we add a sentence to the spec. that says that the first and only operator in the "where" brackets must return a Boolean result, i.e., it must be a Boolean operator (AND, OR, or NOT), or a relational operator (gt_integer, ge_integer, eq_integer, ne_integer, ls_integer, le_integer, gt_float, ge_float, eq_float, ne_float, ls_float, le_float, gt_string, ge_string, eq_string, ne_string, ls_string, le_string, etc.). I'm unfamiliar with the syntatic notation, so I'll take a stab at it in standard BNF notation. For section 5.6.2, we would have: Name: where Namespace: DAV: Purpose: Defines the search critera as a Boolean expression. Unfortunately, BNF uses angle brackets to enclose syntatic variables. So, in order to avoid confusion with XML's angle brackets, I'll use curly braces instead. The other syntatically special characters are "::=" ("is defined as") and "|" (separates alternatives. English explanations are given within double quoted strings. {where} ::= {expr} {expr} ::= {operator} {operand_list} {/operator} {operator} ::= {Boolean_operator} | {relational_operator} {Boolean_operator} ::= {AND} | {OR} | {NOT} {AND} ::= <DAV: AND> {OR} ::= <DAV: OR> {NOT} ::=<DAV: NOT> {relational_operator} ::= {integer_relational} | {float_relational} | {string_relational} | {content_relational} {integer_relational}::= <DAV: lt_integer> | <DAV: le_integer> | <DAV: eq_integer> | <DAV: ne_integer> | <DAV: ls_integer> | <DAV: le_integer {float_relational} ::= <DAV: gt_float> | <DAV: ge_float> | <DAV: eq_float> | <DAV: ne_float> | <DAV: ls_float> | <DAV: le_float> {string_relational}::= <DAV: gt_string> | <DAV: ge_string> | <DAV: eq_string> | <DAV: ne_string> | <DAV: ls_string> | <DAV: le_string> {content_relational} ::= <DAV: contains> {/operator} ::= "the same as {operator} but with a '/' after the '<' so that it is the XML ending bracket" {operand_list} ::= {empty} | {operand} | {operand_list} {operand} {empty} ::= {operand} ::= {expr} | {property} | {literal} {property} ::= {property_type} {property_value} {/property_type} {property_type} ::= <DAV:prop_integer> | <DAV:prop_float> | <DAV:prop_string> {/property_type} ::= "the same as {property_type} but with a '/' after '<' so that it is an ending XML bracket" {property_value} ::= {resource_property_name} | {intrinsic_name} {resource_property_name ::= {identifier} "this is a property of the resource, e.g., Author, Title, etc." {intrinsic_name} ::= <DAV: {identifier} /> "the spec. uses the example getcontentlength as an intrinsic" {literal} ::= {literal_type} {literal_value} {/literal_type} {/literal_type} ::= "the same as {literal_type}, but with a '/' after '<' so that it is an ending XML bracket" {literal_type} ::= <DAV: literal_integer> | <DAV: literal_float> | <DAV: literal_string> | <DAV: literal_Boolean> {literal_value} ::= {integer_literal} | {float_literal} | {string_literal} | {Boolean_literal} {integer_literal} ::= "signed or unsigned decimal integer" {float_literal} ::= "signed or unsigned floating pointer number. Contains decimal point or 'E' for exponent, or both." {string_literal} ::= "string enclosed in double quotes. A double quote within the string is represented as two adjacent quotes" {Boolean_literal} ::= TRUE | FALSE | UNKNOWN NOTE: Replacing a Boolean subtree by the Boolean literal UNKNOWN is done as necessary when performing three valued elimination. Additional restrictions not expressed in the above BNF: * The operator immediately after <DAV:where> must return a Boolean result. A resource is selected for inclusion in the result set if and only if the Boolean result for that particular resource is TRUE (i.e., the resource is not included if the result is FALSE or UNKNOWN). * The {AND} and {OR} operators take two or more operands. All the operands must be of type Boolean. The rules for {AND} are applied in the following order: (1) If any operand is FALSE, the result is FALSE. (2) If all operands are TRUE, the result is TRUE. (3) Otherwise, at least one operand has the value UNKNOWN, no operands are FALSE, and the value is UNKNOWN. The rules for {OR} are applied in the following order: (1) If any operand is TRUE, the result is TRUE. (2) If all operands are FALSE, the result is FALSE. (3) Otherwise, at least one operand has the value UNKNOWN, no operands are TRUE, and the value is UNKNOWN. * The {NOT} takes a single operand of type Boolean. The rules for {NOT} are NOT TRUE is FALSE, NOT FALSE is TRUE, and NOT UNKNOWN is UNKNOWN. * The {string_relational}, {integer_relational}, and {float_relational} relational operators take exactly two operands. The type of the two operands must match the type required by the relational operator. The relational operators return a Boolean result. If either or both operands of a relational operator are the undefined, then the result of the relational operator is the Boolean value UNKNOWN. Otherwise, the result is TRUE or FALSE depending upon the result of the comparison. * The {contains_relational} operator takes one or more string valued expressions as its operands. DISCUSSION: The operators above are just the minimal subset of the well known operators that must be implemented by all DASL implementations. These are already in the spec. I expect that there will be many more well known operators defined later. Note that this does not change the general scheme at all. Any operator can be accommodated now or later. If an implementation of an old version of the spec. encounters an operator from a newer version of the spec. that it doesn't have implemented, it can get the description of this operator at run time, and can then deal with it intelligently: (1) It can check that the operator has the proper number and type of operands, and that what takes it as an argument expects the type that the operator returns. (2) It can also find out if it can used three valued elimination to get rid of the operator. If the operator returns a string, a float, or an integer, you go up recursively to the first relational operator that is its parent, and replace the entire section of the where expression for its parent relational operator by the Boolean literal UNKNOWN. If the operator returns a Boolean, then if it is marked as "safe to eliminate", you replace its entire subsection of the where expression by UNKNOWN. Otherwise, it is marked as not safe to eliminate, and you fail the query. Alan Babich > -----Original Message----- > From: Saveen Reddy (Exchange) [SMTP:saveenr@Exchange.Microsoft.com] > Sent: March 11, 1998 1:22 PM > To: 'www-webdav-dasl@w3.org' > Subject: FW: draft-reddy-dasl-protocol-00.txt > > > I just submitted the "strawman" DASL protocol sketch as an ID. I've > included > it below for you now, and I really look forward to your comments. > > Thanks, > Saveen > > > > <<draft-reddy-dasl-protocol-00.txt>> > > > > <<draft-reddy-dasl-protocol-00.doc>> << File: > draft-reddy-dasl-protocol-00.txt >> << File: > draft-reddy-dasl-protocol-00.doc >>
Received on Friday, 13 March 1998 21:29:40 UTC