- From: Babich, Alan <ABabich@filenet.com>
- Date: Wed, 1 Jul 1998 16:45:21 -0700
- To: "'DASL'" <www-webdav-dasl@w3.org>
I don't think all the readers on the DASL mailing list are up to the same level of understanding of three valued logic and three valued elimination. Therefore, I have written this e-mail to explain what they are. Sorry, but that takes more than a few words. If you already fully understand three valued logic and three valued elimination, you can skip this e-mail (<180 lines). No proposals are discussed. No reply is necessary. THREE VALUED LOGIC Imagine you are the implementor of an ad hoc query program that runs against a DBMS of any type. Let's assume an RDBMS to make the exposition more concrete. Your query program obviously can not blow up no matter what row it is given to evaluate. Problem 1: You will discover that you have to deal with columns in the row under scan being null. For example, if the query condition is ( Author = "Smith" ), what would you do if the current row under scan does not have a value for the "Author" column (i.e., the Author column is null)? Problem 2: You will also discover that you have to deal with undefined expressions, e.g., division by zero, the square root or logarithm of a negative number, floating point number overflow, etc. For example, suppose the query condition is ( X / Y > Z ). What do you do if X, Y, Z are all non null, but the value of Y is zero? Obviously you can't take a divide by zero instruction fault and blow up your program. These are generic problems for any ad hoc query program. In particular, they are generic problems for ad hoc SQL query programs. So, the ANSI standard SQL committee has defined three valued logic to deal with such problems. If you have composed a SQL query, YOU HAVE USED three valued logic. Most people DON'T REALIZE THIS. Three valued logic is intuitive enough that it doesn't normally intrude. These are the rules of three valued logic: (1) For numbers: If a column in the arithmetic expression is NULL, its value is treated as undefined. If any undefined operation occurs (e.g., division by zero), the value of that operation is treated as undefined. If any operand of an arithmetic operator is undefined, the result is undefined. Therefore, if any subpart of an arithmetic expression is undefined, the entire arithmetic expression is treated as undefined. Note that "undefined" is a completely different concept from "null". For example, assuming X is not null, X/0 is not defined but is not null either. (2) For datatypes other than numbers, the same approach is taken as for numbers: Any undefined subexpression contaminates the whole expression and causes it to be undefined. (3) For the relational operators >, >=, <, <=, =, !=, if either or both operands are undefined, the result is the truth value UNKNOWN. Note that UNKNOWN is a completely different concept than undefined. UNKNOW is a well known truth value, and, is in fact, very precisely defined. (4) The logical operators, AND, OR, and NOT are evaluated as follows: The three truth values, TRUE, FALSE, and UNKNOWN. If any operand determines the value of the expression, then return that value, because you don't care about the values of the other operands. Otherwise, cancel the operands that don't matter, and the value is determined by what's left, if anything. Thus, for AND, if any operand is FALSE, you know that the value is FALSE. Otherwise, cancel the operands that are TRUE. If you cancel them all, the value is TRUE, Otherwise, the value is UNKNOWN (no operands are FALSE, and at least one operand is UNKNOWN). For OR, if any operand is TRUE, you know that the value is TRUE. Otherwise, cancel all operands that are FALSE. If they're all FALSE, the value is FALSE. Otherwise, the value is UNKNOWN (no operand is TRUE, and at least one operand is FALSE). For NOT, NOT TRUE is FALSE, NOT FALSE is TRUE, and NOT UNKNOWN is still UNKWNOWN. NOTE: SQL doesn't have a Boolean datatype. So, it doesn't really have literals for truth values. (Although you can test whether a Boolean expression "IS TRUE", "IS FALSE", or "IS UNDEFINED" in SQL 92). One way to materialize the Boolean truth value UNKNOWN in SQL is to choose a numeric column, X, and write ( X / 0 = X / 0 ). (5) The current row under scan is included in the result set if and only if the query condition evaluates to TRUE. (Thus, if the result is FALSE or UNKNOWN, the row under scan is not included in the result set.) THREE VALUED ELMINATION Suppose you are given the task of writing software to query across a set of disjoint heterogeneous collections. The collections support somewhat different sets of properties and operators. Yet, you wish to treat querying across the set of collections the same way as querying a single collection. In general, therefore, a query may be partially undefined relative to each collection in the set of collections. Let us supposed that the collections are picky -- they give a fatal error to a query if any property or operator is not defined in their metadata. Yet, you want queries on the set of collections to return reasonable results, and not run afoul of the errors returned by an individual collection if something is not defined relative to its metadata. What do you do? First, you advertise a merged set of metadata to represent the set of collections: You take the set union of the properties and the operators supported by the individual collections and advertise that to be the metadata of the set of collections. Second, when you are given a query that is valid against the merged metadata, you make a (possibly) massaged copy of the query and send it to each collection. (If the query is not valid against the merged metadata, you give a fatal error.) Then you merge the results, and present the merged results as a single set of results. In order to produce the massaged copies of the query, you use three valued elimination (which is an obvious extension of three valued logic) as follows. Suppose you have collections A and B. Collection A has properties Author and loan_number, and the query operators in the draft except for "contains". Collection B has properties Author and purchase_order_number, and all the query operators in the draft, including "contains". Then, the merged metadata is the properties Author, loan_number, and purchase_order_number, and all the query operators in the draft, including "contains". Suppose the query condition is Author = "Smith" OR loan_number = 12345678 Then the massaged query condition for collection A is Author = "Smith" OR loan_number = 12345678 and the massaged query condition for collection B is Author = "Smith" OR UNKNOWN Thus, three valued elimination eliminates the parts of the query that are undefined relative to a particular collection by replacing the undefined Boolean subexpressions by the truth value UNKNOWN. In this example, collection A will return all resources whose Author is "Smith" or whose loan_number is 12345678, and collection B will return all resources whose Author is "Smith", since it has no resources with the loan_number property. The two answer sets will be merged into one answer set. If the query condition were Author = "Smith" OR contains("blue moon") the query passed to collection A would be Author = "Smith" OR UNKNOWN and the query passed to collection B would be Author = "Smith" OR contains("blue moon") Alan Babich
Received on Wednesday, 1 July 1998 19:48:00 UTC