- From: Babich, Alan <ABabich@filenet.com>
- Date: Sun, 27 Sep 1998 21:11:14 -0700
- To: "'DASL'" <www-webdav-dasl@w3.org>
Heretofore, the discussion of "structured query" has been cast mostly in terms of querying XML documents. Conceptually, this is incorrect. The WebDAV property model is completely distinct from the concept of an XML document. XML is merely the vehicle chosen to represent the protocol on the wire. The WebDAV property model is defined by the WebDAV draft and explained by the e-mail Yaron sent out a while ago. The XML model is, of course, defined by the XML draft. The DASL effort is primarily a WebDAV effort, and is not an XML effort. DASL is not and should not be trying to define a way to query XML documents. Querying XML documents is the responsibility of an XML standards effort.

The confusion of the two models is quite natural and easy to understand, because there is such a good fit between them. In practice, what is useful for querying one would probably be fairly useful for querying the other. However, we should concentrate on the WebDAV model, and avoid confusing it with the XML model. Otherwise semantic disconnects result, e.g., about null value1's versus zero length string3's and/or empty XML elements. (You can say a string3 REPRESENTS a value1, but you can NOT say a string3 IS a value1. String1, string2, string3, value1, etc. are defined in one of my earlier e-mails.)

The interesting properties for this discussion are the hierarchical ones, e.g., locks of a resource. In order to gain insight into the problem, it is useful to consider common practice in related areas, e.g., SQL databases, Object Oriented databases, and C structures.

To model locks, one could imagine a C structure with a field that is an array of locks. The type of the lock array elements could be either another C structure or a C scalar. However, C structures aren't quite right, because the WebDAV property model doesn't have the concept of arrays.
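
As a rough illustration of the C structure described above, here is a minimal sketch. The type names, field names, and the bound MAX_LOCKS are hypothetical and are not taken from the WebDAV draft. Note that the lock field is a true array that can be indexed by position, which is exactly what the WebDAV property model does not offer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_LOCKS 4   /* hypothetical bound; an embedded C array needs a constant size */

/* Hypothetical lock record; the fields are illustrative only. */
struct lock {
    char owner[32];
    int  exclusive;   /* 1 = exclusive, 0 = shared */
};

/* Hypothetical resource with a multiply occurring "locks" field. */
struct resource {
    char        name[64];
    size_t      lock_count;         /* number of array slots in use */
    struct lock locks[MAX_LOCKS];   /* array field, addressable as locks[i] */
};

/* Append a lock. Returns 0 on success, -1 once the fixed bound is reached. */
int resource_add_lock(struct resource *r, const char *owner, int exclusive)
{
    assert(r != NULL && owner != NULL);
    if (r->lock_count >= MAX_LOCKS)
        return -1;
    struct lock *l = &r->locks[r->lock_count++];
    strncpy(l->owner, owner, sizeof l->owner - 1);
    l->owner[sizeof l->owner - 1] = '\0';   /* strncpy may not terminate */
    l->exclusive = exclusive;
    return 0;
}
```

The fifth call to resource_add_lock() on one resource fails in this sketch, which shows how the fixed bound limits a structure of this kind.
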
In WebDAV, there is no random access by array index to a multiply occurring hierarchical property such as locks of a resource. (String1's, as usual, are considered to be scalars, not arrays of characters.) Also, C structures require constant array bounds. You have to use a pointer to an array to get variable length arrays as fields of structures. Using pointers isn't ideal for WebDAV, because we don't want the concept of pointers goofing up the definition of the properties or getting confused with the indirect members of Advanced Collections.

SQL comes closer than C structures to modeling what we want. As far as I have been able to determine, SQL 92 is the latest ANSI, ISO, or IEC standard on SQL. However, that hasn't stopped vendors from shipping denormalized RDBMS's. I will draw on denormalized RDBMS's as an example.

In a denormalized RDBMS, a column can be an array or a table. We have already rejected arrays, so we must consider a column being a table. This turns out to work very well for WebDAV. The definition of the child table could be embedded in the definition of the parent table, or the child table could be defined as a top level table. This is exactly parallel to defining fields in C structures: The definition of a field of a C structure that is itself a structure can be embedded in the parent structure, or the definition of a top level C structure could be used as the type of the structure valued field.

SQL 92 can deal with this concept of master and child tables easily (it's done all the time) by adding an Object Instance ID (OIID) column to each table definition and joining the parent and child tables on OIID. In fact, that is effectively what is going on under the covers in the denormalized RDBMS. The difference is that the syntax is simpler in denormalized RDBMS's, since they hide the join between the parent table and the child table. In other words, denormalized RDBMS's save you writing, but typically have the same performance for nested tables.
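
To make the master/child join on OIID concrete, here is a minimal sketch in C rather than SQL. The row types, field names, and the function join_on_oiid() are hypothetical. The naive nested loop only illustrates the matching that an explicit SQL 92 join expresses; it is not how any particular RDBMS implements it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical master table row: one row per resource. */
struct master_row {
    int         oiid;   /* Object Instance ID, the join key */
    const char *name;
};

/* Hypothetical child table row: one row per lock. */
struct child_row {
    int         oiid;   /* OIID of the owning master row */
    const char *owner;
};

/* One row of the join result: a master row paired with a matching child row. */
struct joined_row {
    const char *name;
    const char *owner;
};

/* Join masters and children on OIID with a naive nested loop.
   Writes at most out_cap rows to out and returns the total number of
   matches, which may exceed out_cap. */
size_t join_on_oiid(const struct master_row *masters, size_t n_masters,
                    const struct child_row *children, size_t n_children,
                    struct joined_row *out, size_t out_cap)
{
    assert(out != NULL || out_cap == 0);
    size_t n = 0;
    for (size_t i = 0; i < n_masters; i++) {
        for (size_t j = 0; j < n_children; j++) {
            if (masters[i].oiid != children[j].oiid)
                continue;               /* not a child of this master row */
            if (n < out_cap) {
                out[n].name  = masters[i].name;
                out[n].owner = children[j].owner;
            }
            n++;
        }
    }
    return n;
}
```

A child row whose OIID matches no master row simply drops out of the result, as it would in an inner join.
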
This nested table model is a good fit to the WebDAV property model, because neither has the concept of ordering of the nested property elements or of random access to the elements by array index.

The most natural syntax to refer to a nested hierarchical property is the way the C language does it for nested structures: "A.B.C". Mapping this to "nested tables", the dots represent joins between tables. These joins are explicit in SQL 92 and implicit in denormalized RDBMS's.

Object oriented databases fit well with navigation. A dot notation is natural to them: A.B.C.

We could go one step further, as some denormalized RDBMS's have, and save writing by allowing a cast of all the constants for a row. For example, suppose the master table is M; the child table column is C; the name of the table definition for C is CDEF; and the child table has two fields, F1 and F2, an integer and a string1. In extended SQL we could write

    M.C.F1=3 AND M.C.F2="blue"

or we could write

    M.C = CDEF(3, "blue")

The cast approach is valid syntax in one or more shipping denormalized RDBMS's.

The above supplies all the preliminary motivation based on common practice that I found necessary to devise a way to query WebDAV hierarchical properties in DASL. I will only propose the equivalent of the dot syntax -- I will NOT propose the "cast" operator for constants. My actual proposal is extremely short (13 lines), and will be in a follow up e-mail.

Alan Babich
Received on Monday, 28 September 1998 00:12:24 UTC