Querying WebDAV Hierarchical Properties -- Motivation from Babich, Alan on 1998-09-28 (www-webdav-dasl@w3.org from July to September 1998)

From: Babich, Alan <ABabich@filenet.com>
Date: Sun, 27 Sep 1998 21:11:14 -0700
To: "'DASL'" <www-webdav-dasl@w3.org>
Message-ID: <C3AF5E329E21D2119C4C00805F6FF58F04AF21@hq-expo2.filenet.com>

Heretofore, the discussion of "structured query"
has been cast mostly in terms of querying XML 
documents. Conceptually, this is incorrect. 
Conceptually, the WebDAV property model is 
completely distinct from the concept of an XML 
document. XML is merely the vehicle
chosen to represent the protocol on the wire. The 
WebDAV property model is defined by the WebDAV draft 
and explained by the e-mail Yaron sent out a while ago. 
The XML model is, of course, defined by the XML draft.

The DASL effort is primarily a WebDAV effort, and is
not an XML effort. DASL is not and should not be trying 
to define a way to query XML documents. Querying XML 
documents is the responsibility of an XML standards 
effort.

The confusion of the two models is quite natural 
and easy to understand, because there is such a 
good fit between the two models.
In practice, what is useful for querying one would 
probably be fairly useful for querying the other.
However, we should concentrate on the WebDAV model,
and avoid confusing it with the XML model. Semantic
disconnects otherwise result, e.g., about null 
value1's versus zero length string3's and/or empty 
XML elements. (You can say a string3 REPRESENTS
a value1, but you can NOT say a string3 IS a value1.
String1, string2, string3, value1, etc. are
defined in one of my earlier e-mails.)

The interesting properties for this discussion are 
the hierarchical ones, e.g., locks of a resource.
In order to gain insight into the problem, it is
useful to consider common practice for related
areas, e.g., SQL databases, Object Oriented databases,
and C structures.

To model locks, one could imagine a C structure
with a field that was an array of locks. The type
of the lock array elements could either be another
C structure, or a C scalar. However,
C structures aren't quite right, because the WebDAV
property model doesn't have the concept of arrays.
In WebDAV, there is no random access by array index 
to a multiply occurring hierarchical property such 
as locks of a resource. (String1's, as usual, are 
considered to be scalars, not arrays of characters.) And, 
C structures require constant array bounds. You have 
to use a pointer to an array to get variable length
arrays as fields of structures. Using pointers isn't 
ideal for WebDAV, because we don't want the concept 
of pointers goofing up the definition of the properties
or getting confused with the indirect members of
Advanced Collections.

SQL comes closer than C structures to modeling what we want. 
As far as I have been able to determine, SQL 92 is the
latest ANSI, ISO, or IEC standard on SQL. However, that 
hasn't stopped vendors from shipping denormalized RDBMS's. 
I will draw on denormalized RDBMS's as an example.

In a denormalized RDBMS, a column can be an array or a table.
We have already rejected arrays, so we must consider
a column being a table. This turns out to work very
well for WebDAV. The definition of the child table could be
embedded in the definition of the parent table, or
the child table could be defined as a top level table.
This is exactly parallel to defining fields in C structures:
The definition of a field of a C structure that is itself 
a structure can be embedded in the parent structure,
or the definition of a top level C structure could be
used as the type of the structure valued field.

SQL 92 can deal with this concept of master and child 
tables easily (it's done all the time) by adding an Object
Instance ID column (OIID) to each table definition and
joining the parent and child tables on OIID. In fact,
that is effectively what is going on under the covers
in the denormalized RDBMS. The difference is that the
syntax is simpler in denormalized RDBMS's, since they hide 
the join between the parent table and the child table. 
In other words, denormalized RDBMS's save you writing, 
but typically have the same performance for nested tables.
A dot notation is natural to them: A.B.C .
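
To make the explicit join concrete, here is a minimal sketch using Python's
sqlite3 module. The table names (resources, locks) and every column other
than oiid are hypothetical and chosen only for illustration; nothing here
is taken from a WebDAV or DASL draft.

```python
import sqlite3

# Hypothetical schema: the parent table holds resources, the child table
# holds their locks. Both carry an OIID column, and the join is on OIID.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE resources (oiid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE locks (oiid INTEGER, owner TEXT, scope TEXT);
    INSERT INTO resources VALUES
        (1, 'index.html'), (2, 'notes.txt'), (3, 'empty.txt');
    INSERT INTO locks VALUES
        (1, 'alice', 'exclusive'),
        (2, 'bob',   'shared'),
        (2, 'alice', 'shared');
""")

# The join is written out explicitly here. A denormalized RDBMS would hide
# it behind a dotted reference such as resources.locks.owner = 'alice'.
rows = conn.execute("""
    SELECT DISTINCT r.name
    FROM resources AS r
    JOIN locks AS l ON l.oiid = r.oiid
    WHERE l.owner = 'alice'
    ORDER BY r.name
""").fetchall()

locked_by_alice = [name for (name,) in rows]
print(locked_by_alice)
```

The resource with no locks never appears in the result, and no lock is
addressed by position: the child rows are reached only through the join.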

This nested table model is a good fit to the WebDAV
property model, because neither has the concept
of ordering of the nested property elements or of random
access to the elements by array index.

The most natural syntax to refer to a nested hierarchical
property is the way the C language does it for
nested structures: "A.B.C". Mapping this to "nested tables", 
the dots represent joins between tables. These joins are 
explicit in SQL 92 and implicit in denormalized RDBMS's.
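
As an illustration of a dot standing for a join, the following sketch
resolves a dotted name over nested tables held as plain Python lists of
dicts. The helper names (resolve, any_equals), the sample data, and the
"any nested row matches" comparison rule are all assumptions made for this
example; this message does not specify comparison semantics.

```python
def resolve(tables, path):
    """Return every scalar value reached by a dotted path such as "A.B.C".

    `tables` maps a top-level table name to a list of rows (dicts). A column
    that is itself a table is a list of rows. Row order is never relied on
    and no row is addressed by index. The path needs at least two names.
    """
    first, *middle, last = path.split(".")
    rows = list(tables[first])
    for name in middle:
        # Each dot acts like a join from the parent rows to their child rows.
        rows = [child for row in rows for child in row.get(name, [])]
    return [row[last] for row in rows if last in row]


def any_equals(tables, path, constant):
    """Assumed rule: the comparison holds if any reached value matches."""
    return any(value == constant for value in resolve(tables, path))


# Hypothetical data: one resource carrying two locks.
tables = {
    "resource": [
        {
            "name": "index.html",
            "lock": [
                {"owner": "alice", "scope": "exclusive"},
                {"owner": "bob", "scope": "shared"},
            ],
        }
    ]
}

owners = sorted(resolve(tables, "resource.lock.owner"))
print(owners)
print(any_equals(tables, "resource.lock.owner", "alice"))
```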

Object oriented databases fit well with navigation.
A dot notation is natural to them: A.B.C .

We could go one step further as some denormalized
RDBMS's have, and save writing by allowing a cast
of all the constants for a row. For example, suppose
the master table is M; the child table column
is C; the name of the table definition for C is
CDEF; and the child table has two fields, F1 and F2, an 
integer and a string1. In extended SQL we could write
    M.C.F1=3 AND M.C.F2="blue"
or, we could write
    M.C = CDEF(3, "blue")
The cast approach is valid syntax in one or more
shipping denormalized RDBMS's.
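
The two spellings can be compared in a small Python sketch that reuses the
names M, C, CDEF, F1 and F2 from the example above. The "id" field, the
sample rows, and the reading that both field tests must hold on the same
child row are assumptions of this sketch, not something any particular
product is known to guarantee.

```python
from collections import namedtuple

# CDEF plays the role of the child table definition: F1 integer, F2 string.
CDEF = namedtuple("CDEF", ["F1", "F2"])

# Hypothetical master table M whose column C is a nested table of CDEF rows.
M = [
    {"id": 1, "C": [CDEF(3, "blue"), CDEF(5, "red")]},
    {"id": 2, "C": [CDEF(3, "red"), CDEF(7, "blue")]},
]


def match_fieldwise(row):
    # M.C.F1 = 3 AND M.C.F2 = "blue", with both tests applied to the
    # same child row (an assumption of this sketch).
    return any(c.F1 == 3 and c.F2 == "blue" for c in row["C"])


def match_cast(row):
    # M.C = CDEF(3, "blue"): compare a whole child row against one constant.
    return any(c == CDEF(3, "blue") for c in row["C"])


fieldwise_ids = [row["id"] for row in M if match_fieldwise(row)]
cast_ids = [row["id"] for row in M if match_cast(row)]
print(fieldwise_ids, cast_ids)
```

Master row 2 is the interesting case: it has F1 = 3 in one child row and
F2 = "blue" in another, so it matches neither form under the same-row
reading. If each field test were allowed to match a different child row,
the field-by-field form would accept it and the cast form would not.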

The above supplies all the preliminary motivation based on
common practice that I found necessary to devise a way to 
query WebDAV hierarchical properties in DASL. I will only 
propose the equivalent of the dot syntax -- I will NOT 
propose the "cast" operator for constants.

My actual proposal is extremely short (13 lines),
and will be in a follow up e-mail.

Alan Babich
Received on Monday, 28 September 1998 00:12:24 UTC