Re: Closure for XML Schemas

Thanks for the reply. I just had a few remarks for your perusal.
I should accept that I have studied regular tree languages, and various
subclasses of it in some detail.  RELAX can express any regular tree
language, whereas XML Schema can express onlya particular subclass.

On 8 Dec 2000, Henry S. Thompson wrote:

> > I had another question now -- will there be a definition of closure
> > properties under boolean set operations, such as union, intersection and
> > difference for XML Schema? I believe, XML Schema will not be closed under
> > union and difference. It should not be surprising that most query
> > operations also will not be closed under XML Schema. I believe one of the
> > usage scenarios of XML Schema is for query?
> 
> It's clear that this isn't possible without a major change to the XML
> Schema design, which depends on the Unique Attribution constraint in a
> number of ways.  This means that the union of two types may be
> undefined, e.g.
> 
> (a?,b*) U (b*,c)

I am not exactly sure what is the big holdup against accepting full
fledged regular tree grammars. As far as the specification goes, the only
sections to be modified, I believe are Sections 3.7 and 5.7 in the
Structures part.

I believe annotation, keys etc can be added on top of regular tree
grammars.

If you consider implementations, I believe there are two main operations
given a document and a schema -> 
a) determine whether the document is valid with respect to the schema
b) determine the type for each element in the document.

For a regular tree grammar, the document validity can be checked in linear
time using SAX.
The type assignment can be done in linear time using DOM for a regular
tree grammar. Also we can have algorithms without backtracking.

I believe Henry Thompson has an implementation for XML Schema, others 
have other implementations. RELAX also has multiple implementations.
I am sure Henry Thompson is much more aware of the implementation issues.

> > And databases do not work without closure. Also, I do not think we can
> > obtain closure by restrictions on XML Schema. XML Schema is not closed
> > because it is much restricted and less expressive than, say RELAX.
> 
> That restriction buys both implementation simplicity and expressive
> power in other areas (annotation, keys, etc.) which were judged very
> important.

I believe the most important question is will we be able to support both?
Let me just illustrate a simple example ->
(I shall use a grammar notation for ease, which I believe most people will
be familiar with).
Consider a document with two types of books, say Book1 and Book2
and you have element declaration for book which can take either Book1 type
or Book2 type.
Suppose Book1 type has content Author1* which represent authors with only
sons.
Suppose Book2 type has content Author2* which represent authors with only
daughters.

Consider a simple query (XPath Expression) such as 
doc ("...")//book

Can you represent the result type which is
SetOfBooks -> set (Book1* + Book2*) in any way?
Another view someone might want could be (this is even more difficult)
SetOfBooks -> set (Book1*, Book2*)

From where I stand, it appears that DBs simply will not work without
closure.

In this context, I would also like to ask whether the WG is truly 
convinced that closure is something which cannot be provided? I personally
am not.

If we speak in ideal terms, "I would ideally like to see the mathematical
properties of RELAX incorporated as part of XML Schema. But these are 
ideal goals I would like, and for which I am willing to put effort, if I
can see a direction in which I can go."

> > Is this something which XML Schema will consider?, in such a case, I find
> > it better that for some purposes, we use a closed schema language. I
> > believe RELAX will be closed under all boolean set operations as well as
> > query operations (from the properties of regular tree languages). 
> > Therefore will it be possible for the Schema WG to combine forces with
> > RELAX, and define something when we require such closure properties. From
> > where I stand, it appears that RELAX is not being given any consideration,
> > which I believe it deserves.
> 
> We considered this issue long and hard, but came to a different
> conclusion than you evidently have.
>
> > I would appreciate the views of the Schema WG on this.
> 
> What you've got is my personal views:  I don't speak for the WG except 
> when empowered by explicit WG decisions.

Received on Friday, 8 December 2000 13:52:49 UTC