Re: [xml-dev] Two Questions - on XML Schema from noah_mendelsohn@us.ibm.com on 2006-03-08 (xmlschema-dev@w3.org from March 2006)

From: <noah_mendelsohn@us.ibm.com>
Date: Wed, 8 Mar 2006 09:22:50 -0500
To: Rick Jelliffe <rjelliffe@allette.com.au>
Cc: xml-dev@lists.xml.org, xmlschema-dev@w3.org
Message-ID: <OFE1FE220B.D7A75891-ON8525712B.004E3DDA-8525712B.004EFF44@lotus.com>
Rick Jelliffe writes:

> I am out of the XSD loop nowadays, 

Indeed, and I for one miss having your direct involvement.

> but I don't expect you will see *any* significant evolution of XSD, 
> apart from conceptual corrections and low-hanging fruit, or to align it 
> with commercially significant new specifications such as XQuery.  You 
> can look at http://www.w3.org/TR for material on XML Schemas 1.1 to see 
> the kinds of changes that are being looked at. 

Well, the workgroup is seriously considering support for co-constraints, 
with designs in the spirit of a Schematron subset very much among the 
leading candidates.  Extending <all> groups to allow maxOccurs>1 is 
also under consideration.

> xsd:all is a good compromise that is better than nothing:
> 
>  * it could be regarded as syntactical sugar for a very large content 
> model, so didn't require any thought about its theoretical impact
> 
>  * the assumption was that people would implement everything using an 
> FSM (the rewriting technique used by TREX/RELAX NG wasn't well known) or 
> that the obvious optimization for xsd:all might be put in
> 
>  * allowing unconstrained cardinality would easily lead to combinatorial 
> explosion in the FSM

The current thinking, at least among many WG members, is that <all> groups 
in their current form are not usually implemented with the sort of FSMs 
that would exhibit combinatorial explosion.  Given that <all> groups are 
relatively separate from <sequence> and <choice>, there are 
implementations in which simple counters can be used to ensure that the 
number of elements seen meets the constraint. 
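
To illustrate the counter approach mentioned above, here is a minimal sketch in Python. It is not taken from any actual schema processor: the function name validate_all_group and the (min, max) table are hypothetical, and the maximum of 3 in the usage example reflects the maxOccurs>1 extension under discussion, not what Schema 1.0 allows.

```python
from collections import Counter

def validate_all_group(children, particles):
    """Check child element names against an <all>-style group.

    children:  element names in document order (order is irrelevant here).
    particles: hypothetical table mapping name -> (min_occurs, max_occurs).
    """
    counts = Counter()
    for name in children:
        if name not in particles:
            return False  # element is not declared in the group
        counts[name] += 1
        if counts[name] > particles[name][1]:
            return False  # exceeded the maximum, fail early
    # Every declared element must have reached its minimum count.
    return all(counts[name] >= lo for name, (lo, _hi) in particles.items())

particles = {"a": (1, 1), "b": (0, 1), "c": (1, 3)}
print(validate_all_group(["c", "a", "c"], particles))  # True
print(validate_all_group(["a", "a", "c"], particles))  # False
```

The state kept is one counter per declared element, so it grows linearly with the size of the group, whereas an FSM that tracks which subset of elements has been seen needs a state per subset.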

There is also discussion of more general support, in which <all> can be 
mixed with <sequence> and <choice>;  in that case, we do have to be very 
careful to understand the impact on the more general FSM and on 
restriction and extension.  For those reasons, I personally am nervous 
about proposing to mix <all> with other model groups, though we have had 
requests from some users.  Support for maxOccurs>1 may well make it into 
the working drafts for Schema 1.1.

Now, if you're speculating on what industry reaction to such a proposal 
would be, that's a different question.  We certainly have had repeated 
requests from users for such function.
 
>  * something like SGML's & operator or TREX/RELAX NG's interleave 
> operator was not in any of the source schema languages that largely 
> determined XSD from the outset: I think some working group members had 
> a feeling that the WG would never finish if it deviated from that 
> input set too much (as far as structures are concerned.)
> 
> > 2) What is the rationale for disallowing choices between 
> > attributes in XML Schema? Please note that this scenario has had 
> > repercussions on other specifications as well. [I am referring to WSDL 
> > specifications that have this requirement, but have ended up with a 
> > looser definition for entities due to this restriction].
> >   A simple example is the "type" and "element" attribute information 
> > items on the "part" element. These attributes are mutually exclusive, 
> > but since there are no options in XML Schema to capture this scenario, 
> > they have solved this by making both of them as minOccurs="0". 
> 
> When XSD started, there was I think no schema language* that allowed 
> these kinds of co-occurrence constraints. The usefulness of such a 
> feature was well outside the experience of people in the WG from a 
> non-document background.  (The same was true of the RELAX breakaways 
> too: it was not obvious to people that co-occurrence constraints fitted 
> in to grammars.)
> 
> The problem is that even if you extend grammars to cope with attributes, 
> you still don't get enough. There is no reason why all the important 
> relations and constraints in some arbitrary database graph can be 
> modelled (or expressed) well or thoroughly using a regular grammar, even 
> a grammar extended with attributes.  Given this fundamental limitation 
> in grammars, there will always be the need for some extra layer, even if 
> a slight one, to express these other constraints.
> 
> Indeed, I fully expect that ultimately XSD will be used for datatyping 
> and basic containment relationships only (i.e. "data dictionary" or 
> "basic ontologies"):  people will find it simpler and better layered to 
> handle order, occurrence, co-occurrence and link constraints using 
> paths, which are better suited for this. The lack of co-occurrence 
> constraints in XSD is only a problem because 1) You need to express 
> those constraints, 2) XSD doesn't support them, and 3) You don't have in 
> place a simple layer that can support them.

As I noted above, there are serious discussions underway right now about 
including XPath-based co-occurrence constraints in Schema 1.1.  As with 
the current use of Schematron in appinfo, these would be additional 
constraints:  to be valid, content would have to satisfy both the content 
model grammar and the XPath-based constraints.  There are several 
proposals as to exactly how the constraints would be expressed.  The ones 
I believe closest to Schematron involve XPath predicates that would have 
to resolve as true/false for the content to be valid per the type.  There 
are also proposals from Fabio Vitali to use such predicates in selecting a 
type.  So, no guarantee that anything will be proposed, but there is 
certainly a chance.  We get requests for this function almost daily.
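
As a rough illustration of such a true/false constraint, the sketch below applies the WSDL "part" example quoted earlier: "type" and "element" must not both be present. It is only an assumption about how a layered check might look, not any proposed Schema 1.1 syntax. The XPath predicate appears only as a comment; the check itself is plain Python over xml.etree.ElementTree, and the function names and sample document are made up for the example (namespaces are ignored for brevity).

```python
import xml.etree.ElementTree as ET

def violates_exclusive(part):
    # Plain-Python equivalent of the XPath test: @type and @element
    return "type" in part.attrib and "element" in part.attrib

def check_parts(xml_text):
    """Return the names of <part> elements that carry both attributes."""
    root = ET.fromstring(xml_text)
    # root.iter("part") visits every <part> in the tree (no namespaces assumed).
    return [p.get("name") for p in root.iter("part") if violates_exclusive(p)]

doc = """<message name="m">
  <part name="ok" type="xs:string"/>
  <part name="bad" type="xs:string" element="tns:item"/>
</message>"""
print(check_parts(doc))  # ['bad']
```

A check of this kind would run in addition to ordinary content-model validation: the grammar still declares both attributes as optional, and the extra layer rejects instances where both appear.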

--------------------------------------
Noah Mendelsohn 
IBM Corporation
One Rogers Street
Cambridge, MA 02142
1-617-693-4036
--------------------------------------
Received on Wednesday, 8 March 2006 14:23:21 UTC