
Non-deterministic or wrong partition?

From: Ashok Malhotra <ashokma@microsoft.com>
Date: Mon, 3 Jun 2002 14:44:58 -0700
Message-ID: <E5B814702B65CB4DA51644580E4853FB019EECB6@red-msg-12.redmond.corp.microsoft.com>
To: "XML Schema Comments" <www-xml-schema-comments@w3.org>
Despite the fact that multiple non-normative portions of the spec make
it clear that ambiguous content-models were not intended to be allowed,
the language used in the normative parts of xml-schema (structures)
allows for ambiguous/non-deterministic content-models. 

The 'Unique Particle Attribution' constraint combined with section 3.8.4
_would_ require non-ambiguous content-models, were it not for the
interaction of minOccurs/maxOccurs with these rules.  

There are two relatively simple fixes that I can think of:
1) make it clear that minOccurs/maxOccurs are just syntactic sugar, and
that the 'Unique Particle Attribution' constraint should not be impacted
by this sugar.
2) change section 3.8.4 to indicate that the partitioning of the content
must be possible based only on the current position in the content-model
and the name of the next element.  (I.e. make it explicit that ambiguity
is not allowed.)
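
To illustrate fix 1, here is a minimal sketch that removes the minOccurs/maxOccurs sugar from the content model in the example schema and then looks for positions where two particles with the same name compete for the next element. It is an illustration under stated assumptions, not how any particular schema processor works: the helper names (expand, candidates, competing) are hypothetical, and bounded ranges are flattened naively, which is enough for this schema but would over-report for a particle such as a{1,3}.

```python
def expand(name, min_occurs, max_occurs):
    """Rewrite one particle as copies that occur exactly once, optionally,
    or any number of times. max_occurs=None stands for "unbounded"."""
    required = [(name, 1, 1)] * min_occurs
    if max_occurs is None:
        return required + [(name, 0, None)]  # trailing a*
    # Naive flattening of a bounded range (assumption, see lead-in).
    return required + [(name, 0, 1)] * (max_occurs - min_occurs)


# Inner sequence: a{2,unbounded} followed by b{0,1}.
inner = expand("a", 2, None) + expand("b", 0, 1)
# Outer sequence has minOccurs = maxOccurs = 2, so it is simply repeated.
model = inner * 2


def candidates(model, i):
    """Indices of particles that may match the element following particle i."""
    found = [i] if model[i][2] is None else []  # an unbounded particle can repeat
    for j in range(i + 1, len(model)):
        found.append(j)
        if model[j][1] >= 1:  # a required particle blocks everything after it
            break
    return found


def competing(model):
    """Positions after which two particles with the same name are both possible."""
    conflicts = []
    for i in range(len(model)):
        names = [model[j][0] for j in candidates(model, i)]
        dup = sorted({n for n in names if names.count(n) > 1})
        if dup:
            conflicts.append((i, dup))
    return conflicts


print(competing(model))  # [(1, ['a']), (2, ['a'])]
```

Index 1 is the second required a of the first iteration: after it, the next a could go to the trailing a* or to the first a of the second iteration, which is exactly the competition that the constraint would flag once the sugar is expanded.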


p.s. here is an example:
--- t.xsd
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" >
  <xs:element name="root">
    <xs:complexType>
      <xs:sequence minOccurs="2" maxOccurs="2">
        <xs:element name="a" minOccurs="2" maxOccurs="unbounded"/>
        <xs:element name="b" minOccurs="0"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>

---- t.xml
<?xml version="1.0"?>
<root><a/><a/><a/><a/></root>

---- t2.xml
<?xml version="1.0"?>
<root><a/><a/><a/><b/><a/><a/></root>

Both t.xml and t2.xml are valid according to the content-model, and in
both cases there is unique particle attribution, but upon having parsed
the 2nd <a/> and encountering the 3rd <a/>, it is impossible to know
whether to start a new sub-sequence or to continue the current one.  For
t.xml, it is necessary to start a new sub-sequence at that point, and
for t2.xml it is necessary to hold off.
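
The partitioning problem can be checked mechanically. The following is a minimal sketch, assuming the content model of root can be approximated by the regular expression (a{2,}b?){2} over the child element names; the function name partitions is hypothetical. It enumerates every way of splitting the children into exactly two iterations of the inner sequence. The second input is a six-child sequence a a a b a a, used because the second iteration needs at least two <a/> of its own.

```python
import re

# One pass through the inner sequence: two or more <a/>, then an optional <b/>.
ITERATION = re.compile(r"a{2,}b?")


def partitions(children):
    """Return every split of `children` into exactly two iterations.

    `children` is the child element names concatenated, e.g. "aaaa".
    """
    return [
        (children[:i], children[i:])
        for i in range(len(children) + 1)
        if ITERATION.fullmatch(children[:i]) and ITERATION.fullmatch(children[i:])
    ]


print(partitions("aaaa"))    # [('aa', 'aa')]
print(partitions("aaabaa"))  # [('aaab', 'aa')]
```

Each input has exactly one valid partition, so attribution is unique, yet both share the prefix a a followed by a third a. In the first case the third a opens the second iteration, and in the second case it continues the first, so one element of lookahead is not enough to decide.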


All the best, Ashok 
===========================================================
Ashok Malhotra              <mailto: ashokma@microsoft.com> 
Microsoft Corporation
212 Hessian Hills Road
Croton-On-Hudson, NY 10520 USA 
Redmond: 425-703-9462                New York: 914-271-6477 
Received on Monday, 3 June 2002 17:46:24 GMT
