[Bug 2148] R-157: Question about non-deterministic content models

http://www.w3.org/Bugs/Public/show_bug.cgi?id=2148

           Summary: R-157: Question about non-deterministic content models
           Product: XML Schema
           Version: 1.0
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: normal
          Priority: P2
         Component: XSD Part 1: Structures
        AssignedTo: ht@w3.org
        ReportedBy: sandygao@ca.ibm.com
         QAContact: www-xml-schema-comments@w3.org


Despite the fact that multiple non-normative portions of the spec make it clear 
that ambiguous content-models were not intended to be allowed, the language 
used in the normative parts of xml-schema (structures) allows for ambiguous/non-
deterministic content-models. 

The 'Unique Particle Attribution' constraint combined with section 3.8.4 
_would_ require non-ambiguous content-models, were it not for the interaction 
of minOccurs/maxOccurs with these rules. 

There are 2 relatively simple fixes that I can think of:

1. Make it clear that minOccurs/maxOccurs are just syntactic sugar, and that 
   the 'Unique Particle Attribution' constraint should not be impacted by this 
   sugar. 
2. Change section 3.8.4 to indicate that the partitioning of the content must 
   be possible based only on the current position in the content-model and the 
   name of the next element. (I.e. make it explicit that ambiguity is not 
   allowed.) 

Here is an example:

--- t.xsd
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" >
  <xs:element name="root">
    <xs:complexType>
      <xs:sequence minOccurs="2" maxOccurs="2">
        <xs:element name="a" minOccurs="2" maxOccurs = "unbounded"/>
        <xs:element name="b" minOccurs="0"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
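
As a rough illustration of the first suggested fix applied to this schema, the 
sketch below writes the occurrence bounds out by hand as 
(a, a, a*, b?, a, a, a*, b?) and reports every particle after which two 
different particles with the same element name could match the next element. 
The hand expansion, the particle labels and the function names are assumptions 
made for illustration; none of them come from the spec.

```python
# Hand-expanded particles: (label, element name, nullable, repeatable).
# This expansion of minOccurs/maxOccurs is an assumption for illustration.
EXPANDED = [
    ("a1", "a", False, False),
    ("a2", "a", False, False),
    ("a3*", "a", True, True),
    ("b1?", "b", True, False),
    ("a4", "a", False, False),
    ("a5", "a", False, False),
    ("a6*", "a", True, True),
    ("b2?", "b", True, False),
]


def successors(i):
    """Indices of the particles that may match the element right after particle i."""
    result = []
    if EXPANDED[i][3]:  # a repeatable particle may match again
        result.append(i)
    for j in range(i + 1, len(EXPANDED)):
        result.append(j)
        if not EXPANDED[j][2]:  # a required particle cannot be skipped
            break
    return result


def conflicts():
    """Return (after, name, competing labels) wherever attribution is not unique."""
    found = []
    for i, particle in enumerate(EXPANDED):
        by_name = {}
        for j in successors(i):
            label, name = EXPANDED[j][0], EXPANDED[j][1]
            by_name.setdefault(name, []).append(label)
        for name, labels in by_name.items():
            if len(labels) > 1:
                found.append((particle[0], name, labels))
    return found


for after, name, labels in conflicts():
    print(f"after {after}: next <{name}/> could match any of {labels}")
```

Under this reading, the second required a of the first pass is followed by two 
competing a particles (the trailing a* of the first pass and the first a of the 
second pass), which is the kind of conflict the constraint is meant to reject.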

---- t.xml
<?xml version="1.0"?>
<root><a/><a/><a/><a/></root>


---- t2.xml
<?xml version="1.0"?>
<root><a/><a/><a/><b/><a/><a/></root>

Both t.xml and t2.xml are valid according to the content-model, and in both 
cases there is unique particle attribution, but after parsing the 2nd <a/> and 
encountering the 3rd <a/>, it is impossible to know whether to start a new 
sub-sequence or to continue the current one. For t.xml, it is necessary to 
start a new sub-sequence at that point, and for t2.xml it is necessary to hold 
off. 
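
The lookahead problem can be sketched in a few lines of Python. Each pass of 
the sequence is modelled as the regular expression a{2,}b? over the child 
element names, and the function lists every position where the second pass 
could begin. The function name and the string encoding of the children are 
assumptions for illustration. The second string is also an assumption: a 
t2.xml-style instance with two <a/> after the <b/>, so that the second pass 
meets minOccurs="2".

```python
import re

# One pass of the xs:sequence: at least two <a/>, then an optional <b/>.
ONE_PASS = re.compile(r"a{2,}b?")


def split_points(children):
    """Return every index at which the second pass of the sequence can begin."""
    return [
        i
        for i in range(len(children) + 1)
        if ONE_PASS.fullmatch(children[:i]) and ONE_PASS.fullmatch(children[i:])
    ]


t_children = "aaaa"     # children of <root> in t.xml
t2_children = "aaabaa"  # assumed t2.xml-style children: a a a b a a

# Both instances start with the same three <a/> elements...
print(t_children[:3] == t2_children[:3])  # True
# ...yet the second pass must start at a different place in each.
print(split_points(t_children))   # [2]
print(split_points(t2_children))  # [4]
```

Each instance has exactly one valid partition, but the two partitions disagree 
about the 3rd <a/>: in the first it opens the second pass, in the second it 
still belongs to the first pass. A validator that sees only the current 
position and the next element name cannot tell the two cases apart.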

See the following for more info:
http://lists.w3.org/Archives/Public/www-xml-schema-comments/2002AprJun/0129.html

Received on Monday, 12 September 2005 16:46:59 UTC