Proposed addition to XML Schema 1.1: N from sequence

My apologies for the long message; this is a complex issue.

In a number of DTDs, the following structure is used to indicate that one or
both elements are to be included.

  <!ELEMENT example (content1 | content2 | (content1, content2))>

This structure is just as clumsy, if not more so, in an XML Schema
representation:

  <xsd:element name="example">
    <xsd:complexType>
      <xsd:choice>
        <xsd:sequence>
          <xsd:element ref="content1"/>
          <xsd:element ref="content2" minOccurs="0"/>
        </xsd:sequence>
        <xsd:element ref="content2"/>
      </xsd:choice>
    </xsd:complexType>
  </xsd:element>

If the same constraint is required for three or more elements (for example,
at least one of four elements), the complexity of the structure increases
greatly.  I'll leave that as an exercise for the reader.
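
For what it's worth, one way the four-element case might be written is
sketched below (content model only, enclosing declarations omitted).  The
names content3 and content4 are assumed here purely for illustration, and
this is only one possible expansion, not necessarily the most compact:

  <xsd:choice>
    <!-- content1 present; anything after it is optional -->
    <xsd:sequence>
      <xsd:element ref="content1"/>
      <xsd:element ref="content2" minOccurs="0"/>
      <xsd:element ref="content3" minOccurs="0"/>
      <xsd:element ref="content4" minOccurs="0"/>
    </xsd:sequence>
    <!-- content1 absent, content2 present -->
    <xsd:sequence>
      <xsd:element ref="content2"/>
      <xsd:element ref="content3" minOccurs="0"/>
      <xsd:element ref="content4" minOccurs="0"/>
    </xsd:sequence>
    <!-- content1 and content2 absent, content3 present -->
    <xsd:sequence>
      <xsd:element ref="content3"/>
      <xsd:element ref="content4" minOccurs="0"/>
    </xsd:sequence>
    <!-- only content4 present -->
    <xsd:element ref="content4"/>
  </xsd:choice>

Each branch is keyed on the first element present, so four elements already
need ten element particles spread over four branches.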

Note that the effect may be approximated in three ways that I am aware of,
none of them exact:
	- by specifying a sequence with minOccurs="0" on each element; however,
this does not prevent an instance with no content elements at all.
	- by specifying a choice with maxOccurs="2"; however, this allows
either content element to be repeated.
	- by using substitution groups, which may provide some approximation of
the intended effect, but not without compromising the integrity of the
schema being described.
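
As a rough sketch, the first two approximations might look like this
(content models only, with the enclosing element declaration omitted):

  <!-- Approximation 1: both optional; also accepts an empty <example/> -->
  <xsd:sequence>
    <xsd:element ref="content1" minOccurs="0"/>
    <xsd:element ref="content2" minOccurs="0"/>
  </xsd:sequence>

  <!-- Approximation 2: one or two picks; also accepts content1 twice,
       content2 twice, or content2 followed by content1 -->
  <xsd:choice maxOccurs="2">
    <xsd:element ref="content1"/>
    <xsd:element ref="content2"/>
  </xsd:choice>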

The rest of this message may be ignored if this constraint can be expressed
simply in some other fashion.

The proposal is therefore to extend the sequence so that a similar
structure can be defined without the need for complex nesting.  Two
attributes (the names chosen arbitrarily) are added to the sequence type:
minCount and maxCount, which indicate the minimum and maximum number of
child particles, respectively, that may appear in an instance.


 <xs:complexType name="explicitSequence">
  <xs:annotation>
   <xs:documentation>
   This is the extended type for a sequence</xs:documentation>
  </xs:annotation>
  <xs:complexContent>
   <xs:extension base="xs:explicitGroup">
    <xs:attribute name="minCount" default="1" type="xs:nonNegativeInteger"/>
    <xs:attribute name="maxCount" default="unbounded" type="xs:allNNI"/>
   </xs:extension>
  </xs:complexContent>
 </xs:complexType>

 <xs:complexType name="simpleExplicitSequence">
  <xs:annotation>
   <xs:documentation>
   This is the extended type for a sequence</xs:documentation>
  </xs:annotation>
  <xs:complexContent>
   <xs:extension base="xs:simpleExplicitGroup">
    <xs:attribute name="minCount" default="1" type="xs:nonNegativeInteger"/>
    <xs:attribute name="maxCount" default="unbounded" type="xs:allNNI"/>
   </xs:extension>
  </xs:complexContent>
 </xs:complexType>


When instantiating a sequence, the number of particles present in the
instance document must be between minCount and maxCount, inclusive.

There are two alternatives for how to count particles in this context:
counting every particle instance, or counting all instances of a particle
as a single count.  In my interpretation, if the maxOccurs value for a
particle is greater than one, it should still be considered as a single
particle for the purposes of the count.  For example, the following schema
and instance fragments are valid:


  <xs:element name="example">
    <xs:complexType>
      <xs:sequence minCount="1" maxCount="2">
        <xs:element ref="content1"/>
        <xs:element ref="content2" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <example>
    <content1/>
    <content2/>
    <content2/>
  </example>
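
By the same reading, and still using the proposed (not existing) minCount
and maxCount attributes, the "at least one of four elements" case mentioned
earlier might reduce to something like the following content model.  The
names content3 and content4 are made up for illustration:

  <!-- Hypothetical syntax: minCount/maxCount are the attributes proposed
       in this message and are not part of XML Schema -->
  <xs:sequence minCount="1" maxCount="4">
    <xs:element ref="content1"/>
    <xs:element ref="content2"/>
    <xs:element ref="content3"/>
    <xs:element ref="content4"/>
  </xs:sequence>

Any non-empty, in-order subset of the four elements would then be valid,
while an empty <example/> would not.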


The only possible concern here is the ambiguity introduced by repeated
particles:

  <xs:element name="example">
    <xs:complexType>
      <xs:sequence minCount="1" maxCount="2">
        <xs:element ref="content1"/>
        <xs:element ref="content2"/>
        <xs:element ref="content2"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

I would suggest that the above schema be considered badly formed.

The interpretation of a minOccurs of "0" is somewhat arbitrary, given that
the construct is formed so that any element may be removed.  Indeed, it
might be considered that this value is the default for any particle included
within a sequence where the value of maxCount is less than the number of
particles.

Note that the value "unbounded" for maxCount is not particularly nice;
a value such as "all" might be preferable.

Note also that the derivation of simpleExplicitSequence from
simpleExplicitGroup could equally be from explicitSequence.

I also understand that the counting method described may complicate the
particle constraints (section 3.9.6,
http://www.w3.org/TR/2004/WD-xmlschema11-1-20040716/#coss-particle),
particularly 2.2.2.2.2 "The particle within which this <sequence> appears is
itself among the {particles} of a <sequence>."  Additional derivation rules
will probably be required as well.

Regards,
Martin

----
Martin Thomson
Product Architect
Nortel 

[ Address : Nortel Building, Northfields Avenue, University of Wollongong,
NSW, Australia 2500 ] 
[ Tel : +61 2 4254 7515 ] [ Fax : +61 2 4224 2801 ]
[ Email : martin.thomson@nortelnetworks.com ] 

Received on Tuesday, 25 January 2005 17:46:19 UTC