
Re: Mixed content and simpleType

From: Michael Anderson <michael@research.canon.com.au>
Date: Tue, 16 Jan 2001 15:50:12 +1100
Message-ID: <3A63D304.EE3671CD@research.canon.com.au>
To: "Henry S. Thompson" <ht@cogsci.ed.ac.uk>, Bruno Chatel <bcha@chadocs.com>
Cc: xmlschema-dev@w3.org

"Henry S. Thompson" wrote:

> "Bruno Chatel" <bcha@chadocs.com> writes:
>
> > Hello,
> >
> > If i  use mixed content in a Schema :
> > <xsd:complexType name="cType" mixed="true">
> >   <xsd:choice  minOccurs="0" maxOccurs="unbounded">
> >      <xsd:element ref="e1"/>
> >      <xsd:element ref="e2"/>
> >    </xsd:choice>
> >  </xsd:complexType>
> >
> > It means that I may have some content and
> > elements e1 and/or e2  in my element of type cType.
> > It seems that there is no more informations on the
> > content type.
> >
> > Is there a way to design that the content in cType has
> > a type (simpleType) ?
>
> No.  There has been discussion about this, but a lot of uncertainty
> about exactly what character sequence the type would apply to.
> Consider this case:
>
> <wrap><e2>T</e2>his is <e1>a</e1>
> forced example.
> </wrap>
>
> There are at least three possible stories about what a simpleType
> would apply to:
>
>   1) 'This is a forced example. '
>   2) 'his is '; ' forced example. '
>   3) 'T'; 'his is '; 'a'; ' forced example. '
>
> Which did _you_ have in mind?

Why is there uncertainty around these three possible stories?  I would have
thought in the above example that the character sequences contained by <e2> and
<e1> must conform to the simpleTypes defined by <xsd:element ref="e2"/> and
<xsd:element ref="e1"/> respectively.  This leaves only story (2).  The only
uncertainty I see (possibly naively) is:
Case 1) 'his is '; 'forced example. '
Case 2) 'his is forced example. '
Of these two cases, Case 1 (Henry's 2) is what _I_ had in mind, but this is
probably because I deal with SAX, which passes me the character sequences in two
chunks.  I also use DOM, which allows the children of an Element to be other
Elements, Text and/or Comment Nodes, so I preserve the Text character sequences
as two separate Nodes.


I must admit that I thought one could do this by extending a simpleType using
complexContent, i.e. similar to the internationalPrice element in the Primer,
except using complexContent instead of simpleContent:

<xsd:element name="internationalPrice">
  <xsd:complexType>
    <xsd:complexContent>
      <xsd:extension base="xsd:decimal">          <!-- Note this line -->
        <xsd:sequence>
          <xsd:element ref="DateToUseForConversion" />
        </xsd:sequence>
        <xsd:attribute name="currency" type="xsd:string"/>
      </xsd:extension>
    </xsd:complexContent>
  </xsd:complexType>
</xsd:element>

But after the previous emails, I investigated and found this in the Structures spec:

"Schema Representation Constraint: Complex Type Definition Representation OK
1.1 If the complexContent alternative is chosen, the type definition resolved to
by the normalized value of the base [attribute] must be a complex type
definition;"
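
For contrast with the complexContent attempt, this is the simpleContent form of internationalPrice as the Primer gives it. It is shown here only to make the difference visible: with simpleContent the base may be a simple type such as xsd:decimal, but the extension can then add attributes only, not child elements. The xsd prefix is assumed to be bound to the XML Schema namespace, as elsewhere in this thread.

```xml
<xsd:element name="internationalPrice">
  <xsd:complexType>
    <!-- simpleContent: character data typed as xsd:decimal, no child elements -->
    <xsd:simpleContent>
      <xsd:extension base="xsd:decimal">
        <xsd:attribute name="currency" type="xsd:string"/>
      </xsd:extension>
    </xsd:simpleContent>
  </xsd:complexType>
</xsd:element>
```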

But I can get around this by changing the base in my complexContent example
above to "myNS:myDecimal", where myDecimal is a complex type definition:
<xsd:complexType name="myDecimal">
  <xsd:simpleContent>
    <xsd:extension base="xsd:decimal"/>
  </xsd:simpleContent>
</xsd:complexType>


 This is a fudge around the problem, but it gains you little, as processors
assume this can't be done anyway.  XML Spy successfully validates an instance
in which the character sequence in <internationalPrice> is non-decimal.  And XSV
is fine with the schema, but crashes on the instance regardless of whether the
character sequence in <internationalPrice> is decimal or not (so I assume it
doesn't like the dodgy content model created by extension).

So I suspect that this is just a loophole in the specs (or I've missed it
somewhere), and the above mentioned clause should probably add:
1.1 If the complexContent alternative is chosen, the type definition of all
ancestors resolved to by the normalized value of the base [attribute] must be
a complex type definition.

This problem of needing to check ancestors also confuses me in areas like
final.  In this case the specification does state that if the block [attribute]
= "extension" then this "prevents further derivations by extension" [Section
3.4 Structures].  But later [Section 4.3.3 Structures], the {final} property
is a set corresponding "to the normalized value of the block [attribute]", with
no mention of ancestor complexTypes.  So if I extend a complexType that
restricts a complexType which has block="extension", is this allowed?
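
The scenario in question can be sketched as three types. This is only an illustration: the type names A, B and C and the elements x and y are hypothetical, the namespace URI is the one from the published Recommendation rather than the draft current in this thread, and the sketch uses the final attribute (the one that populates {final} in the published Recommendation), whereas the text above speaks of block. The sketch does not settle whether C is allowed; it just makes the question concrete.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- A declares that it must not be derived from by extension -->
  <xsd:complexType name="A" final="extension">
    <xsd:sequence>
      <xsd:element name="x" type="xsd:string" minOccurs="0"/>
    </xsd:sequence>
  </xsd:complexType>

  <!-- B restricts A and carries no final attribute of its own -->
  <xsd:complexType name="B">
    <xsd:complexContent>
      <xsd:restriction base="A">
        <xsd:sequence>
          <xsd:element name="x" type="xsd:string"/>
        </xsd:sequence>
      </xsd:restriction>
    </xsd:complexContent>
  </xsd:complexType>

  <!-- C extends B: does the setting on ancestor A reach this far? -->
  <xsd:complexType name="C">
    <xsd:complexContent>
      <xsd:extension base="B">
        <xsd:sequence>
          <xsd:element name="y" type="xsd:string"/>
        </xsd:sequence>
      </xsd:extension>
    </xsd:complexContent>
  </xsd:complexType>

</xsd:schema>
```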

Lastly, returning to the text case, a mate of mine pointed out that allowing
the text content to be constrained by a simpleType would require further
constraints to deal with the mixed content concept, i.e. how many
element-separated text sequences are we allowed?  Can each sequence conform to
a different simpleType?  Is there some order in which these sequences must
appear?  The answer is that these constraints are already written in the
specifications when one wraps the text sequences into elements.
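
A minimal sketch of that wrapping, applied to Henry's forced example. The wrapper element names lead and rest, the placeholder types given to e1 and e2, and the choice of xsd:string are all assumptions made for illustration; the namespace URI is the one from the published Recommendation. Once each text run is an element, the sequence fixes the order, minOccurs/maxOccurs would fix how many runs are allowed, and each run can name its own simpleType.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- Placeholder declarations for the elements from the original question -->
  <xsd:element name="e1" type="xsd:string"/>
  <xsd:element name="e2" type="xsd:string"/>

  <xsd:element name="wrap">
    <!-- No mixed="true": all character data now lives inside child elements -->
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element ref="e2"/>
        <!-- first text run, with its own simple type -->
        <xsd:element name="lead" type="xsd:string"/>
        <xsd:element ref="e1"/>
        <!-- second text run; could use a different simple type -->
        <xsd:element name="rest" type="xsd:string"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>

</xsd:schema>
```

An instance corresponding to the forced example would then look like this:

```xml
<wrap><e2>T</e2><lead>his is </lead><e1>a</e1><rest> forced example.</rest></wrap>
```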

mick.
Received on Monday, 15 January 2001 23:50:31 GMT
