Re: Re xsd:string not validly derived

From: Jeni Tennison <jeni@jenitennison.com>
Date: Sun, 14 Jul 2002 12:34:04 +0100
Message-ID: <43690589555.20020714123404@jenitennison.com>
To: xmlschema-dev@w3.org, "Jaikrishnan Pillai" <jaikrishnan.pillai@vordel.com>

Hi JK,

> Well, I was writing a schema for the state to be either an integer
> or a string... But as a matter of fact I came down to checking how
> to make it work with string itself before moving on to add the
> different cases. In fact my original schema was like
>
> <xsd:simpleType>
>   <xsd:restriction>
>    <xsd:simpleType>
>      <xsd:union memberTypes='xsd:integer xsd:string'/>
>    </xsd:simpleType>
>    <xsd:pattern value='1'/>
>    <xsd:pattern value='2'/>
>    <xsd:pattern value='(A|ZK)'/>
>   </xsd:restriction>
>  </xsd:simpleType>

You should use xsd:enumeration rather than patterns that simply
enumerate the possible values. In other words, this would be better
as:

  <xsd:simpleType>
    <xsd:restriction>
      <xsd:simpleType>
        <xsd:union memberTypes="xsd:integer xsd:string" />
      </xsd:simpleType>
      <xsd:enumeration value="1" />
      <xsd:enumeration value="2" />
      <xsd:enumeration value="A" />
      <xsd:enumeration value="ZK" />
    </xsd:restriction>
  </xsd:simpleType>

The reason this is better is that it's easier for programs such as
authoring tools to pull out the possible values of the simple type and
use them to assist authoring (e.g. to supply a drop-down list that
enables the writer to choose one of '1', '2', 'A' or 'ZK'). Also, for
the validator, I imagine it's slightly easier (quicker) to test
equality between values (which you can do with an enumeration) than it
is to test against a regular expression (which you have to do if you
specify the options as a pattern).
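
To make the difference concrete, here is a small sketch of how
instance values would fare against the enumerated type, assuming it
is the type of the state element. The "01" line reflects my reading
that enumerations on a union are compared as values rather than as
strings, so treat it as something to check with your validator
rather than a guarantee:

  <state>1</state>    <!-- valid: one of the enumerated values -->
  <state>ZK</state>   <!-- valid: one of the enumerated values -->
  <state>B</state>    <!-- invalid: not an enumerated value -->
  <state>01</state>   <!-- probably valid with xsd:enumeration (it is
                           the integer 1), but invalid with the
                           pattern '1' -->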

> But then I had to add the SOAP encoding attribute too and that is
> when I couldn't figure out a way to do it...

You were doing that fine in the example schema you sent. Since the
element needs an attribute, it has to be a complex type; since it has
only text as its content, it has to have simple content, so you want
to extend the simple type that you already have to add an attribute to
it:
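
  <!-- xsd:extension needs a base attribute and cannot hold a nested
       simple type, so the restricted union is given a name at the
       top level; the name "state-values" is only illustrative -->
  <xsd:simpleType name="state-values">
    <xsd:restriction>
      <xsd:simpleType>
        <xsd:union memberTypes="xsd:integer xsd:string" />
      </xsd:simpleType>
      <xsd:enumeration value="1" />
      <xsd:enumeration value="2" />
      <xsd:enumeration value="A" />
      <xsd:enumeration value="ZK" />
    </xsd:restriction>
  </xsd:simpleType>

  <!-- the complex type adds the attribute to that simple content -->
  <xsd:complexType>
    <xsd:simpleContent>
      <xsd:extension base="state-values">
        <xsd:attribute ref="soapEnv:encodingStyle" />
      </xsd:extension>
    </xsd:simpleContent>
  </xsd:complexType>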

But if you do this, you cannot say that the state element in the
instance is an "xsd:string" because it's not -- it's a complex type (an
anonymous one since you've nested the definition within the element
declaration for the state element). As I said before, I don't know why
you're trying to use xsi:type to say that the state element's content
is an xsd:string anyway -- the validator knows that already by looking
at the value that it contains and the type declared for the element in
the schema.
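
For what it's worth, an instance that matches the complex type above
needs no xsi:type at all. A minimal sketch, assuming the soapEnv
prefix is bound to the SOAP 1.1 envelope namespace and using the
SOAP 1.1 encoding URI as the attribute value (both are assumptions,
since the instance document isn't shown here):

  <state xmlns:soapEnv="http://schemas.xmlsoap.org/soap/envelope/"
         soapEnv:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">ZK</state>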

If you want to use xsi:type within the instance document to indicate
whether the content of the state element is a string or an integer,
then I think you're out of luck. The only route that I can see would
be to create a complex type hierarchy with a base type that allows
both strings and integers and then create two derived types that allow
only integers and only strings respectively. For example:

<xsd:simpleType name="string-values">
  <xsd:restriction base="xsd:string">
    <xsd:enumeration value="A" />
    <xsd:enumeration value="ZK" />
  </xsd:restriction>
</xsd:simpleType>

<xsd:simpleType name="integer-values">
  <xsd:restriction base="xsd:integer">
    <xsd:enumeration value="1" />
    <xsd:enumeration value="2" />
  </xsd:restriction>
</xsd:simpleType>

<xsd:simpleType name="string-or-integer-values">
  <xsd:union memberTypes="integer-values string-values" />
</xsd:simpleType>

<xsd:complexType name="string-or-integer-with-encodingStyle">
  <xsd:simpleContent>
    <xsd:extension base="string-or-integer-values">
      <xsd:attribute ref="soapEnv:encodingStyle" />
    </xsd:extension>
  </xsd:simpleContent>
</xsd:complexType>

<xsd:complexType name="string-with-encodingStyle">
  <xsd:simpleContent>
    <xsd:restriction base="string-or-integer-with-encodingStyle">
      <xsd:simpleType>
        <xsd:restriction base="string-values" />
      </xsd:simpleType>
    </xsd:restriction>
  </xsd:simpleContent>
</xsd:complexType>

<xsd:complexType name="integer-with-encodingStyle">
  <xsd:simpleContent>
    <xsd:restriction base="string-or-integer-with-encodingStyle">
      <xsd:simpleType>
        <xsd:restriction base="integer-values" />
      </xsd:simpleType>
    </xsd:restriction>
  </xsd:simpleContent>
</xsd:complexType>

You could then say that your state element was of the type
"string-with-encodingStyle" or "integer-with-encodingStyle".

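As a sketch of how that would look (assuming the types sit in no
target namespace, so they can be referenced without a prefix, and
that xsi is bound to the usual schema-instance namespace), the
element declaration and an instance might be:

<xsd:element name="state"
             type="string-or-integer-with-encodingStyle" />

<state xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:type="integer-with-encodingStyle">1</state>
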
However, I don't think that the above is a legal XML Schema. It passes
XSV and Xerces-J validation; Xerces-C++ thinks the schema is OK but
complains if the state element has an encodingStyle attribute; MSXML
rejects the schema saying the "integer-with-encodingStyle" isn't a
valid restriction of "string-or-integer-with-encodingStyle". Reading
through the XML Schema Structures Rec, I think that MSXML is correct
-- for integer-with-encodingStyle to be validly derived from
string-or-integer-with-encodingStyle, the atomic simple type
definition it contains (an unrestricting restriction of
integer-values) has to be a valid restriction of the union
string-or-integer-values simple type, and atomic simple type
definitions can only be derived by restriction from other atomic (or
built-in primitive) types (according to
http://www.w3.org/TR/xmlschema-1/#cos-st-restricts).

There are ways round it, I think, but none that I can think of are
particularly pleasant or retain the correct semantics. Perhaps someone
else can come up with something that I've missed.
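
For illustration, here is one such unpleasant route, offered only as
a sketch: base everything on xsd:string so that each derived type is
a plain facet restriction of the base's content. The type names are
hypothetical, and the integer case then holds only strings that
happen to look like integers, which is the kind of lost semantics
mentioned above:

<xsd:complexType name="any-string-with-encodingStyle">
  <xsd:simpleContent>
    <xsd:extension base="xsd:string">
      <xsd:attribute ref="soapEnv:encodingStyle" />
    </xsd:extension>
  </xsd:simpleContent>
</xsd:complexType>

<!-- content restricted with facets; the attribute is inherited -->
<xsd:complexType name="integer-like-with-encodingStyle">
  <xsd:simpleContent>
    <xsd:restriction base="any-string-with-encodingStyle">
      <xsd:enumeration value="1" />
      <xsd:enumeration value="2" />
    </xsd:restriction>
  </xsd:simpleContent>
</xsd:complexType>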

Cheers,

Jeni

---
Jeni Tennison
http://www.jenitennison.com/
Received on Sunday, 14 July 2002 07:34:11 UTC