Re: The problem with XML Schema definition

Hi Valeri,

> Hopefully anyone helps me by this "trivial" problem.
> I have two kinds of possible xml documents:
>
> First---------
>
> <callResult type="java.util.Hashtable"
>     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>     xsi:noNamespaceSchemaLocation="http://host/schema/newSession.xsd">
>      <result type="java.util.Hashtable">
>         <STATUS>OK</STATUS>
>         <Key1>123</Key1>
>         <Key2>234</Key2>
>         <Key3>324</Key3>
>         <Key4>256</Key4>
>      </result>
>  </callResult>
>
> Second-------
>
> <callResult type="java.util.Hashtable"
>     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>     xsi:noNamespaceSchemaLocation="http://host/schema/newSession.xsd">
>      <result type="java.util.Hashtable">
>         <STATUS>INIT</STATUS>
>         <ERROR>200</ERROR>
>      </result>
> </callResult>
>
> Now the newSession.xsd schema should describe these two kinds of
> xml. Could you help me giving some good ideas?

Well, at a basic level you've got a callResult element with a type
attribute containing a result element with a type attribute, which
contains a STATUS element followed by either an ERROR element or a
sequence of Key1 to Key4 elements. You could use:

<xs:element name="callResult">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="result">
        <xs:complexType>
          <xs:sequence>
            <xs:element name="STATUS">
              <xs:simpleType>
                <xs:restriction base="xs:token">
                  <xs:enumeration value="OK" />
                  <xs:enumeration value="INIT" />
                </xs:restriction>
              </xs:simpleType>
            </xs:element>
            <xs:choice>
              <xs:element name="ERROR" type="xs:unsignedInt" />
              <xs:sequence>
                <xs:element name="Key1" type="xs:unsignedInt" />
                <xs:element name="Key2" type="xs:unsignedInt" />
                <xs:element name="Key3" type="xs:unsignedInt" />
                <xs:element name="Key4" type="xs:unsignedInt" />
              </xs:sequence>
            </xs:choice>
          </xs:sequence>
          <xs:attribute name="type" fixed="java.util.Hashtable" />
        </xs:complexType>
      </xs:element>
    </xs:sequence>
    <xs:attribute name="type" fixed="java.util.Hashtable" />
  </xs:complexType>
</xs:element>

That's the best that you can do with XML Schema.

There are a couple of other constraints that I suspect that you want
to be able to articulate. I suspect that you want to say that if the
value of the STATUS element is OK, then you should have the KeyN
elements, but if the value of the STATUS element is INIT, then you
should have the ERROR element. This is an example of a co-occurrence
constraint, and in general XML Schema doesn't support co-occurrence
constraints. They are supported in RELAX NG or in Schematron. In
Schematron, for example, you could use the pattern:

<sch:pattern>
  <sch:rule context="result">
    <sch:report test="STATUS = 'INIT' and not(ERROR)">
      If STATUS is 'INIT' then there should be an ERROR element.
    </sch:report>
    <sch:report test="STATUS = 'OK' and ERROR">
      If STATUS is 'OK' then there should not be an ERROR element.
    </sch:report>
  </sch:rule>
</sch:pattern>
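
For comparison, here is a rough sketch of how the same co-occurrence
constraint might look in RELAX NG (XML syntax). It assumes the fixed
Key1 to Key4 structure and the unsignedInt values from the XML Schema
above, and it makes the type attributes required, which is my
assumption rather than something you've specified. Each branch of the
choice pairs a STATUS value with the content that goes with it:

```xml
<element name="callResult"
         xmlns="http://relaxng.org/ns/structure/1.0"
         datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
  <attribute name="type"><value>java.util.Hashtable</value></attribute>
  <!-- RELAX NG gives xsi:* attributes no special treatment,
       so the schema location hint has to be allowed explicitly -->
  <optional>
    <attribute name="noNamespaceSchemaLocation"
               ns="http://www.w3.org/2001/XMLSchema-instance"/>
  </optional>
  <element name="result">
    <attribute name="type"><value>java.util.Hashtable</value></attribute>
    <choice>
      <!-- STATUS is OK: the four Key elements must follow -->
      <group>
        <element name="STATUS"><value>OK</value></element>
        <element name="Key1"><data type="unsignedInt"/></element>
        <element name="Key2"><data type="unsignedInt"/></element>
        <element name="Key3"><data type="unsignedInt"/></element>
        <element name="Key4"><data type="unsignedInt"/></element>
      </group>
      <!-- STATUS is INIT: an ERROR element must follow -->
      <group>
        <element name="STATUS"><value>INIT</value></element>
        <element name="ERROR"><data type="unsignedInt"/></element>
      </group>
    </choice>
  </element>
</element>
```

XML Schema won't accept the equivalent content model, because the two
STATUS declarations in the same content model would have different
types.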

The other thing is that I think you probably want to allow the result
element to contain any number of KeyN elements, rather than just four.
To support that in XML Schema, I think your only course of action is
to allow the result element to hold any kind of element, and then test
that (when STATUS is 'OK') all those elements have names that start
with 'Key' and end in a number, something that you can only do in
Schematron. The XML Schema would look like:

<xs:element name="callResult">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="result">
        <xs:complexType>
          <xs:sequence>
            <xs:any processContents="lax"
                    minOccurs="2" maxOccurs="unbounded" />
          </xs:sequence>
          <xs:attribute name="type" fixed="java.util.Hashtable" />
        </xs:complexType>
      </xs:element>
    </xs:sequence>
    <xs:attribute name="type" fixed="java.util.Hashtable" />
  </xs:complexType>
</xs:element>

<xs:element name="STATUS">
  <xs:simpleType>
    <xs:restriction base="xs:token">
      <xs:enumeration value="OK" />
      <xs:enumeration value="INIT" />
    </xs:restriction>
  </xs:simpleType>
</xs:element>

<xs:element name="ERROR" type="xs:unsignedInt" />

(The schema would validate STATUS and ERROR elements against their
respective element declarations, but not attempt to validate the
content of KeyN elements.)

The Schematron rules would look something like:

<sch:pattern>
  <sch:rule context="result[STATUS = 'OK']/*[not(self::STATUS)]">
    <sch:assert test="starts-with(name(), 'Key')">
      The names of the elements in the result element when STATUS is OK
      should all start with 'Key'.
    </sch:assert>
    <!-- what follows 'Key' must be non-empty and made up of digits only -->
    <sch:assert test="substring-after(name(), 'Key') != '' and
                      translate(substring-after(name(), 'Key'),
                                '0123456789', '') = ''">
      The names of the elements in the result element when STATUS is OK
      should end with a number.
    </sch:assert>
  </sch:rule>
</sch:pattern>

To be honest, the fact that you're going to have to go to such lengths
to validate this XML format is an indication that it's not designed
very well. Since XML elements are intrinsically ordered, it's better
to design your language so that lists of elements of the same type
don't include numbering information in their names. You should use:

<callResult type="java.util.Hashtable"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:noNamespaceSchemaLocation="http://host/schema/newSession.xsd">
  <result type="java.util.Hashtable">
    <STATUS>OK</STATUS>
    <Key>123</Key>
    <Key>234</Key>
    <Key>324</Key>
    <Key>256</Key>
  </result>
</callResult>

instead, or if you want to keep the number there for easy access, add
it as an attribute:

<callResult type="java.util.Hashtable"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:noNamespaceSchemaLocation="http://host/schema/newSession.xsd">
  <result type="java.util.Hashtable">
    <STATUS>OK</STATUS>
    <Key n="1">123</Key>
    <Key n="2">234</Key>
    <Key n="3">324</Key>
    <Key n="4">256</Key>
  </result>
</callResult>
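
With either of those designs the XML Schema becomes straightforward.
A minimal sketch, assuming the Key values stay unsigned integers and
that the n attribute (if you use it) is an optional positive integer;
both of those are my guesses, so adjust to suit:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="callResult">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="result">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="STATUS">
                <xs:simpleType>
                  <xs:restriction base="xs:token">
                    <xs:enumeration value="OK" />
                    <xs:enumeration value="INIT" />
                  </xs:restriction>
                </xs:simpleType>
              </xs:element>
              <xs:choice>
                <xs:element name="ERROR" type="xs:unsignedInt" />
                <!-- one or more Key elements, no numbering in the name -->
                <xs:element name="Key" maxOccurs="unbounded">
                  <xs:complexType>
                    <xs:simpleContent>
                      <xs:extension base="xs:unsignedInt">
                        <!-- optional, so both variants above validate -->
                        <xs:attribute name="n" type="xs:positiveInteger" />
                      </xs:extension>
                    </xs:simpleContent>
                  </xs:complexType>
                </xs:element>
              </xs:choice>
            </xs:sequence>
            <xs:attribute name="type" fixed="java.util.Hashtable" />
          </xs:complexType>
        </xs:element>
      </xs:sequence>
      <xs:attribute name="type" fixed="java.util.Hashtable" />
    </xs:complexType>
  </xs:element>
</xs:schema>
```

The link between the STATUS value and whether you get ERROR or Key
elements still can't be expressed here; that part stays with
Schematron or RELAX NG.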

Cheers,

Jeni

---
Jeni Tennison
http://www.jenitennison.com/

Received on Sunday, 12 May 2002 05:05:16 UTC