Comments

I have recently been looking into the various types of XML validation,
including XML Schema, RELAX, SOX, DTD, and a few others. I have noticed
deficiencies in all of them that force me to write custom validation
code. Here are a few of the pitfalls that I have found.

1. The element ordering imposed by <sequence></sequence> is very
limiting. In conjunction with the very strict restrictions on
<all></all>, it makes many document instances impossible to validate.
XML has the flexibility to allow users to specify any elements in any
order, so it seems to make the best sense to allow validation to work
likewise. What is the argument against allowing the following:
<sequence>
  <element name="first" .../>
  <element name="second" .../>
</sequence>
<all>
  <element name="third or fourth" .../>
  <element name="fourth or third" .../>
</all>
<sequence>
  <element name="second to last" .../>
  <element name="last" .../>
</sequence>
This would allow the validation to make use of both sequence and all
blocks, freely intermixed. This seems quite logical to me. As the
validation is occurring, either an all or a sequence block is in effect,
and the document is validated at that point using only the tags
specified in that block. This would require a bit of extra work in the
validation code, but the benefits would be great.
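
As a rough illustration of the mixed model above, here is a minimal
Python sketch of the kind of custom validation code this currently
requires. It is not part of any schema language. The MODEL list, the
validate_children function, and the element names are hypothetical
(the names are adjusted to be legal XML names), and it assumes every
listed element appears exactly once.

```python
import xml.etree.ElementTree as ET

# Hypothetical content model: groups are applied in order. A "sequence"
# group fixes the order of its elements; an "all" group accepts its
# elements in any order. Assumption: each element occurs exactly once.
MODEL = [
    ("sequence", ["first", "second"]),
    ("all", ["third", "fourth"]),
    ("sequence", ["second-to-last", "last"]),
]


def validate_children(parent, model):
    """Return True if the children of `parent` match `model` group by group."""
    names = [child.tag for child in parent]
    pos = 0
    for kind, expected in model:
        chunk = names[pos:pos + len(expected)]
        if kind == "sequence":
            ok = chunk == expected                   # exact order required
        else:
            ok = sorted(chunk) == sorted(expected)   # any order accepted
        if not ok:
            return False
        pos += len(expected)
    # Reject trailing elements that no group accounts for.
    return pos == len(names)


doc = ET.fromstring(
    "<root><first/><second/><fourth/><third/>"
    "<second-to-last/><last/></root>"
)
print(validate_children(doc, MODEL))
```

Here the third and fourth elements may swap places, while the leading
and trailing pairs must stay in order.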

2. All attributes are either required or optional, and these are the
only constraints on their structure. Is it not conceivable that a
document would need a more complex constraint on its attributes? For
example:
<class name="org.w3.Foobar"/> or
<class package="org.w3"/>
So, an instance of the class tag should have either the name or the
package attribute, but not both and not neither (a basic XOR operation).
What about a choice operation (a multiple XOR)? These seem like they
could be very useful.
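
The attribute constraint above is the sort of thing that has to be
checked by hand today. A minimal Python sketch follows, assuming the
"multiple XOR" is meant as "exactly one of the listed attributes is
present". The function name exactly_one_attribute is hypothetical.

```python
import xml.etree.ElementTree as ET


def exactly_one_attribute(element, names):
    """Return True if exactly one of `names` is set on `element`.

    With two names this is a plain XOR; with more it acts as a choice
    among the attributes (an assumed reading of "multiple XOR").
    """
    present = [name for name in names if name in element.attrib]
    return len(present) == 1


for text in (
    '<class name="org.w3.Foobar"/>',
    '<class package="org.w3"/>',
    '<class name="org.w3.Foobar" package="org.w3"/>',
    '<class/>',
):
    element = ET.fromstring(text)
    print(text, exactly_one_attribute(element, ["name", "package"]))
```

The first two instances pass the check; the one with both attributes and
the one with neither are rejected.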

These are the only two that have really caused me problems to date.

Brian Pontarelli

Received on Wednesday, 14 March 2001 12:20:06 UTC