RE: allowing zero to unbounded elements in any order?

From: <noah_mendelsohn@us.ibm.com>
Date: Wed, 17 Sep 2003 18:12:22 -0400
To: "Dare Obasanjo" <dareo@microsoft.com>
Cc: mecase@ucdavis.edu, "XSD" <xmlschema-dev@w3.org>, xmlschema-dev-request@w3.org
Message-ID: <OFFE814CC0.652B11B5-ON85256DA4.005D6B9A@lotus.com>

Dare Obasanjo writes:

>> No. Any XML file can be validated by an XML 
>> Schema. However it is possible that there 
>> are constraints you'd like enforced on the structure
>> of your document that are not expressible in W3C XML 
>> Schema. 

Yes, exactly.  To clarify, validating the root element against anyType 
will accept any Infoset (unless I'm forgetting an edge case). 

If Mike Case is asking whether one can write a schema that accepts only 
and exactly some particular subset he has in mind, no matter what the 
subset is, the answer is clearly no, and for good reason.  For example, I 
cannot write a schema in which all the values of some attribute are 
restricted to prime numbers, as XML Schema does not have the computational 
power to check primeness.  This is in fact a very plausible use case for 
mathematical schemas, and Schema chooses not to attempt it.  We decided we 
wanted a declarative language, and short of building in primes as a 
special case, or a rather elaborate declarative programming language, 
primeness isn't checkable.
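
A constraint like this is usually checked in application code after schema 
validation.  The following is only a minimal sketch of that idea, not 
anything from the Schema spec: the function names and the attribute name 
passed in are hypothetical, and it assumes every such attribute value is 
already a valid integer (which a schema can enforce).

```python
import xml.etree.ElementTree as ET


def is_prime(n):
    """Trial-division primality test; fine for small attribute values."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def non_prime_values(xml_text, attr):
    """Return the values of attribute `attr` that are not prime.

    Hypothetical helper: walks every element in the document and collects
    offending values, so an empty list means the constraint holds.
    """
    root = ET.fromstring(xml_text)
    bad = []
    for elem in root.iter():
        value = elem.get(attr)
        if value is not None and not is_prime(int(value)):
            bad.append(int(value))
    return bad
```

Running this as a second pass keeps the schema itself declarative and 
leaves the computation to the application.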

Back to Michael Case's use case:  like primes, yours is an example of a 
constraint that can't exactly be expressed in schemas.   This question has 
been asked many times. The schema WG considered a fully generalized all 
group.  Fully generalized means not just counts, but things like:

ALL(OR(A|B), ALL(A,C,D), SEQUENCE(B, E, F))

Attempting such things brings great complexity, and if you're not very 
careful (or maybe even if you are) significant runtime overhead.  I don't 
think we ever seriously considered a middle ground of:

ALL(A[3], B[5-7])

in other words, exactly three As and your choice of 5-7 Bs, in any order 
you like.  It's not clear to me that this makes the 80/20 cut of what 
would be good markup.
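
For what it's worth, the count-only middle ground is easy to check outside 
the schema.  A rough sketch follows, under the assumption that the schema 
merely allows A and B children in any number and order, and that the 
application then enforces the counts.  The function name and the `limits` 
table are hypothetical.

```python
from collections import Counter
import xml.etree.ElementTree as ET


def children_match_counts(parent, limits):
    """Check ALL(A[3], B[5-7])-style constraints on an element's children.

    `limits` maps a child tag to an inclusive (min, max) pair.  Order is
    ignored; any child tag not listed in `limits` makes the check fail.
    """
    counts = Counter(child.tag for child in parent)
    if any(tag not in limits for tag in counts):
        return False
    return all(lo <= counts[tag] <= hi for tag, (lo, hi) in limits.items())


# Exactly three As and five to seven Bs, in any order.
LIMITS = {"A": (3, 3), "B": (5, 7)}

doc = ET.fromstring("<P><B/><A/><B/><B/><A/><B/><A/><B/></P>")
ok = children_match_counts(doc, LIMITS)
```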

Note that at the Infoset level, order always matters.  XML and the Infoset 
say it does.  So, XML says that:

<A>
        <X/>
        <Y/>
</A>

is interestingly different from 

<A>
        <Y/>
        <X/>
</A>
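
A small sketch of how an application sees that difference.  This uses 
Python's ElementTree as a stand-in, which is not the Infoset itself but 
does preserve document order of children.  The helper name is 
hypothetical.

```python
import xml.etree.ElementTree as ET


def child_order(xml_text):
    """Return the child element names of the root, in document order."""
    return [child.tag for child in ET.fromstring(xml_text)]


first = child_order("<A><X/><Y/></A>")
second = child_order("<A><Y/><X/></A>")

# The two documents hold the same set of children but in a different
# order, and that order is visible to any application that asks for it.
same_children = sorted(first) == sorted(second)
same_order = first == second
```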

Saying:

ALL(X,Y)

in your schema says that both are valid, but the order can still be 
queried by an application.  Whether your application or programming 
framework >cares< about the difference is up to you, but XML says they are 
different, no matter what the schema says.  For a while there was a request 
from the Query working group to put in a schema annotation saying:  "I know 
XML says order matters, but not only are all orders valid, but they mean 
the same thing to me as an application or query system."  Query later 
changed their mind and did not pursue the requirement.  I hope this 
background is helpful.

------------------------------------------------------------------
Noah Mendelsohn                              Voice: 1-617-693-4036
IBM Corporation                                Fax: 1-617-693-8676
One Rogers Street
Cambridge, MA 02142
------------------------------------------------------------------
Received on Wednesday, 17 September 2003 18:15:28 GMT