- From: <noah_mendelsohn@us.ibm.com>
- Date: Wed, 17 Sep 2003 18:12:22 -0400
- To: "Dare Obasanjo" <dareo@microsoft.com>
- Cc: mecase@ucdavis.edu, "XSD" <xmlschema-dev@w3.org>, xmlschema-dev-request@w3.org
Dare Obasanjo writes:
>> No. Any XML file can be validated by an XML
>> Schema. However it is possible that there
>> are constraints you'd like enforced on the structure
>> of your document that are not expressible in W3C XML
>> Schema.
Yes, exactly. To clarify, validating the root element against the
anyType will accept any Infoset (unless I'm forgetting an edge case).
If Mike Case is asking, "Can I write a schema that accepts only and exactly
some particular subset I have in mind, no matter what the subset is?", the
answer is clearly no, and for good reason. For example, I cannot write a
schema in which all the values of some attribute are restricted to prime
numbers, as XML Schema does not have the computational power to check
primeness. This is in fact a very plausible use case for mathematical
schemas, and Schema chooses not to attempt it. We decided we wanted a
declarative language, and short of building in primes as a special case,
or a rather elaborate declarative programming language, primeness isn't
checkable.
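A minimal sketch of how an application might enforce such a constraint itself, after ordinary schema validation has run. It assumes Python's standard-library ElementTree parser; the attribute name `value` and the helper names `is_prime` and `non_prime_values` are hypothetical, chosen only for illustration.

```python
import xml.etree.ElementTree as ET
from typing import List


def is_prime(n: int) -> bool:
    """Trial division; adequate for small attribute values."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def non_prime_values(xml_text: str, attr: str = "value") -> List[str]:
    """Return every value of `attr` (a hypothetical attribute name)
    that is not a prime integer, in document order."""
    bad = []
    for elem in ET.fromstring(xml_text).iter():
        raw = elem.get(attr)
        if raw is None:
            continue  # element does not carry the attribute
        try:
            ok = is_prime(int(raw))
        except ValueError:
            ok = False  # not an integer at all
        if not ok:
            bad.append(raw)
    return bad
```

An empty list from `non_prime_values` means the extra constraint holds; the schema itself would only need to declare the attribute as an integer type.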
Back to Michael Case's use case: like primes, yours is an example of a
constraint that can't exactly be expressed in schemas. This question has
been asked many times. The schema WG considered a fully generalized all
group. Fully generalized means not just counts, but things like:
ALL(OR(A|B), ALL(A,C,D), SEQUENCE(B, E, F))
attempting such things brings great complexity, and if you're not very
careful (or maybe even if you are) significant runtime overhead. I don't
think we ever seriously considered a middle ground of:
ALL(A[3], B[5-7])
in other words, exactly three As and your choice of 5-7 Bs, in any order
you like. It's not clear to me that this makes an 80/20 cut of what would be
good markup.
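A minimal sketch of checking the counted form ALL(A[3], B[5-7]) in application code, assuming Python's standard-library ElementTree parser. The `LIMITS` table and the function name `matches_counted_all` are hypothetical; they are one possible encoding of the notation above, not anything defined by XML Schema.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical encoding of ALL(A[3], B[5-7]):
# element name -> (minimum count, maximum count).
LIMITS = {"A": (3, 3), "B": (5, 7)}


def matches_counted_all(parent_xml: str, limits=LIMITS) -> bool:
    """True if the children of the root element satisfy the counts
    in `limits`, in any order, with no other child elements."""
    parent = ET.fromstring(parent_xml)
    counts = Counter(child.tag for child in parent)
    # Reject any child element that the group does not mention.
    if any(tag not in limits for tag in counts):
        return False
    # A missing tag counts as zero occurrences.
    return all(lo <= counts[tag] <= hi for tag, (lo, hi) in limits.items())
```

Counting is cheap here because each name appears in only one branch of the group; the fully generalized nested form would need real backtracking or automaton construction.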
Note that at the Infoset level, order always matters. XML and the Infoset
say it does. So, XML says that:
<A>
<X/>
<Y/>
</A>
is interestingly different from
<A>
<Y/>
<X/>
</A>
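A minimal sketch of how that difference shows up to an application, assuming Python's standard-library ElementTree parser and well-formed versions of the two documents (each closed with an end tag for A). The helper name `child_order` is hypothetical.

```python
import xml.etree.ElementTree as ET

# The two documents differ only in the order of the children of A.
first = ET.fromstring("<A><X/><Y/></A>")
second = ET.fromstring("<A><Y/><X/></A>")


def child_order(elem):
    """Names of the child elements, in document order."""
    return [child.tag for child in elem]


# Same set of children, but the parser reports a different order.
same_children = sorted(child_order(first)) == sorted(child_order(second))
same_order = child_order(first) == child_order(second)
```

Here `same_children` is true and `same_order` is false: the order is available to any application that chooses to look at it.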
Saying:
ALL(X,Y)
in your schema says that both are valid, but the order can still be
queried by an application. Whether your application or programming
framework >cares< about the difference is up to you, but XML says they are
different, no matter what schema says. There was for a while a request
from the query workgroup to put in a schema annotation saying: "I know
XML says order matters, but not only are all orders valid, but they mean
the same thing to me as an application or query system." Query later
changed their mind and did not pursue the requirement. I hope this
background is helpful.
------------------------------------------------------------------
Noah Mendelsohn Voice: 1-617-693-4036
IBM Corporation Fax: 1-617-693-8676
One Rogers Street
Cambridge, MA 02142
------------------------------------------------------------------
Received on Wednesday, 17 September 2003 18:15:28 UTC