- From: <noah_mendelsohn@us.ibm.com>
- Date: Wed, 8 Jan 2003 01:08:21 -0500
- To: Jan Mendling <mendling@web.de>
- Cc: xmlschema-dev@w3.org
Schematron indeed has many nice properties and can do this sort of thing. On the other hand, I'm tempted to say that what you really have here is not an XML data model at all, but a graph model that happens to be serialized in XML. At some point, it becomes more appropriate to have a moderate amount of checking at the XML schema level (e.g. that each arc has a FromId and a ToId) and then to build a schema language to constrain your graphs. After all, it's nearly hopeless to look for generalized graph structures such as doubly linked cycles, unless you just view something like XSL as a Turing complete programming language and program the checks. To do it declaratively, you'd need a graph constraint language.

XML level schemas can't generally fully check abstractions at the next level up. We can recognize integers, but not accurately validate prime numbers (you can declare a named subtype of Integer and call it Prime, but you can't express tight validation constraints...the Unique Particle Attribution constraint does ensure that you'll know which elements and attributes were asserted to be Prime, but you'll have to write the prime number check yourself). Similarly, we can validate that an attribute value resembles a credit card number, but we can't check whether the card was stolen (and thus invalid). I think your example is in the grey area at the border of what we should and should not try to do.

Thanks.

------------------------------------------------------------------
Noah Mendelsohn                              Voice: 1-617-693-4036
IBM Corporation                                Fax: 1-617-693-8676
One Rogers Street
Cambridge, MA 02142
------------------------------------------------------------------

Jan Mendling <mendling@web.de>
01/06/2003 08:24 PM

To: noah_mendelsohn@us.ibm.com, xmlschema-dev@w3.org
cc:
Subject: Re: Constraints in XML Schema - Formal Language Background?
Hi Noah and the others,

I do not think that W3C XML Schema has much need for something like a tree grammar, although a relaxation of the Unique Particle Attribution rule, which forbids nondeterministic content models, would be a plus. Currently I have a problem that I do not know how to express with any sort of tree grammar. Consider the following:

    ...
    <Arc FromId="1" ToId="2"/>
    <Arc FromId="2" ToId="1"/>
    <Arc FromId="1" ToId="2"/>
    ...

For a given Arc element (Arc1), I want to detect whether there are other Arc elements (ArcX) whose @ToId equals the @FromId of Arc1 and whose @FromId equals the @ToId of Arc1. This can be expressed with Schematron's XPath assertions.

You could argue that I could model my content structure in a different way, so that grammars might capture these properties. But this is often counterproductive in terms of readability. Therefore, I think a flexible and user-friendly solution would be to have something like Schematron assertions in W3C XML Schema. And since XPath, a W3C standard, is involved, I cannot imagine that there would be too much computational overhead. Or am I wrong?

It would be nice to have some ideas here from a formal language point of view!

Greetings,
Jan

noah_mendelsohn@us.ibm.com wrote on 07.01.03 00:24:27:
>
> >> you are absolutely right that the expressiveness of XML
> >> schema constraints should be improved
>
> I agree.
>
> >> and XPath seems to be a natural option.
>
> Yes, though certainly other options (Relax-like tree
> automata, something else grammar-based, etc.) should at
> least be considered before a decision is made. I agree
> that XPath is a likely good choice.
>
> > About performance: I think performance matters should
> > not guide the decision about whether XPath-Constraints
> > should be added to the schema specification or not. If
> > performance is a matter then people can switch off
> > validation (or use only simple constraints).
>
> Here I respectfully but strongly disagree. It's
> essential that my customers and those with whom they
> do business get consistent results when they validate a
> given document with a given schema. If they say "Well,
> it was valid with XYZ-Corp.'s high performance processor
> but not ABC's" we've got a mess. The main reason to
> use XML is universal consistency and interop. High
> performance schema processing is very, very important
> to IBM's customers, as is consistency of semantics. I
> think we can get better co-occurrence constraints
> without sacrificing performance.

--
~~~~~~~~~~~~~
~ Jan Mendling
~ Güterstr.53
~ 54295 Trier
~ 0175-1636958
~~~~~~~~~~~~~
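
The two-layer approach discussed in this thread (a light structural check that every Arc has a FromId and a ToId, followed by a programmed graph-level check for arcs that point back at each other) can be sketched roughly as below. This is only an illustration in Python, not anything proposed in the messages: the wrapper element name `Graph`, the function name `reciprocal_arcs`, and the decision to ignore self-loops are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET


def reciprocal_arcs(xml_text):
    """Return the sorted list of (a, b) pairs, with a < b as strings,
    for which both an arc a->b and an arc b->a are present.

    Hypothetical helper for illustration only.
    """
    root = ET.fromstring(xml_text)

    arcs = set()
    for arc in root.iter("Arc"):
        src, dst = arc.get("FromId"), arc.get("ToId")
        # Structural layer: the kind of check a schema could do by itself.
        if src is None or dst is None:
            raise ValueError("Arc is missing FromId or ToId")
        arcs.add((src, dst))

    # Graph layer: an arc is "reciprocated" if its reverse also exists.
    # Self-loops (FromId == ToId) are skipped; that is an assumption here.
    # Requiring src < dst reports each reciprocal pair only once.
    return sorted(
        (src, dst)
        for (src, dst) in arcs
        if src < dst and (dst, src) in arcs
    )


# The Arc elements from the message, wrapped in an assumed <Graph> root.
GRAPH = """
<Graph>
  <Arc FromId="1" ToId="2"/>
  <Arc FromId="2" ToId="1"/>
  <Arc FromId="1" ToId="2"/>
</Graph>
"""

if __name__ == "__main__":
    print(reciprocal_arcs(GRAPH))
```

Collecting the arcs into a set first keeps the reverse lookup cheap and makes the duplicate 1->2 arc harmless. A Schematron assertion would express the same condition declaratively in XPath.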
Received on Wednesday, 8 January 2003 01:13:06 UTC