- From: <noah_mendelsohn@us.ibm.com>
- Date: Wed, 8 Jan 2003 01:08:21 -0500
- To: Jan Mendling <mendling@web.de>
- Cc: xmlschema-dev@w3.org
Schematron indeed has many nice properties and can do this sort of thing.
On the other hand, I'm tempted to say that what you really have here is
not an XML data model at all, but a graph model that happens to be
serialized in XML. At some point, it becomes more appropriate to have a
moderate amount of checking at the XML schema level (e.g. that each arc
has a FromId and a ToId) and then to build a schema language to constrain
your graphs. After all, it's nearly hopeless to look for generalized
graph structures such as doubly linked cycles, unless you just view
something like XSL as a Turing complete programming language and program
the checks. To do it declaratively, you'd need a graph constraint
language.
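To illustrate that split, here is a minimal Python sketch, not anything defined by XML Schema or Schematron. It assumes the arcs sit under a hypothetical <Graph> wrapper element; the function names structural_errors and has_cycle are made up for this example. The first function does the modest schema-level check (each Arc has a FromId and a ToId), and the second programs one graph-level check, here a plain directed-cycle search.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the <Graph> wrapper is an assumption.
SAMPLE = """
<Graph>
  <Arc FromId="1" ToId="2"/>
  <Arc FromId="2" ToId="3"/>
  <Arc FromId="3" ToId="1"/>
</Graph>
"""

def structural_errors(root):
    """Schema-level layer: every Arc must carry both FromId and ToId."""
    errors = []
    for i, arc in enumerate(root.iter("Arc"), start=1):
        for name in ("FromId", "ToId"):
            if name not in arc.attrib:
                errors.append(f"Arc #{i} is missing {name}")
    return errors

def has_cycle(root):
    """Graph-level layer: depth-first search for any directed cycle."""
    graph = {}
    for arc in root.iter("Arc"):
        graph.setdefault(arc.get("FromId"), set()).add(arc.get("ToId"))

    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    colour = {}

    def visit(node):
        colour[node] = GREY
        for nxt in graph.get(node, ()):
            state = colour.get(nxt, WHITE)
            if state == GREY:  # back edge: we returned to the current path
                return True
            if state == WHITE and visit(nxt):
                return True
        colour[node] = BLACK
        return False

    return any(colour.get(n, WHITE) == WHITE and visit(n) for n in list(graph))

root = ET.fromstring(SAMPLE)
print(structural_errors(root))
print(has_cycle(root))
```

The point of the sketch is only that the second layer is ordinary graph code: nothing in it depends on the XML serialization once the arcs have been read.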
XML-level schemas generally can't fully check abstractions at the next
level up. We can recognize integers, but not accurately validate prime
numbers (you can declare a named subtype of Integer and call it Prime, but
you can't express tight validation constraints...the Unique Particle
Attribution constraint does ensure that you'll know which elements and
attributes were asserted to be Prime, but you'll have to write the prime
number check yourself.) Similarly, we can validate that an attribute
value resembles a credit card number, but we can't check whether the card
was stolen (and thus invalid.) I think your example is in the grey area
at the border of what we should and should not try to do. Thanks.
------------------------------------------------------------------
Noah Mendelsohn Voice: 1-617-693-4036
IBM Corporation Fax: 1-617-693-8676
One Rogers Street
Cambridge, MA 02142
------------------------------------------------------------------
Jan Mendling <mendling@web.de>
01/06/2003 08:24 PM
To: noah_mendelsohn@us.ibm.com, xmlschema-dev@w3.org
cc:
Subject: Re: Constraints in XML Schema - Formal Language Background?
Hi Noah and the others,
I do not think that W3C XML Schema has a strong need for something like
tree grammars, although relaxing the Unique Particle Attribution rule,
which forbids nondeterministic content models, would be a plus.
Currently I have a problem, which I do not know how to express with any
sort of tree grammar. Consider the following:
...
<Arc FromId="1" ToId="2"/>
<Arc FromId="2" ToId="1"/>
<Arc FromId="1" ToId="2"/>
...
For a given Arc element (Arc1), I want to detect whether there are other
Arc elements (ArcX) whose @ToId equals the @FromId of Arc1 and whose
@FromId equals the @ToId of Arc1.
This can be expressed with Schematron's XPath Assertions. You could argue
that I could model my content structure in a different way, so that
grammars might capture these properties. But this is often
counterproductive in terms of readability. Therefore, I think a flexible
and user-friendly solution would be to have something like Schematron
assertions in W3C XML Schema. And as XPath as a W3C standard is involved,
I cannot imagine that there will be too much overhead in calculation.
Or am I wrong? It would be nice to have some ideas here from a formal
language point of view!
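For reference, the reverse-arc condition described above can be sketched procedurally in Python. This is only an illustration of the check itself, not a Schematron rule and not a proposal for XML Schema syntax. The <Graph> wrapper element, the fourth Arc in the sample, and the name arcs_with_reverse are assumptions added for the example; self-loops are skipped on the assumption that an arc should not count as its own reverse.

```python
import xml.etree.ElementTree as ET

# Sample based on the arcs in the message; the <Graph> wrapper and the
# last Arc (2 -> 3, which has no reverse) are assumptions for illustration.
SAMPLE = """
<Graph>
  <Arc FromId="1" ToId="2"/>
  <Arc FromId="2" ToId="1"/>
  <Arc FromId="1" ToId="2"/>
  <Arc FromId="2" ToId="3"/>
</Graph>
"""

def arcs_with_reverse(root):
    """Return (FromId, ToId) for every Arc that has a reversed counterpart."""
    arcs = [(a.get("FromId"), a.get("ToId")) for a in root.iter("Arc")]
    seen = set(arcs)  # set membership keeps each lookup cheap
    # Assumption: a self-loop (FromId == ToId) is not its own reverse.
    return [(f, t) for f, t in arcs if f != t and (t, f) in seen]

root = ET.fromstring(SAMPLE)
print(arcs_with_reverse(root))  # [('1', '2'), ('2', '1'), ('1', '2')]
```

Written this way the check visits each arc once, so for this particular constraint the cost grows roughly linearly with the number of arcs. Whether an XPath-based assertion would be evaluated as efficiently depends on the processor.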
Greets, Jan
noah_mendelsohn@us.ibm.com wrote on 07.01.03 00:24:27:
> >> you are absolutely right that the expressiveness of XML
> >> schema constraints should be improved
>
> I agree.
>
> >> and XPath seems to be a natural option.
>
> Yes, though certainly other options (Relax-like tree
> automata, something else grammar-based, etc.) should at
> least be considered before a decision is made. I agree
> that XPath is a likely good choice.
>
> > About performance: I think performance matters should
> > not guide the decision about whether XPath-Constraints
> > should be added to the schema specification or not. If
> > performance is a matter then people can switch off
> > validation (or use only simple constraints).
>
> Here I respectfully but strongly disagree. It's
> essential that my customers and those with whom they
> do business get consistent results when they validate a
> given document with a given schema. If they say "Well,
> it was valid with XYZ-Corp.'s high performance processor
> but not ABC's" we've got a mess. The main reason to
> use XML is universal consistency and interop. High
> performance schema processing is very, very important
> to IBM's customers, as is consistency of semantics. I
> think we can get better co-occurrence constraints
> without sacrificing performance.
>
--
~~~~~~~~~~~~~
~ Jan Mendling
~ Güterstr.53
~ 54295 Trier
~ 0175-1636958
~~~~~~~~~~~~~
Received on Wednesday, 8 January 2003 01:13:06 UTC