RE: additional constraints validation variant from Mark Feblowitz on 2002-07-31 (xmlschema-dev@w3.org from July 2002)

From: Mark Feblowitz <mfeblowitz@frictionless.com>
Date: Wed, 31 Jul 2002 13:52:11 -0400
To: "'Paul Kiel'" <paul@hr-xml.org>, xmlschema-dev@w3.org
Cc: "David Connelly (E-mail)" <dconnelly@openapplications.org>, "Duane Krahn (E-mail)" <duane.krahn@irista.com>, "Satish Ramanathan (E-mail)" <Satish.Ramanathan@mro.com>, "Andrew Warren (E-mail)" <awarren@openapplications.org>, "Kurt A Kanaskie (Kurt) (E-mail)" <kkanaskie@lucent.com>, Mark Feblowitz <mfeblowitz@frictionless.com>, "Michael Rowell (E-mail)" <mrowell@openapplications.org>
Message-ID: <4DBDB4044ABED31183C000508BA0E97F040ABE84@fcpostal.frictionless.com>

Paul - 
 
Your idea is quite compelling. In fact, it was one of the many we considered
(and ultimately abandoned). I liked this approach so much I even mocked it
up myself.
 
For the unfamiliar, the problem could be summarized as follows: 
 
How does one support multiple uses of "the same" complexType, with different
minimum cardinalities in each content model?
 
The problem arises in OAGIS when we want to apply, e.g., the noun
"PurchaseOrder" in different contexts: CancelPurchaseOrder requires only
minimal, identifying PurchaseOrder content (but could contain any), and
ProcessPurchaseOrder requires most of the PurchaseOrder content. In the
former, most of the content would be optional (minOccurs="0"); in the
latter, most would be required (minOccurs="1").
 
With complexType derivation by restriction being a non-starter for us, we
were forced to come up with an alternative. What we settled on, after months
of painstaking exploration, was a "relaxed" model of all-optional content
(all element content with minOccurs="0"), with separately specified
cardinality constraints layered on via post-validation Schematron
processing. This requires two-pass validation: schema-validation and
Schematron processing. That extra step, although achievable using standard
technologies (schema-validating parser plus XSLT processor), offends some
sensibilities and raises efficiency concerns. (Some efficiency concerns will
be addressed when XSLT processors facilitate schema-validation plus
transformation, which should be very soon).
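
To make the second pass concrete, here is a minimal sketch of the idea, not
the OAGIS implementation. It uses plain Python in place of Schematron, and
assumes the instance has already passed schema validation against the
relaxed model. The element names (Id, Buyer, Line) and the constraint table
are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical per-context constraints: child elements that must appear
# at least once under the PurchaseOrder root. Names are illustrative only.
REQUIRED_BY_CONTEXT = {
    "CancelPurchaseOrder": ["Id"],
    "ProcessPurchaseOrder": ["Id", "Buyer", "Line"],
}

def missing_required(instance_xml, context):
    """Return the required child element names absent from the instance."""
    root = ET.fromstring(instance_xml)
    # Compare against None: an Element with no children is falsy.
    return [name for name in REQUIRED_BY_CONTEXT[context]
            if root.find(name) is None]

po = "<PurchaseOrder><Id>42</Id></PurchaseOrder>"
print(missing_required(po, "CancelPurchaseOrder"))   # []
print(missing_required(po, "ProcessPurchaseOrder"))  # ['Buyer', 'Line']
```

The same instance is acceptable in the Cancel context and incomplete in the
Process context, even though both are valid against the one relaxed schema.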
 
Paul's suggested approach is a development-time alternative: rather than
performing interchange-time constraint checking, he proposes that the same
cardinality constraints be used to guide a transformation of the relaxed
schema into other, cardinality-constrained schemas. 
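
As a rough sketch of such a transformation, the following rewrites minOccurs
on a relaxed schema to produce one constrained variant per context. It is
written in Python rather than as an XSLT stylesheet, and the schema
fragment, element names, and the constrain() and occurs() helpers are all
assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
LOCAL_ELEMENTS = ".//{%s}sequence/{%s}element" % (XS, XS)

# Hypothetical relaxed schema: every child of PurchaseOrder is optional.
RELAXED_XSD = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="PurchaseOrder">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Id" type="xs:string" minOccurs="0"/>
        <xs:element name="Buyer" type="xs:string" minOccurs="0"/>
        <xs:element name="Line" type="xs:string" minOccurs="0"
                    maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
""".strip()

def constrain(relaxed_xsd, required):
    """Parse the relaxed schema and set minOccurs="1" on the named elements."""
    root = ET.fromstring(relaxed_xsd)
    for el in root.findall(LOCAL_ELEMENTS):
        if el.get("name") in required:
            el.set("minOccurs", "1")
    return root

def occurs(schema):
    """Summarize minOccurs per local element declaration."""
    return {el.get("name"): el.get("minOccurs")
            for el in schema.findall(LOCAL_ELEMENTS)}

cancel = constrain(RELAXED_XSD, {"Id"})
process = constrain(RELAXED_XSD, {"Id", "Buyer", "Line"})
print(occurs(cancel))   # {'Id': '1', 'Buyer': '0', 'Line': '0'}
print(occurs(process))  # {'Id': '1', 'Buyer': '1', 'Line': '1'}
```

Each call yields a separate schema tree that would be written out and
referenced by instances in that context.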
 
The benefits are obvious: no extra runtime machinery is required - only a
schema-validating parser is necessary to check for correct structure, types
and cardinalities.
 
The reasons we rejected it had to do with complexity: first, it's complex to
manage multiple schemas and to link to "the right" generated schema. That's
not so bad, but can be daunting. But the real difficulty comes in managing
the fan-out in the face of further extensions. 
 
Let's say I have a PurchaseOrder used in 6 different contexts. I generate 6
variants from the one relaxed model plus the 6 sets of separately-specified
cardinality constraints. Now I derive an extension to PurchaseOrder. That
means I have 6 more generated variants (36 if I'm foolish enough to allow
extensions to the first 6). Several challenges arise:
 
First, I must make sure that my derived, extended set also follows the
original cardinality constraints. This also means that I must invent a
constraint language that mirrors the extensibility of my schema. 
 
The big challenge comes when making sure that any user of the generated,
extended variants uses the correct one, from the correct set. What if I have
two layers of extension? More? In theory, it's possible, but practically
speaking, we guessed that getting the cascading schemaLocation hints correct
would be a significant challenge.
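
A small sketch of the bookkeeping involved, under stated assumptions: six
contexts (only Cancel and Process come from the example above; the other
four names are made up), one base type plus one derived extension, and a
hypothetical file-naming scheme. It models only the simple case of one
generated schema per type per context.

```python
from itertools import product

# Hypothetical context names; only Cancel and Process appear in this thread.
CONTEXTS = ["Cancel", "Process", "Get", "Show", "Sync", "Update"]
# Base type plus one derived extension (hypothetical name).
TYPES = ["PurchaseOrder", "ExtendedPurchaseOrder"]

def variant_files(types, contexts):
    """Map each (type, context) pair to a hypothetical generated schema file."""
    return {(t, c): "%s%s.xsd" % (c, t) for t, c in product(types, contexts)}

variants = variant_files(TYPES, CONTEXTS)
print(len(variants))                                  # 12
print(variants[("ExtendedPurchaseOrder", "Cancel")])  # CancelExtendedPurchaseOrder.xsd
```

Every instance document has to point at exactly one entry in this table,
and each further layer of extension adds another row of entries to keep
straight.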
 
Another, somewhat unrelated reason for rejecting this approach was that we
liked that we could use Schematron for other, non-cardinality-oriented
constraints, such as the much-sought-after co-occurrence constraint. With
development-time generation, the only things that could be transformed into
Schema were the things that were supportable in Schema. Co-occurrence and
other similar constraints could not be supported development-time, simply
because there is no equivalent to transform to in Schema.
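
For illustration, a co-occurrence rule of the form "if X is present then Y
must also be present" is easy to check at interchange time. The sketch below
uses plain Python in place of a Schematron rule, and the element names
ShipDate and ShipTo are hypothetical.

```python
import xml.etree.ElementTree as ET

def violates_co_occurrence(instance_xml, if_present, then_required):
    """True when `if_present` is a child of the root but `then_required` is not."""
    root = ET.fromstring(instance_xml)
    return (root.find(if_present) is not None
            and root.find(then_required) is None)

ok = ("<PurchaseOrder><ShipDate>2002-08-01</ShipDate>"
      "<ShipTo>Boston</ShipTo></PurchaseOrder>")
bad = "<PurchaseOrder><ShipDate>2002-08-01</ShipDate></PurchaseOrder>"
neither = "<PurchaseOrder/>"

print(violates_co_occurrence(ok, "ShipDate", "ShipTo"))       # False
print(violates_co_occurrence(bad, "ShipDate", "ShipTo"))      # True
print(violates_co_occurrence(neither, "ShipDate", "ShipTo"))  # False
```

A rule like this depends on what else is present in the instance, which is
why it stays in the post-validation step rather than in a generated schema.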
 
I'd be happy to discuss this all further. I'd be even happier if at least
this subset of constraints was somehow incorporated into Schema. 
 
 
Mark
 
 
-----Original Message-----
From: Paul Kiel [mailto:paul@hr-xml.org]
Sent: Wednesday, July 31, 2002 12:18 PM
To: xmlschema-dev@w3.org
Cc: Mark Feblowitz
Subject: additional constraints validation variant
 
Greetings folks,
 
I have been working with the schematron "adding additional constraints"
issues that are most accurately addressed in the OAGIS8.0 design.  This
design neatly satisfies the desire for a single general model that is
constrained by context.  Nice job folks!  (For example, in one
of our cases having a general HR-XML TimeCard with contextual variations
such as "DeleteTimeCard", "CreateTimeCard", "UpdateTimeCard" etc.)
 
The use of schematron here is perfect.  I would like to add a wrinkle for
perhaps a variant to this approach.  The links below illustrate two methods
of achieving the same goals, both using schematron to document constraints.
However, where these constraints are applied differs.  
 
The first link, "InstanceValidationFlow", shows how one may use a document
(in this case an HR-XML TimeCard) in a validation flow.  The two-step
approach (parser plus xslt) works well.  
 
http://ns.hr-xml.org/temp/InstanceValidationFlow.gif
 
The second link, "XSDValidationFlow", shows a flow where validation occurs
via a derivation of the schema itself instead of the instance in a second
step.  This would maintain the goal of general model with context-specific
constraints but without a second step validation (which is where I get the
push back from my constituents who otherwise like the use of schematron).
 
http://ns.hr-xml.org/temp/XSDValidationFlow.gif
 
What do you think of this approach?  I haven't decided if I like it yet, but
I thought enough of it to merit a thread here.  
 
The development of a Constraints2XSD stylesheet would not be simple, but I
would think doable - and reusable!  [I talked with Mark Feblowitz about this
once and he was, I believe, intrigued by it -- Mark, is that the case??]
Might anyone out there be interested in collaboratively creating such an
animal?
 
Pluses and Minuses:
Method 1 - InstanceValidationFlow
+ transforming constraints to validating xslt easily replicated (i.e. via
schematron skeleton xsl)
- results in many xslts lying around for validation (one for each context)
- requires another validation layer via XSLT (performance)
 

Method 2 - XSDValidationFlow
+ single validation layer
+ makes most use of parser
- results in many schemas lying around for validation (one for each
context)
- xslt for transformation of constraints to xsd not developed (yet!?!)
 
 
 
 
W. Paul Kiel
HR-XML Consortium

Received on Wednesday, 31 July 2002 13:52:44 UTC