XSD-Schematron Integration from Jack Lindsey on 2005-11-22 (www-xml-schema-comments@w3.org from October to December 2005)

From: Jack Lindsey <tuquenukem@hotmail.com>
Date: Tue, 22 Nov 2005 17:53:55 -0500
To: www-xml-schema-comments@w3.org
Message-ID: <BAY102-F6D90195B194AF0F82CD5CD7520@phx.gbl>
Dear Working Group members:

This is a heartfelt plea for you to embrace Schematron as a W3C 
Recommendation and to provide tight, if not seamless, integration between 
W3C Schema and Schematron in order to provide long-needed, comprehensive 
functionality in the area of co-occurrence constraints and co-occurrence 
constraints by value.

In addition, this is meant to imply that any plans to enhance this kind of 
functionality within XSD should be dropped in recognition of the fact that 
the synergy between the two vocabularies, in terms of their complementary 
paradigms of data structure definition and rule-based constraints, will 
provide optimal usability that could not be matched by other means and will 
prevent duplication of functionality and wasted standardization efforts.

I have attached my war stories post in xmlschema-dev as background.  Again, 
I apologize for the brickbats hidden in the roses, but this is a topic that 
has long caused me grief and I believe you should make it your top priority 
because I know many others are wrestling with this.

Yours sincerely

Jack Lindsey

>From: "Jack Lindsey" <tuquenukem@hotmail.com>
>To: noah_mendelsohn@us.ibm.com
>CC: brs@itst.dk, mike@saxonica.com, petexmldev@tech-know-ware.com, 
>xmlschema-dev@w3.org
>Subject: Re: SV: SV: SV: SV: Schema help
>Date: Sun, 20 Nov 2005 01:12:42 -0500
>
>
>XSD-Schematron Synergy
>
>I can't believe the Schema WG is even considering using its apparently 
>meagre resources to address co-occurrence constraints.  I always assumed the 
>lack of any progress in this area was due to philosophical objections among 
>your luminaries.
>
>The strength of XSD is that it is easy (using graphical IDEs anyway) to 
>quickly model extensive data structures.  Doing this in Schematron would be 
>lengthy and laborious.  But Schematron's rule-based approach is ideal for 
>specifying co-occurrence constraints and co-occurrence constraints by value, 
>that:
>
>1) can be expressed either positively or negatively,
>2) can be applied equally to both elements and attributes, and
>3) can create dependencies between schema elements regardless of their 
>relative positions within a data structure.
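>
>To make those three points concrete, here is a minimal sketch of such 
>rules.  It assumes the ISO Schematron namespace and a hypothetical Task 
>vocabulary: the status and assignee attributes and the CompletedDate and 
>Person elements are invented for illustration and do not come from any 
>real schema.
>
>```xml
><sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron">
>  <sch:pattern id="task-co-occurrence">
>    <sch:rule context="Task">
>      <!-- 1) positive form, by value: an attribute value requires an element -->
>      <sch:assert test="not(@status = 'closed') or CompletedDate">
>        A Task with status 'closed' must contain a CompletedDate.
>      </sch:assert>
>      <!-- 1) negative form: report fires when the unwanted combination occurs -->
>      <sch:report test="@status = 'open' and CompletedDate">
>        An open Task must not contain a CompletedDate.
>      </sch:report>
>      <!-- 2) and 3) an attribute depends on an element anywhere in the document -->
>      <sch:assert test="not(@assignee) or //Person/@id = @assignee">
>        The assignee of a Task must match the id of some Person.
>      </sch:assert>
>    </sch:rule>
>  </sch:pattern>
></sch:schema>
>```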
>
>I can't imagine how you could cobble this functionality on to XSD and 
>retain its simplicity and elegance.  What is more, Schematron achieves this 
>through XPath expressions and so everything is still within the W3C family. 
>  Except that Schematron itself is (still) about to become an ISO standard. 
>  Perhaps there is still time to repatriate it?  Together they are very 
>complementary - a great tag team.
>
>I agree with Bryan that something better than the appinfo invocation is 
>required.  I hear that some consider that a potential security flaw (the 
>GJXDM community?).  Also I think I read recently that there is already a 
>.NET thing that lets you specify both a schema and a Schematron stylesheet 
>and executes them in a single phase?
>
>Perhaps some of your resources would be better spent making sure that 
>functionality is not duplicated and the W3C family of vocabularies does not 
>become more un-orthogonal (?).  Case in point, Bryan's other recent post 
>about the validation incompatibility between XSD and XInclude concerning 
>xml:* attributes.  There again Microsoft is helping its users over the bump 
>in the road with its own improvised solution.  Too much of that kind of 
>thing can't be good for the W3C's health!
>
>xsd:restriction Headaches
>
>Harking back to the example at the beginning of this thread, both E-R and 
>UML modelers would find it very natural to implement a taxonomy of task 
>types using XSD's Substitution Group feature.  A supertype complexType 
>called TaskType would define the common task elements.  Then subtype 
>complexTypes would be derived by extension to add the specific elements 
>required by TaskType1 versus TaskType2, etc.  The TaskType1 and TaskType2 
>elements would then declare themselves members of the Task element's group. 
>  That way they could be validly substituted in any situation where Task 
>could be used.  So in an instance, instead of:
>
><Task>
>    <!-- common task elements here -->
>    <TaskType1>
>        <!-- Task 1 things -->
>    </TaskType1>
></Task>
>
>You get:
>
><TaskType1>
>    <!-- common task elements here -->
>    <!-- Task 1 things -->
></TaskType1>
>
>The tags retain their semantic value and inheritance cures the bloat.  I 
>have depended on this approach to implement multi-level class hierarchies 
>and I think it is a great feature of XSD.
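>
>On the schema side, the arrangement described above might be declared 
>roughly as follows.  This is only a sketch: the type name TaskType1Type 
>and the child elements Name and Detail1 are hypothetical, and no target 
>namespace is assumed.
>
>```xml
><xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
>  <!-- Supertype: the elements common to every task -->
>  <xs:complexType name="TaskType">
>    <xs:sequence>
>      <xs:element name="Name" type="xs:string"/>
>    </xs:sequence>
>  </xs:complexType>
>
>  <!-- Subtype derived by extension: adds the Task 1 things -->
>  <xs:complexType name="TaskType1Type">
>    <xs:complexContent>
>      <xs:extension base="TaskType">
>        <xs:sequence>
>          <xs:element name="Detail1" type="xs:string"/>
>        </xs:sequence>
>      </xs:extension>
>    </xs:complexContent>
>  </xs:complexType>
>
>  <!-- Head of the substitution group -->
>  <xs:element name="Task" type="TaskType"/>
>  <!-- Member: may appear wherever Task is allowed -->
>  <xs:element name="TaskType1" type="TaskType1Type"
>              substitutionGroup="Task"/>
></xs:schema>
>```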
>
>However, this kind of inheritance can often result in the availability of 
>elements that do not make sense in the context of certain leaf level 
>subtypes.  And in general, depending on the stage in a business process, 
>XML transactions with very similar content in fact have different rules 
>concerning which elements are required, optional or prohibited.  Then 
>everyone wants to start applying restrictions and the first instinct is to 
>screw everything to the floor by applying XSD's validation features to the 
>extreme, especially your DBA folk. This is where the xsd:restriction 
>headaches begin.
>
>1) They are a maintenance liability because you have to spell out 
>everything you want to allow again, except for attributes which are the 
>opposite because by default they are all permitted unless you specifically 
>prohibit them (I did once convince myself that that made some sense but 
>I've forgotten why!).
>
>2) You cannot apply both extensions and restrictions in a single step, 
>forcing the creation of intermediate artifacts that are not intended for 
>use by anyone and serve only to confuse and sometimes alarm people you 
>thought were your closest friends (a slight change in syntax could avoid 
>this by allowing both restriction and extension elements within a 
>complexType definition - I'm not asking for any change in the underlying 
>rules, just the avoidance of the useless intermediate type).
>
>3) Creating partner-specific variants of a standard community schema 
>involving restrictions against deeply nested data models is hopelessly 
>impractical.  The resulting schema cuts the original to ribbons and is 
>barely still recognizable.
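>
>Points 1) and 2) can be seen in a small sketch.  The names here 
>(TaskWithoutNotes, TaskType2Type, Name, Notes, Detail2, priority) are 
>hypothetical.  To drop the optional Notes element and add Detail2, the 
>content model has to be restated in a restriction, the unwanted attribute 
>has to be explicitly prohibited, and an intermediate type is needed before 
>the extension can be applied:
>
>```xml
><xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
>  <xs:complexType name="TaskType">
>    <xs:sequence>
>      <xs:element name="Name" type="xs:string"/>
>      <xs:element name="Notes" type="xs:string" minOccurs="0"/>
>    </xs:sequence>
>    <xs:attribute name="priority" type="xs:int"/>
>  </xs:complexType>
>
>  <!-- Intermediate type: exists only so that it can be extended below -->
>  <xs:complexType name="TaskWithoutNotes">
>    <xs:complexContent>
>      <xs:restriction base="TaskType">
>        <!-- everything still allowed must be spelled out again -->
>        <xs:sequence>
>          <xs:element name="Name" type="xs:string"/>
>        </xs:sequence>
>        <!-- attributes are kept by default, so removal is explicit -->
>        <xs:attribute name="priority" use="prohibited"/>
>      </xs:restriction>
>    </xs:complexContent>
>  </xs:complexType>
>
>  <!-- The type actually wanted: restricted base plus new content -->
>  <xs:complexType name="TaskType2Type">
>    <xs:complexContent>
>      <xs:extension base="TaskWithoutNotes">
>        <xs:sequence>
>          <xs:element name="Detail2" type="xs:string"/>
>        </xs:sequence>
>      </xs:extension>
>    </xs:complexContent>
>  </xs:complexType>
></xs:schema>
>```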
>
>This is the point where you say, "Y'know there is no way W3C Schema can 
>apply all the validation your programmers will demand, like co-occurrence by 
>value, so is it really worth doing all this when it will be largely 
>duplicated by the receiving programs anyway?"
>
>So what is the answer?  Relaxed common schemas with context-specific rules 
>applied by Schematron-generated XSLT stylesheets.  At least that was the 
>conclusion OAGIS came to when they switched from DTDs to XSDs, and they 
>posted their disenchantment with xsd:restriction in this very forum.  How 
>many years ago was that?  Furthermore, how many times have Jeni, Mike and 
>others ended their responses with words like "...and you may want to 
>investigate other technologies such as Schematron to achieve what you want."
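>
>As a rough illustration of context-specific rules layered over one relaxed 
>schema, Schematron phases let each stage of a business process activate 
>its own pattern.  The sketch assumes the ISO Schematron namespace; the 
>Order and ShipDate elements and the phase names are hypothetical.
>
>```xml
><sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron">
>  <!-- One phase per stage of the business process -->
>  <sch:phase id="request">
>    <sch:active pattern="request-rules"/>
>  </sch:phase>
>  <sch:phase id="confirmation">
>    <sch:active pattern="confirmation-rules"/>
>  </sch:phase>
>
>  <!-- In a request the element is prohibited -->
>  <sch:pattern id="request-rules">
>    <sch:rule context="Order">
>      <sch:report test="ShipDate">
>        A request must not carry a ShipDate.
>      </sch:report>
>    </sch:rule>
>  </sch:pattern>
>
>  <!-- In a confirmation the same element is required -->
>  <sch:pattern id="confirmation-rules">
>    <sch:rule context="Order">
>      <sch:assert test="ShipDate">
>        A confirmation must carry a ShipDate.
>      </sch:assert>
>    </sch:rule>
>  </sch:pattern>
></sch:schema>
>```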
>
>But there is resistance because the project leaders say, "But Schematron 
>isn't W3C, and you want *another* validation phase?"   But this is where I 
>came in.
>
>I apologize in advance, for I am bound to be stepping on someone's corns 
>when I ask, might I be forgiven for suggesting that perhaps you guys on the 
>Schema WG just haven't been listening?
>
>Cheers
>
>Jack Lindsey
>
>
>
>
>>From: noah_mendelsohn@us.ibm.com
>>To: Bryan Rasmussen <brs@itst.dk>
>>CC: "'Michael Kay'" <mike@saxonica.com>,        
>>"'petexmldev@tech-know-ware.com'" <petexmldev@tech-know-ware.com>,        
>>"'xmlschema-dev@w3.org'" <xmlschema-dev@w3.org>
>>Subject: Re: SV: SV: SV: SV: Schema help
>>Date: Fri, 18 Nov 2005 09:39:31 -0500
>>
>>
>>I  understand, and although I don't speak officially for the workgroup, I
>>want to be sure you feel that your suggestions are being heard.  One thing
>>that would help, if you have not already done so, would be to mail this
>>suggestion to www-xml-schema-comments@w3.org, which is the official
>>comments list for the schema specification.  We formally review the
>>comments received at that list, and we either open new trackable issues or
>>ensure that issues we are already tracking cover them.  Please make clear
>>that you are specifically endorsing Schematron as a solution, as otherwise
>>the WG might view this as just another request for some form of
>>co-constraints, and that's been a tracked request for some time.  I'm also
>>copying David Ezell, our WG chair, on this reply.   Thank you very much.
>>
>>--------------------------------------
>>Noah Mendelsohn
>>IBM Corporation
>>One Rogers Street
>>Cambridge, MA 02142
>>1-617-693-4036
>>--------------------------------------
>>
>>
>>
>>
>>
>>
>>
>>
>>Bryan Rasmussen <brs@itst.dk>
>>11/18/2005 03:31 AM
>>
>>         To:     "'noah_mendelsohn@us.ibm.com'"
>><noah_mendelsohn@us.ibm.com>
>>         cc:     "'petexmldev@tech-know-ware.com'"
>><petexmldev@tech-know-ware.com>, "'xmlschema-dev@w3.org'"
>><xmlschema-dev@w3.org>, "'Michael Kay'" <mike@saxonica.com>
>>         Subject:        SV: SV: SV: SV: Schema help
>>
>>
>>
>>Well on the subject of co-occurrence constraints I would just like to
>>reiterate what I said earlier, with some extension:
>>
>>Given that there is likely to be some argument in W3C as to how far such
>>constraints should be implemented, I doubt they will come out as powerful
>>as Schematron constraints.  Furthermore, I have a hard time seeing this as
>>producing a syntax as nice as Schematron.  Therefore I would really like to
>>see something like:
>>
>>1. XML Schema adopts Schematron as an extension language of some sort.
>>2. XML Schema puts some thought into how Schematron can be combined with
>>XML Schema to the benefit of both, beyond the normal method of dropping
>>Schematron in appinfo.
>>
>>I have some ideas on #2, but I'm somewhat conflicted about them - what
>>model makes sense, syntax etc. - so I don't really want to just blurt them
>>out.  I'd be more interested in hearing what kinds of things other people
>>could see connecting the two languages.
>>
>>Cheers
>>Bryan Rasmussen
>>
>>-----Original Message-----
>>From: noah_mendelsohn@us.ibm.com [mailto:noah_mendelsohn@us.ibm.com]
>>Sent: 17 November 2005 18:51
>>To: Bryan Rasmussen
>>Cc: 'Michael Kay'; 'petexmldev@tech-know-ware.com';
>>'xmlschema-dev@w3.org'
>>Subject: Re: SV: SV: SV: Schema help
>>
>>
>>Well, I think there are good reasons from time to time to revisit the
>>effectiveness of the W3C process and the compromises embodied therein. I'm
>>not convinced that a deep dive on that is the best use of this particular
>>mailing list.   I happen to like the working groups I've been on that do
>>their work in public (in my case, both the TAG and XMLP) and I'd be happy
>>for schema to go the same way.  Then again, I really don't think that's a
>>substitute for having people who have 30% of their time committed to
>>working on a technology.  There's a lot of detail work and care required
>>to revise a specification even if there's agreement on the general ideas.
>>The discussions need to involve people who have the knowledge and the time
>>commitment to work through interactions with existing features of the
>>specification.  In the case of co-constraints, it would seem to me that
>>there ought to be a careful look taken at the relationship between the
>>existing key/keyref/unique constraint mechanisms and anything new that's
>>proposed.  It would be nice to believe that we wouldn't just be sprouting
>>new and uncoordinated ways of doing things every few years.
>>
>>So, I personally welcome broader input, but what we're really short of are
>>the people who can edit the specification text, draft prose, be
>>responsible for the details, etc.  Of course, there are also lots of other
>>messy issues to consider when you change the working mode of a group
>>including anti-trust laws in various jurisdictions, IP issues, etc.  If
>>people feel that they have ideas for how the W3C can do these things
>>better, I think the right place to go would be to the W3C staff and/or the
>>workgroup chairs.  I personally would not be against having the schema WG
>>switch to using a public mailing list for its discussions.  I suspect that
>>requires a recharter, but in principle I'm fine with it.  I don't think
>>that will solve much of our resource problems.  We don't lack for people
>>with good ideas, in email or in person.  We're missing the people to do
>>the architecture and drafting work that goes into making all the details
>>fit together.   It's hard to do that well without meeting F2F from time to
>>time.
>>
>>--------------------------------------
>>Noah Mendelsohn
>>IBM Corporation
>>One Rogers Street
>>Cambridge, MA 02142
>>1-617-693-4036
>>--------------------------------------
>>
>>
>>
>>
>>
>>
>>
>>
>>Bryan Rasmussen <brs@itst.dk>
>>Sent by: xmlschema-dev-request@w3.org
>>11/17/2005 04:59 AM
>>
>>         To:     "'Michael Kay'" <mike@saxonica.com>
>>         cc:     "'xmlschema-dev@w3.org'" <xmlschema-dev@w3.org>,
>>"'petexmldev@tech-know-ware.com'" <petexmldev@tech-know-ware.com>, (bcc:
>>Noah Mendelsohn/Cambridge/IBM)
>>         Subject:        SV: SV: SV: Schema help
>>
>>
>>
>>Damn, an earlier typo in the email address of Pete Cordell added in by me
>>was replicated in your email. Just on the off chance this thread goes any
>>further I thought I should correct it. I've cc'ed Pete on this mail. Sorry
>>for the problem.
>>
>>Cheers,
>>Bryan Rasmussen
>>
>>-----Original Message-----
>>From: Michael Kay [mailto:mike@saxonica.com]
>>Sent: 17 November 2005 10:47
>>To: noah_mendelsohn@us.ibm.com; Bryan Rasmussen
>>Cc: xmlschema-dev@w3.org; ',petexmldev@tech-know-ware.com'
>>Subject: RE: SV: SV: Schema help
>>
>>
>>
>> > 1) Although most widely used schema validators are fairly slow, one
>> > can in fact implement the XML schema rules at quite high speed.  My
>> > team is hoping to publish some work in that area in coming months,
>> > and I suspect that others in the industry are working in the same
>> > direction.  I think it's important to the success of any technology
>> > we choose that it be able to meet the performance needs of our
>> > customers.
>>
>>I would resist this kind of thinking. SQL was successful because it put
>>functionality first, and left implementors to devise optimisation
>>strategies. Users need a constraint language that is capable of expressing
>>arbitrary constraints on the content of a document, and it should be left
>>to the implementor to work out which of these constraints can be evaluated
>>in streaming mode and which can't.
>>
>>SQL today allows the full power of the query language to be used to
>>express integrity constraints, and users learn when they need to restrict
>>their ambitions to meet performance requirements. 90% of applications
>>aren't performance critical anyway.
>>
>>There's no point telling users to go and use some other technology to do
>>their validation; the other technology isn't going to be fast either.
>>
>>Michael Kay
>>
>>
>>
>>
>>
>>
>>
>
>

Received on Tuesday, 22 November 2005 22:54:11 UTC