- From: <noah_mendelsohn@us.ibm.com>
- Date: Fri, 22 May 2009 12:02:36 -0400
- To: Rick Jelliffe <rjelliffe@allette.com.au>
- Cc: www-tag@w3.org, Paul Downey <paul.downey@bt.com>
Rick Jelliffe writes:

> I would say that in essence XSD 1.1 ignores the main issue from
> the workshop.

Well, the W3C as a whole did not ignore that. As you are probably aware, a working group [1] was chartered to identify a profile, and they tried [2], having produced a working draft in 2007 that appears not to have been updated since. I think it's fair to say that, during the period the group was most active, they struggled to get a critical mass of people to contribute. The Schemas working group provided detailed comments [3] on that draft (including a few editorial comments from me personally). Frankly, I think the databindings WG fell into the trap of making a taxonomy of lots of little schema features, without successfully zeroing in on the layerings that would be meaningful to users and to achieving the deeper modularity that I think you're looking for.

My point is: after the workshop, the responsibility for suggesting a profile or modularization was given not primarily to the XSD group, but to the Databindings group. It was noted by many that for all the talk about the importance of the profile (and I believe that done right it would have been very useful to the databinding community), relatively few people were willing to commit to do the work. So, that modularization effort foundered, as best I can tell, not due to lack of followup by W3C, but due to lack of investment by those who would have had to provide resources to do the work (though some organizations did invest, and those people worked hard, for which I think everyone is grateful).

I think it's also worth noting that in my informal discussions with people from the databinding community, it's not at all clear that the profiles they wanted were the ones that either you or I would consider interesting to make XSD cleaner.
As I recall, the two features that came in for the most regular criticism on XSD from that community were:

* mixed content
* <xsd:choice>

<xsd:choice> is, of course, the direct analog of "|" in DTDs, and mixed content is actually a feature of XML (Schemas got blamed because it was in the schema that the databinding community would become aware of the presence of mixed content). While there are many possible legitimate criticisms of undue complexity in XSD, I doubt that either of us considers "choice" or mixed content to be prime examples.

In summary: the workshop conclusion on modularity and profiles was not ignored, but was considered sufficiently important to merit a working group, which was chartered. As David has indicated, the XSD group proceeded with its responsibility to add the other enhancements that were requested by attendees. I don't think all of that history provides very good justification for delaying XSD 1.1 now.

Noah

P.S. There's a real risk that I've misrepresented some details of the Databindings history, so I'm cc:'ing Databindings WG chair Paul Downey. Paul, this email is in the context of a long and somewhat fraught email thread resulting from a proposal from Rick Jelliffe that XSD 1.1 be held at the Candidate Recommendation stage while an attempt is made to factor the specification in a more modular way. Before commenting, you should probably read the whole thread on www-tag :-(. As a start, Rick's original request is at [4] and a clarification from him is at [5]. Many of us have also commented in that thread and perhaps others.
[1] http://www.w3.org/2002/ws/databinding/
[2] http://www.w3.org/TR/2007/WD-xmlschema-patterns-20071031/
[3] http://lists.w3.org/Archives/Public/public-xsd-databinding-comments/2008Feb/0000.html
[4] http://lists.w3.org/Archives/Public/www-tag/2009May/0021.html
[5] http://lists.w3.org/Archives/Public/www-tag/2009May/0063.html

--------------------------------------
Noah Mendelsohn
IBM Corporation
One Rogers Street
Cambridge, MA 02142
1-617-693-4036
--------------------------------------

Rick Jelliffe <rjelliffe@allette.com.au>
Sent by: www-tag-request@w3.org
05/21/2009 12:37 AM

To: www-tag@w3.org
cc: (bcc: Noah Mendelsohn/Cambridge/IBM)
Subject: Five mechanical approaches to make an XSD profile without getting bogged by individual issues

David Exell wrote:

In essence, XML Schema 1.1 addresses the issues from the workshop.

I would say that in essence XSD 1.1 ignores the main issue from the workshop. When I look at the Chair's report (linked to by David) I read:

> There was significant support for the idea of a written ‘profile’ of XML Schema
> which would document the sweet spot for purposes of data binding, or for other
> specific domains. The word /profile/ is problematic; what was meant was not a
> language subset, but only a definition of the sweet spot in existing processors,
> which would allow schema authors to get better results and better user experience
> when data binding tools are used, and which would tell implementors in the
> relevant domain which parts of schema users are most likely to expect them to
> support well.

> There was strong sentiment against publishing any profiles which would restrict
> or reduce the XML Schema 1.0 specification, impacting existing implementations
> or vocabularies.

My request for a profile does not reduce or restrict or otherwise define what is in full XML Schema 1.n.

So what happened to this "significant support"?
The key is in the next paragraph:

> There appeared to be no obvious way to split the XML Schema specification
> into layers or sub-languages, as with OWL Lite, DL and Full or SVG Tiny,
> Basic and Full. Accordingly, there was no support for trying to define profiles
> of XML Schema as part of the schema language itself. However, many people
> saw value in application or domain specific 'profiles', in particular identifying
> a set of schema patterns to provide a 'good user experience' when using
> XML Schema 1.0 to bind XML to code or data models.

And how long was this discussion that decided that there was "no obvious way"? Well, the formal discussion on this seems to have occupied 15 minutes of time, which ran out before discussions had finished. Indeed, as far as I can see, no straw proposals were asked for, raised, considered or dispatched.

For my rather immoderate response to that event, see my blog item from the time: Snow Season in Schemaland
http://blogs.oreilly.com/digitalmedia/2005/07/snow-season-in-schemaland.html

So what are obvious ways? Here are five:

-----------------------------------------------------------------------------------
1) Exchange model

One of the biggest early success stories in vendor-cooperative standards setting was the OASIS CALS Exchange Table Model: now it is part of history though it has influenced all subsequent table models since. Michael, David, Norm and the other old-timers will certainly remember it.

The military CALS table model was based on going through all the tables in the archive and making a schema (DTD) that could cope with them all. It supported lots of fancy things (tables on call-out pages with different page size, etc). Most vendors could only support a subset. So they got together, and rather than dispute each feature, they agreed on an algorithm: where almost all vendors supported a feature, it would be kept and the vendors would agree to support it; otherwise it would be dropped.
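As a rough illustration only, the selection rule just described (keep a feature where almost all vendors support it, drop it otherwise) can be sketched in a few lines of Python. Everything specific here is an assumption made for the sketch: the function name select_profile, the vendor and feature names, and in particular the numeric cut-off standing in for "almost all", which the text above does not quantify.

```python
from typing import Dict, Set


def select_profile(support: Dict[str, Set[str]], threshold: float = 0.9) -> Set[str]:
    """Return the features implemented by at least `threshold` of the vendors.

    `support` maps a vendor name to the set of features that vendor implements.
    The default 0.9 cut-off is an assumption standing in for "almost all".
    """
    if not support:
        return set()
    vendor_count = len(support)
    all_features = set().union(*support.values())
    kept = set()
    for feature in all_features:
        supporters = sum(1 for feats in support.values() if feature in feats)
        if supporters / vendor_count >= threshold:
            kept.add(feature)
    return kept


# Hypothetical survey data: vendors and features are invented for illustration.
SUPPORT = {
    "vendor_a": {"sequence", "choice", "mixed", "redefine"},
    "vendor_b": {"sequence", "choice", "mixed"},
    "vendor_c": {"sequence", "choice"},
    "vendor_d": {"sequence", "choice", "mixed"},
}

print(sorted(select_profile(SUPPORT, threshold=0.75)))
```

The point of the sketch is that the outcome follows from the support data and the agreed cut-off alone; no feature is argued on its individual merits.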
There are now several profiles out: the W3C databinding minimum and maximum, the WS-I profile, the UN profile, etc. An algorithmic approach like the CALS approach could be used.

-----------------------------------------------------------------------------------
2) Modularity model

Chop the 250 page Structures plus the datatypes specs into different severable parts:

1) Grammars and particles
1a) Additional constraints
2) Key and uniqueness
3) Assertions
4) Built-in Datatypes
5) Schema location and assembly
6) Complex type derivation and assembly
7) Simple derivation
8) Dynamic schema constructs: xsi:nil, xsi:type, version selection
9) PSVI

and encourage implementers to implement fully each part that they implement.

-----------------------------------------------------------------------------------
3) Set-based selection

1) Start with a private syntax for ISO/OASIS RELAX NG using XSD-namespace elements. (Call it RELAXSD) This gives a solid theoretical basis and proven capabilities with little work.

2) Create an extra layer of syntax and semantic checking on RELAXSD (Call it XSD Lite and Tite) to implement the appropriate rules of XSD 1.n and remove patterns specified in the maximum W3C databinding note.

3) Adjust RELAXSD to remove any syntax that is removed by XSD Lite and Tite if necessary. (Call this XSD Lite)

The result:

* all XSD Lite documents can be trivially converted to RELAX NG
* all XSD Lite and Tite documents are conforming XSD 1.n documents
* all XSD Lite and Tite documents are usable by XSD Lite systems.

XSD Lite would meet the needs of those for whom ambiguity is not an issue. XSD Lite and Tite would meet the needs of those for whom ambiguity is an issue. Both would be fairly equivalent to DTDs with simple types. Neither would use the bogus complex type *derivation* apparatus, though they certainly could be declared as a name binding to a complexType, and they could be imported and used in a full XSD 1.n system that had complex typing.
-----------------------------------------------------------------------------------
4) Resolved schemas

Many of the features of XSD are syntactic sugar. They may be useful for modelling, but they do not actually add any expressive power. And they come at a heavy cost.

A resolved schema would be one in which a full XSD 1.n schema had been re-written to remove syntactic sugar (such as element substitution and complex type derivation by extension) and modeling items (such as complex type derivation by restriction and abstract elements).

In fact, this is how the RELAX NG specification is written: first a transform to resolve the sugar and then formal description of the remaining core. It is also how I implemented my XSD validator, which converts to Schematron.

-----------------------------------------------------------------------------------
5) Schema versus Instance validation

When implementing XSD it becomes obvious that there are two very different kinds of constraints involved. They can be seen starkly in the test suite: some tests require an instance, some do not.

The specification could be refactored into two parts:

1) Validation that the XSD schema is correct
2) Validation that an instance is correct against the XSD schema.

For example, my implementation largely assumes that the schema is correct. This represents a major simplification in the work involved. For example, I suggest that of the implementers who require UPA, there are many who would prefer not to (and perhaps don't) check the schema for UPA and just rely on runtime violations if any.

-----------------------------------------------------------------------------------
6) Implementation caused

Create a profile which removes any features that have been shown to have caused implementation problems. The W3C databinding profiles are relevant, though the metric is not "what has been implemented?" but "what has been implemented badly/with difficulty/wrongly/abandoned?" I.e.
some features may be abandoned because of mismatch with an underlying model: this is no reason to ditch them under this method. But a feature that perhaps was needed and missed the mark could be ditched. (This is more like my original suggestion in my submission to the W3C Workshop.)

I think this method is now superseded by events and information and does not need to be considered.

------------------------------------------------------------------------------------

Each of these 5 methods would allow a mechanical split/layering/refactoring/profiling of the standard. They obviously each have their pros and cons (I would be happy with any of them). And the shape of the final profile would be pretty much determined and knowable upfront by the mechanism chosen: the judgements are not matters of expertise or subjectivity.

I would like to note that I do not believe the XSD WG has ever called for submissions on how to refactor or profile the spec. So I don't believe that they have indeed addressed a main issue from the W3C Workshop. In fact, the birthing baby has gone straight into the too-hard basket without even a slap, to mix metaphors.

Cheers
Rick Jelliffe
Received on Friday, 22 May 2009 16:01:23 UTC