Re: XML Schema usage statistics (WAS: Draft minutes of 2009-05-12 TAG weekly)

(I'm writing here as a TAG member, not as chair)

Ashok Malhotra wrote:

> I have some trepidation about this line of reasoning which would 
> seem to be: XML Schema is widely used, therefore it is good and 
> should continue!

I have some trepidation about what's going on here, but for somewhat 
different reasons.  After XSD 1.0 went to Recommendation, the XML Schema 
Working group was rechartered (more than once, FWIW).  Here are some 
quotes from the latest charter [1], under which the Candidate 
Recommendation [2,3] has been published:

"The XML Schema working group will maintain and revise the XML Schema 
specification developed beginning in 1998 and published as a W3C 
Recommendation on 2 May 2001. "

[...]

"Goals: to finish publication of version 1.1 of the XML Schema 
Recommendation, which corrects known errors and makes modest improvements 
to the language, and do preparatory work for a possible version 1.2. 
Changes in function or syntax incompatible with XML Schema 1.0 have been / 
will be made only if the resulting improvements compellingly justify the 
loss of interoperability with existing systems and documentation. Some 
substantive changes have been made in the interests of aligning version 
1.1 with the needs of the XML Query 1.0, XPath 2.0, and XSLT 2.0 family of 
specifications and with XML 1.1. Requests for substantive changes may also 
come from other groups. [plus other goals not quoted]"

Now, spurred by Rick Jelliffe's request [4], we're asking a question that 
boils down to:  "shouldn't the W3C cancel this effort to provide 
incremental improvements to Schema 1.0, and instead start on a new, 
cleaner language?"

I strongly believe that this is a question that should have been settled, 
and indeed was settled, when the working group was chartered with the 
above goals.  The charter very clearly says:  build on the XSD 1.0 base, 
and to the extent possible, retain syntactic compatibility. (There is a 
later goal that allows for experimentation with new syntax too, but that's 
in addition to, not instead of, enhancing the existing syntax; nowhere is a 
brand new language core discussed.)  The W3C membership has every 
opportunity to provide guidance on the content of such charters, and the 
time to consider proposals for a new language would have been when the 
charters were written.

To change the goals of an effort like this now is not only counter to the 
letter of the W3C process, it's hugely disruptive both in this particular 
case and as a precedent.  There's no way that people like me are going to 
devote years to working in the W3C, toward agreed goals, if at the end we 
say: never mind, those weren't the goals. 

Rick raises some interesting and important technical points about XSD.  No 
doubt it has shortcomings, though I don't necessarily agree with all that 
he lists.  I also think XSD has some strengths, which he tends to 
de-emphasize, and FWIW the other languages have their own shortcomings, 
but my point here is not to claim that XSD is better, or that if I were 
starting from scratch I might not look very hard at just the technical 
direction that Rick proposes.  The fact is that most of these concerns 
have been understood in general for a long time and Rick among others has 
raised them for a long time.  Most of them were understood when the 
decision was made to create a charter that would focus on improving the 
experience of the many users who have adopted the W3C XML Schema 
Recommendation.  We, the W3C, decided to invest in maintaining and 
enhancing a Recommendation that was being widely adopted.

Indeed, evidence is clear that there is very widespread use of XSD, 
arguably extraordinarily widespread use of XSD, and so lack of adoption is 
in no way a reason to revisit the charter goals now.  That is, IMO, the 
only reason we are in this thread considering relative rates of adoption 
of these languages at all.  The fact is that XSD 1.0 is very widely used, 
and XSD 1.1 is designed to make the language more valuable for the many 
users who have invested in it.  XSD 1.0 is also a W3C Recommendation, and 
while I have no problem with the W3C considering alternative languages on 
the merits from time to time, the presumption should be that we support 
and maintain our Recommendations, and that we honor our agreed charters.

Noah

P.S. The question of how widely each schema language is used remains an 
interesting one, and if I turn up any useful facts based on my inquiries 
in IBM [5] I will pass them on.  So far, all the evidence I've seen 
suggests that XSD is more widely used than the other languages by many 
measures, and by quite a significant margin, though there are interesting 
communities that strongly prefer RelaxNG and/or Schematron.  There appear 
to be more .xsd documents accessible on the Web; I believe XSD is used by 
more widely-deployed tooling; and preliminary investigations suggest that 
XSD is used normatively by more "vertical" XML standards (some using XSD 
alone, and some using XSD+Schematron) than the alternatives. Of course, 
XSD also forms the type system for W3C XML Query, XSLT 2.0, and XPath 2.0. 
As I say, I'm still trying to check the facts on those adoption claims, 
and I'll pass on what I can. 

[1] http://www.w3.org/2006/06/XML/schema-wg.html
[2] http://www.w3.org/TR/2009/CR-xmlschema11-1-20090430/
[3] http://www.w3.org/TR/2009/CR-xmlschema11-2-20090430/
[4] http://lists.w3.org/Archives/Public/www-tag/2009May/0021.html
[5] http://lists.w3.org/Archives/Public/www-tag/2009May/0046.html

--------------------------------------
Noah Mendelsohn 
IBM Corporation
One Rogers Street
Cambridge, MA 02142
1-617-693-4036
--------------------------------------

ashok malhotra <ashok.malhotra@oracle.com>
Sent by: www-tag-request@w3.org
05/18/2009 07:05 PM
Please respond to ashok.malhotra
 
        To:     "T.V Raman" <raman@google.com>
        cc:     www-tag@w3.org, (bcc: Noah Mendelsohn/Cambridge/IBM)
        Subject:        Re: XML Schema usage statistics (WAS: Draft minutes of 2009-05-12 TAG weekly)


I have some trepidation about this line of reasoning which would seem to 
be:
XML Schema is widely used, therefore it is good and should continue!

I think we need to ask some more nuanced questions.  For example:
1. Clearly all the statistics are based on Schema 1.0.  Are the 
additions in 1.1 beneficial, necessary or excess baggage?
Should the Schema WG be rechartered to add yet more features?
2. Is there a core subset of features in XML Schema that is heavily used 
and can be isolated?  If so, should we consider a profile?

I'm sure you smart folks can think of other good questions!
All the best, Ashok


T.V Raman wrote:
> It would also be enlightening to find out how many of those XSD
> files were generated from rng/ files. I know for a fact that many
> groups inside W3C  routinely produce their obligatory xsd schema
> for their specs by first creating rng files.
>
> Julian Reschke writes:
>  > Paul Cotton wrote:
>  > > From the draft May 12 TAG minutes:
>  > > 
>  > >> raman: XML Schema hasn't worked out very well. I'm skeptical that it
>  > > really dominates
>  > > ...
>  > >> timbl: Skeptical about preponderance of XSD usage, would like to see some
>  > > figures
>  > >> noah: Any volunteers?
>  > >> (silence)
>  > > 
>  > > Searching Google code for .xsd files
>  > > (http://www.google.ca/codesearch?hl=en&lr=&q=file%3A.*%5C.xsd%24)
>  > > finds 44,800 files.
>  > > 
>  > > Searching Google code for .rng files
>  > > (http://www.google.ca/codesearch?hl=en&lr=&q=file%3A.*%5C.rng%24)
>  > > finds only 3,000 files.
>  > > 
>  > > Not necessarily a reliable survey but it certainly indicates that
>  > > in publicly visible code stores indexed by "Google code" .xsd file
>  > > occurrence is significantly greater than that of Relax NG files.
>  > > 
>  > > Personal opinion: I expect that the ratio in enterprise systems
>  > > whose code stores are not visible to a tool like "Google code" that this
>  > > ratio would be even more slanted towards XML Schema.
>  > > 
>  > > /paulc
>  > > ...
>  > 
>  > Plus ~1000 in RNC (Compact) format.
>  > 
>  > It would be interesting to have a comparison of the # of specifications
>  > that use XSD, RNC, or RNG as part of the spec text.
>  > 
>  > BR, Julian
>
> 

Received on Tuesday, 19 May 2009 01:25:57 UTC