Re: What would count as an unbiased survey?

Thank you, Robin.  I agree with many of your points.  One clarification:

Robin Berjon wrote:

> Or more precisely, that a lot of languages are defined using 
> XML Schema — the distinction being that in a fair number of 
> cases that I've seen the schema may be in the specification 
> but then no one ever uses it as part of the production chain.

Do you mean that XSD is not used in production for validation, or also 
that it's not used by tooling?

If you mean in production for validation, then I think that's mostly true, 
except insofar as some databinding tools do some sorts of input checking 
that are indirectly driven from schemas used to create the bindings. Also, 
it's my impression that XSD validation is often used for testing prior to 
production deployment, or perhaps for problem diagnosis.  I certainly 
don't have quantitative information to back that up.

If you mean that XSD isn't commonly used in production in 
conjunction with tooling that creates databindings, or that helps with the 
preparation of instance documents, I'm surprised.  Not that such use is 
universal, but I would have thought it was common, and one of the reasons 
that many vertical standards do use XSD for their normative schemas. Isn't 
this the typical way that schemas in WSDL are used, for example with 
Visual Studio, its Eclipse-based Java competitors, etc.?

Thank you.


Noah Mendelsohn 
IBM Corporation
One Rogers Street
Cambridge, MA 02142

Robin Berjon <>
Sent by:
05/29/2009 09:46 AM
        To:     "Henry S. Thompson" <>
        cc:, (bcc: Noah Mendelsohn/Cambridge/IBM)
        Subject:        Re: What would count as an unbiased survey?

On May 29, 2009, at 14:41 , Henry S. Thompson wrote:
> All of which raises the question: What _would_ constitute reliable
> evidence of frequency of usage of the four major schema languages
> (DTD, XSD, Relax NG, Schematron)?
> Note once again in closing that this is _not_ a "my language is better
> than your language because more people are using it" discussion, but
> rather an attempt to support the proposition that maintaining and
> improving W3C XML Schema is important for W3C because it has a
> substantial user community on (and off) the Web.

If a survey is made, a good model could perhaps be the State of the 
Web 2008 survey:

That being said, sorry if I'm a little thick, but I understand less and 
less what we would be trying to achieve with such a survey. As 
passionately as I may dislike XML Schema, there is little doubt that 
it is used a lot out there. Or more precisely, that a lot of languages 
are defined using XML Schema — the distinction being that in a fair 
number of cases that I've seen the schema may be in the specification 
but then no one ever uses it as part of the production chain.

Which brings me to my primary point: more than reflecting on what 
schema language should get what resources in times of constraint, 
shouldn't we take a step back, look at how XML is used today, and 
figure out how we would like to help and shape it over the next 
decade? I'm willing to bet that any serious take on this will shake 
out precisely what needs to be done around the schema issue — and will be 
more generally useful.

In my experience, an awful lot of users out there pick XML Schema 
either because they don't know about the alternatives, or because 
"it's the W3C thing". Since as often as not they don't actually put 
the schema to use — in fact it's not rare that their schemata would be 
seriously broken, even though better tools are reducing that issue — 
they don't really have an opinion on whether XML Schema or anything 
else is what they need. It's just there.

This is strongly tied to the misperception that when defining a 
language one absolutely needs to have a schema, and the joint 
misperception that having defined a schema, processing rules for 
user-agent and versioning strategies are naturally solved issues.

What I'm getting at here is that I believe that what we most need is 
some form of guidance — possibly in the form of best practices — for 
the usage of XML in various situations. Whether W3C is the best place 
to define these (given that they wouldn't produce a standard) is an 
exercise I'm happy to leave to another day — though an IG could 
perhaps work.

There's a lot of experience in defining XML languages from various 
groups, notably inside W3C. It would be great at the very least to 
document that. It seems to me that whenever I stumble into a new 
standardisation effort both inside W3C and outside (e.g. OMTP, OMA, 
3GPP, TV Anytime, MPEG...) I have to restate the very same things 
about picking the schema language that best fits based on genuine use 
cases and requirements rather than "just because", on the fact that a 
schema is rarely enough for validation, on versioning strategies, on 
user-agent error-handling, on processing models... and my throat feels 
rather raw from all that typing, especially since it usually takes a 
few rounds of discussion before people start to understand the 
problem. I'd love to have something to point at instead.

I took a first semi-serious stab at it for the latest XML Prague (PDF: 
  , slides + video:

  and it was well received — in fact I'm still getting questions about 
it even from people who weren't there. I have neither the time, the 
resources, nor the experience to elaborate on that in a solid, serious 
manner that will cover enough of the XML-using community — but I think 
that identifying problematic usage patterns will get us closer to 
something widely useful than figuring out which schema language is 
used most. The broader view on XML usage ought to shake out what 
really needs resources.

Robin Berjon -

     Feel like hiring me? Go to

Received on Friday, 29 May 2009 14:13:34 UTC