What would count as an unbiased survey?

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

In the TAG meeting of 28 May, Tim Berners-Lee criticised the
admittedly _ad-hoc_ methodology of the evidence I presented there [1]
on the question of how widely W3C XML Schema is used relative to other
schema languages.

I have just tried another, equally ad-hoc, approach:

Google says there are 

  approx 255000 hits for

     training "xml schema"

 and

  approx 248000 hits for

     training "xml schema" -relax

 and

  approx 14600 hits for

      training "relax ng"

 and

  approx 12100 hits for

      training "relax ng" -"w3c xml schema"

which suggests that the Web thinks there's a bigger market for
training in W3C XML Schema than in Relax NG, by roughly an order of
magnitude, but I expect there are flaws in that too. . .

Google finds

  approx 24300 hits for "XML Schema Part 1: Structures"

and

  approx 2920 hits for "Relax NG Specification"

but that's looking in a rather different direction. . .
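
For what it's worth, the ratios implied by the figures above can be
checked with a few lines of Python.  This is only a sketch: the counts
are the approximate Google figures quoted above, taken at face value,
and the dictionary keys and variable names are purely illustrative.

```python
import math

# Approximate Google hit counts as quoted above (not re-measured).
hits = {
    "xsd_training": 255000,            # training "xml schema"
    "xsd_training_only": 248000,       # training "xml schema" -relax
    "rng_training": 14600,             # training "relax ng"
    "rng_training_only": 12100,        # training "relax ng" -"w3c xml schema"
    "xsd_spec_title": 24300,           # "XML Schema Part 1: Structures"
    "rng_spec_title": 2920,            # "Relax NG Specification"
}

# Ratio of XSD hits to Relax NG hits for each pair of queries.
training_ratio = hits["xsd_training"] / hits["rng_training"]
exclusive_ratio = hits["xsd_training_only"] / hits["rng_training_only"]
spec_ratio = hits["xsd_spec_title"] / hits["rng_spec_title"]

for name, r in [("training", training_ratio),
                ("training, other language excluded", exclusive_ratio),
                ("spec title", spec_ratio)]:
    # log10 of the ratio: a value near 1 means about one order of magnitude.
    print(f"{name}: ratio {r:.1f}, log10 {math.log10(r):.2f}")
```

All three ratios come out somewhere between about 8 and about 20, so
"an order of magnitude" is a fair summary of these particular numbers,
whatever one thinks of how they were obtained.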

I agree with what Larry Masinter said, that there is no problem with
multiple specifications in this area, and I admire both Relax NG and
Schematron, and still use DTDs as well.  The _only_ reason for
pursuing this question is to rebut the proposition, often advanced but
never, to my knowledge, substantiated, that W3C XML Schema is not
used very much, and that delaying the next version is therefore not a
big deal.

Tim BL says, in the above-cited minutes:

  But what about private use behind firewalls?

In an earlier email Paul Cotton cited some Google code figures (again
showing roughly an order of magnitude imbalance towards XSD) and then
said:

  Personal opinion: I expect that the ratio in enterprise systems
  whose code stores are not visible to a tool like "Google code" that
  this ratio would be even more slanted towards XML Schema.

All of which raises the question: What _would_ constitute reliable
evidence of frequency of usage of the four major schema languages
(DTD, XSD, Relax NG, Schematron)?

Note once again in closing that this is _not_ a "my language is better
than your language because more people are using it" discussion, but
rather an attempt to support the proposition that maintaining and
improving W3C XML Schema is important for W3C because it has a
substantial user community on (and off) the Web.

ht

[1] http://www.w3.org/2001/tag/2009/05/28-minutes.html#item05
- -- 
       Henry S. Thompson, School of Informatics, University of Edinburgh
                         Half-time member of W3C Team
      10 Crichton Street, Edinburgh EH8 9AB, SCOTLAND -- (44) 131 650-4440
                Fax: (44) 131 651-1426, e-mail: ht@inf.ed.ac.uk
                       URL: http://www.ltg.ed.ac.uk/~ht/
[mail really from me _always_ has this .sig -- mail without it is forged spam]
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.6 (GNU/Linux)

iD8DBQFKH9fvkjnJixAXWBoRAu7nAJ4iVB9hBOHjdtbg5CSgbIWBuprHEgCfToLW
mgi+UEWw+d2bHtdNF2Ab2Tc=
=ZBLI
-----END PGP SIGNATURE-----

Received on Friday, 29 May 2009 12:41:52 UTC