Re: Study of XML quality on the Web from Liam R E Quin on 2011-11-20 (xmlschema-dev@w3.org from November 2011)

From: Liam R E Quin <liam@w3.org>
Date: Sat, 19 Nov 2011 20:30:26 -0500
To: Michael Kay <mike@saxonica.com>
Cc: xmlschema-dev@w3.org, Jeni Tennison <jeni@jenitennison.com>
Message-ID: <1321752626.12601.66.camel@desktop.barefootcomputing.com>

On Sat, 2011-11-19 at 22:31 +0000, Michael Kay wrote:
> On 19/11/2011 17:27, Noah Mendelsohn wrote:
> > I'd think readers of this list would be interested in a very nice 
> > study of the quality of XML documents on the Web. [1] The study was 
> > done by Steven Grijzenhout and Maarten Marx.

It's nicely presented but I agree with Michael Kay that some essential
information is missing: what proportion of the documents were XHTML/HTML
and/or RSS?

In addition, as Michael Kay again noted, it's not an error for a
schemaLocation hint to fail to resolve; furthermore, it's common
(normal, in fact) for an XML document to be valid against a schema (XSD
or RNG) without actually mentioning that schema explicitly.
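
To illustrate the schemaLocation point, here is a minimal Python sketch. It
assumes a made-up document and a made-up schema URL (example.invalid never
resolves), and the helper name schema_location_hints is hypothetical. It only
shows that a non-validating parser treats xsi:schemaLocation as an ordinary
attribute and never fetches it; it does not perform schema validation.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"

# Hypothetical document: the schema URL is invented and will not resolve.
DOC = """<note xmlns="http://example.org/note"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://example.org/note http://example.invalid/note.xsd">
  <body>hello</body>
</note>"""


def schema_location_hints(xml_text):
    """Return (namespace, location) pairs from the root's xsi:schemaLocation."""
    # Parsing checks well-formedness only; the hinted schema is never fetched.
    root = ET.fromstring(xml_text)
    tokens = root.get("{%s}schemaLocation" % XSI, "").split()
    # The attribute value is a whitespace-separated list of namespace/URL pairs.
    return list(zip(tokens[0::2], tokens[1::2]))


if __name__ == "__main__":
    print(schema_location_hints(DOC))
```

A document with no xsi:schemaLocation at all parses just the same and yields
an empty list; a validator would then be given the schema by some other means.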

As for the authors' thesis that "high quality" XML is necessary for
XQuery In The Browser to be useful, I think the same applies to XSLT
(e.g. SaxonCE), and in fact by operating on the browser's DOM it's
presumably much less of an issue in practice. After all, given the
same-origin restrictions in Web browsers, the people using XQIB on the
XML documents in the browser are presumably the publishers of the XML,
so if it doesn't work for them, they'll fix it, no?

Liam

-- 
Liam Quin - XML Activity Lead, W3C, http://www.w3.org/People/Quin/
Pictures from old books: http://fromoldbooks.org/

Received on Sunday, 20 November 2011 01:31:47 UTC