- From: Henry S. Thompson <ht@inf.ed.ac.uk>
- Date: Tue, 10 Feb 2009 20:26:46 +0000
- To: "Anne van Kesteren" <annevk@opera.com>
- Cc: "David Orchard" <orchard@pacificspirit.com>, "Henri Sivonen" <hsivonen@iki.fi>, www-tag@w3.org
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

There are a number of different issues which have emerged, been
discussed, and put to one side during this thread, and in _this_
message I only want to reply to the most recent one, which in turn is
actually only one part of Anne's post:

Anne van Kesteren writes:

> I think that if you want to allow arbitrary tree-based markup
> languages your only option is using XML. If you want them to be
> usable by authors as well you need something like XML5

(Let me start by emphasising that in what follows I'm not being
critical of Anne for designing and implementing XML5, it was an
interesting experiment.)

But I think the world has already voted with its feet on the XML5
question, in that there is a notable _lack_ of folk advocating it.
And there's good reason for that: XML actually _is_ usable by authors
and authoring well-formed XML is _not_ hard.

> because even the experts fail:
>
>   http://diveintomark.org/archives/2004/01/14/thought_experiment
>   http://diveintomark.org/archives/2008/03/09/no-fury-like-dracon-scorned
>   http://annevankesteren.nl/2009/01/xml-sunday

That's one article which a) confuses validity with well-formedness and
b) points to a piece of broken _software_; one article which reports
on one instance of HTML->XHTML upgrade failure (reading between the
lines); one article that points to a page in which someone trying to
introduce an _intentional_ markup error made the wrong error.  Hardly a
compelling set of evidence that well-formed XML is too hard for
ordinary mortals.

I did a quick (less so than I'd hoped -- the era of free access to
well-parameterised Web Search APIs appears to be over) web search,
which yielded 48 .xml documents.  Of these

   1 was ill-formed (said it was UTF-8, but had a Latin-1 character in
     it.  Intriguingly, it was served with _no_ Content-encoding header)
   1 was unretrievable
   1 used a character encoding I couldn't immediately find a parser for
  45 were well-formed.

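For anyone wanting to repeat a survey of this kind, a well-formedness
check can be sketched in Python with the standard library's expat
binding.  This is only a minimal sketch and not the procedure used for
the numbers above: the names is_well_formed, tally and SAMPLES, and the
three sample documents, are assumptions made up for illustration.  The
third sample imitates the failure described above, a document that
declares UTF-8 but contains a Latin-1 byte.

```python
import xml.parsers.expat
from collections import Counter


def is_well_formed(data):
    """Return True if expat parses the bytes without raising ExpatError."""
    parser = xml.parsers.expat.ParserCreate()  # uses the declared encoding
    try:
        parser.Parse(data, True)  # True marks this as the final chunk
    except xml.parsers.expat.ExpatError:
        return False
    return True


def tally(documents):
    """Count well-formed and ill-formed documents in a name -> bytes mapping."""
    counts = Counter()
    for body in documents.values():
        counts["well-formed" if is_well_formed(body) else "ill-formed"] += 1
    return counts


# Hypothetical sample documents, not taken from the survey itself.
SAMPLES = {
    "good.xml": b'<?xml version="1.0"?><doc><p>ok</p></doc>',
    "unclosed.xml": b"<doc><p>oops</doc>",
    # Declares UTF-8 but carries the Latin-1 byte 0xE9.
    "latin1.xml": b'<?xml version="1.0" encoding="UTF-8"?><doc>caf\xe9</doc>',
}

if __name__ == "__main__":
    for label, count in sorted(tally(SAMPLES).items()):
        print(label, count)
```

Working on raw bytes rather than decoded strings matters here, since
decoding first would hide exactly the kind of encoding mismatch that
made the one surveyed document ill-formed.
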
Conveniently, that gives us the exact opposite of Ian Hickson's
oft-cited 97% broken HTML figure: we have 97% well-formed XML.

So whatever else may still be discussed, I do not think there's much
if any evidence of either demand or need for an "XML5".

ht
- -- 
       Henry S. Thompson, School of Informatics, University of Edinburgh
                         Half-time member of W3C Team
      10 Crichton Street, Edinburgh EH8 9AB, SCOTLAND -- (44) 131 650-4440
                Fax: (44) 131 651-1426, e-mail: ht@inf.ed.ac.uk
                       URL: http://www.ltg.ed.ac.uk/~ht/
[mail really from me _always_ has this .sig -- mail without it is forged spam]
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.6 (GNU/Linux)

iD8DBQFJkeMNkjnJixAXWBoRAvDEAJwKtHxpiqDi3kk7UO3F9ut8IHhCRQCfdry/
jb5pWVu+SVRU++lEVVDzfE4=
=bNhm
-----END PGP SIGNATURE-----
Received on Tuesday, 10 February 2009 20:28:11 UTC