- From: Anne van Kesteren <annevk@opera.com>
- Date: Wed, 11 Feb 2009 11:08:32 +0100
- To: "Henry S. Thompson" <ht@inf.ed.ac.uk>
- Cc: "David Orchard" <orchard@pacificspirit.com>, "Henri Sivonen" <hsivonen@iki.fi>, www-tag@w3.org
On Tue, 10 Feb 2009 21:26:46 +0100, Henry S. Thompson <ht@inf.ed.ac.uk> wrote:
> Anne van Kesteren writes:
>> I think that if you want to allow arbitrary tree-based markup
>> languages your only option is using XML. If you want them to be
>> usable by authors as well you need something like XML5
>
> (Let me start by emphasising that in what follows I'm not being
> critical of Anne for designing and implementing XML5, it was an
> interesting experiment.)
>
> But I think the world has already voted with its feet on the XML5
> question, in that there is a notable _lack_ of folk advocating it.

Yeah, it seems most people are ok with just advancing HTML.

> And there's good reason for that: XML actually _is_ usable by
> authors and authoring well-formed XML is _not_ hard.

Sure, until you start dealing with anything slightly more complex. E.g.
trying to write blog software that accepts user input, input from other
sites, etc.

>> because even the experts fail:
>>
>> http://diveintomark.org/archives/2004/01/14/thought_experiment
>> http://diveintomark.org/archives/2008/03/09/no-fury-like-dracon-scorned
>> http://annevankesteren.nl/2009/01/xml-sunday
>
> That's one article which a) confuses validity with well-formedness and

At the time of writing the validator links did point out actual
well-formedness errors and some of those links still do. I don't think
that naming it "invalid XML" should distract from the overall point of
the article.

> b) points to a piece of broken _software_; one article which reports
> on one instance of HTML->XHTML upgrade failure (reading between the
> lines); one article that points to a page in which someone trying to
> introduce an _intentional_ markup error made the wrong error. Hardly
> a compelling set of evidence that well-formed XML is too hard for
> ordinary mortals.

The point is that writing robust software is apparently not that easy.
A subsequent point would be that writing robust software for HTML or
XML5 is not a requirement, and that they would work well for the end
user regardless of whether the software contains a serialization bug or
not.

> I did a quick (less so than I'd hoped -- the era of free access to
> well-parameterised Web Search APIs appears to be over) web search,
> which yielded 48 .xml documents. Of these
>
> 1 was ill-formed (said it was UTF-8, but had a Latin-1 character in
> it. Intriguingly, it was served with _no_
> Content-encoding header)
> 1 was unretrievable
> 1 used a character encoding I couldn't immediately find a parser for
> 45 were well-formed.
>
> Conveniently, that gives us the exact opposite of Ian Hickson's
> oft-cited 97% broken HTML figure: we have 97% well-formed XML.

That a bunch of standalone XML documents is well-formed is hardly
surprising imo. I do not think that is the interesting case.

> So whatever else may be still be discussed, I do not think there's
> much if any evidence of either demand or need for an "XML5".

I was just answering a question from David on whether such a thing
existed and whether it could be reconciled with HTML parsing. I'm fine
with people just using HTML instead.

-- 
Anne van Kesteren
<http://annevankesteren.nl/>
<http://www.opera.com/>
Received on Wednesday, 11 February 2009 10:09:44 UTC