Re: HTML and XML

From: Philip Taylor <pjt47@cam.ac.uk>
Date: Wed, 11 Feb 2009 00:04:05 +0000
Message-ID: <499215F5.6030502@cam.ac.uk>
To: "Henry S. Thompson" <ht@inf.ed.ac.uk>
CC: www-archive@w3.org

Henry S. Thompson wrote:
> Anne van Kesteren writes:
>> because even the experts fail:
>>
>>   http://diveintomark.org/archives/2004/01/14/thought_experiment
>>   http://diveintomark.org/archives/2008/03/09/no-fury-like-dracon-scorned
>>   http://annevankesteren.nl/2009/01/xml-sunday
> 
> That's one article which a) confuses validity with well-formedness and
> b) points to a piece of broken _software_; one article which reports
> on one instance of HTML->XHTML upgrade failure (reading between the
> lines); one article that points to a page in which someone trying to
> introduce an _intentional_ markup error made the wrong error.  Hardly
> a compelling set of evidence that well-formed XML is too hard for
> ordinary mortals.

See also the comments in the second article, particularly 
<http://diveintomark.org/archives/2008/03/09/no-fury-like-dracon-scorned#comment-11442>. 
When people write dynamic web sites that accept user input and reflect 
it in their XML output, the evidence indicates they always have holes 
that allow the user to make the output ill-formed. When the affected 
feature is something like a comment system, a wiki page, or search 
queries displayed in admin logs, it can prevent other users from 
accessing the site.

(Often those holes are XSS vulnerabilities and affect HTML too, but 
often they're harmless in HTML and are only an issue because of XML's 
relatively complex character restrictions.)
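
As a rough illustration of the character-restriction point, here is a 
minimal Python sketch. It assumes the XML 1.0 "Char" production as the 
set of allowed characters; the helper names xml_safe_text and 
is_well_formed are hypothetical and not taken from any of the linked 
articles. Escaping markup alone lets a stray form feed through, which is 
harmless in HTML but makes the XML output ill-formed.

```python
import re
import xml.dom.minidom
from xml.parsers.expat import ExpatError
from xml.sax.saxutils import escape

# Anything outside the XML 1.0 "Char" production:
#   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_ILLEGAL_XML_CHARS = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def xml_safe_text(user_input: str) -> str:
    """Hypothetical helper: make user input safe to embed as XML text.

    Disallowed characters are replaced with U+FFFD, then &, < and >
    are escaped.
    """
    cleaned = _ILLEGAL_XML_CHARS.sub("\uFFFD", user_input)
    return escape(cleaned)


def is_well_formed(document: str) -> bool:
    """Hypothetical helper: True if the string parses as XML."""
    try:
        xml.dom.minidom.parseString(document)
        return True
    except ExpatError:
        return False


comment = "nice post\x0c <b>really</b>"  # contains a form feed (U+000C)

# Escaping markup only: the form feed survives and breaks the document.
escaped_only = "<comment>" + escape(comment) + "</comment>"

# Filtering characters first, then escaping: the document stays well-formed.
filtered = "<comment>" + xml_safe_text(comment) + "</comment>"
```

Replacing with U+FFFD rather than silently dropping the character is 
only one possible choice; a real site might prefer to reject the input 
outright.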

-- 
Philip Taylor
pjt47@cam.ac.uk
Received on Wednesday, 11 February 2009 00:04:46 GMT