Re: HTML and XML from Julian Reschke on 2009-02-11 (www-tag@w3.org from February 2009)

From: Julian Reschke <julian.reschke@gmx.de>
Date: Wed, 11 Feb 2009 08:42:05 +0100
To: "Michael(tm) Smith" <mike@w3.org>
CC: "Henry S. Thompson" <ht@inf.ed.ac.uk>, Anne van Kesteren <annevk@opera.com>, David Orchard <orchard@pacificspirit.com>, Henri Sivonen <hsivonen@iki.fi>, www-tag@w3.org
Message-ID: <4992814D.1040903@gmx.de>

Michael(tm) Smith wrote:
> ...
> Regarding the world voting with its feet: As far as the Web goes
> at least, it would seem that it's instead really been a matter of
> the vast majority of content providers voting against XML/XHTML
> completely and voting for HTML instead (by choosing to serve
> non-WF HTML, and by choosing to serve XHTML as text/html so that
> it gets processed by HTML parsers in browsers instead of by XML
> parsers in browsers).
> ...

I think another motivation (and probably the bigger one) is that they 
want their content to be processed by IE.

> ...
> To pick one: A key problem case that's been cited many times is
> the case of making sure that a part of your site is not going to
> become completely inaccessible just because one user comes in and
> inserts a comment with some malformed markup instance in it that
> their scrubber was not configured to deal with. And if you say
> it's not hard to anticipate the possible errors and catch them...
> well, I'd have to say I know a few very sharp people who have
> found otherwise.
> ...

Yes, a system that emits XML must use the right tools. "Parsing" user 
input with regular expressions etc. and then embedding the result 
verbatim into the output is going to break. Don't do it. Use the 
libraries that have been designed for this.
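
As a rough illustration of the difference, here is a minimal sketch, assuming Python and its standard xml.etree.ElementTree module. The function name render_comment and the element names are made up for the example, and it treats the user's comment as plain text rather than as markup. The point is only that the serializer does the escaping, so malformed input cannot break the well-formedness of the output.

```python
import xml.etree.ElementTree as ET


def render_comment(author: str, text: str) -> str:
    """Build a comment fragment through an XML serializer.

    Hypothetical helper: the name and the element structure are
    illustrative assumptions, not taken from any real system.
    """
    comment = ET.Element("div", {"class": "comment"})
    cite = ET.SubElement(comment, "cite")
    cite.text = author
    body = ET.SubElement(comment, "p")
    # Assigned as character data; the serializer escapes <, > and &.
    body.text = text
    return ET.tostring(comment, encoding="unicode")


# A comment containing malformed markup, as in the quoted problem case.
malformed = "<b>unclosed & broken <i>markup"
fragment = render_comment("visitor", malformed)

# The output is still well-formed, so an XML parser accepts it and
# recovers the original text unchanged.
roundtrip = ET.fromstring(fragment)
```

Concatenating the same string directly into a template would instead yield a document that an XML parser rejects.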

BR, Julian

Received on Wednesday, 11 February 2009 07:42:53 UTC