Re: HTML and XML

From: Julian Reschke <julian.reschke@gmx.de>
Date: Wed, 11 Feb 2009 08:42:05 +0100
Message-ID: <4992814D.1040903@gmx.de>
To: "Michael(tm) Smith" <mike@w3.org>
CC: "Henry S. Thompson" <ht@inf.ed.ac.uk>, Anne van Kesteren <annevk@opera.com>, David Orchard <orchard@pacificspirit.com>, Henri Sivonen <hsivonen@iki.fi>, www-tag@w3.org

Michael(tm) Smith wrote:
> ...
> Regarding the world voting with its feet: As far as the Web goes
> at least, it would seem that it's instead really been a matter of
> the vast majority of content providers voting against XML/XHTML
> completely and voting for HTML instead (by choosing to serve
> non-WF HTML, and by choosing to serve XHTML as text/html so that
> it gets processed by HTML parsers in browsers instead of by XML
> parsers in browsers).
> ...

I think another motivation (and probably the bigger one) is that they 
want their content to be processed by IE.

> ...
> To pick one: A key problem case that's been cited many times is
> the case of making sure that a part of your site is not going to
> become completely inaccessible just because one user comes in and
> inserts a comment with some malformed markup instance in it that
> their scrubber was not configured to deal with. And if you say
> it's not hard to anticipate the possible errors and catch them...
> well, I'd have to say I know a few very sharp people who have
> found otherwise.
> ...

Yes, a system that emits XML must use the right tools. "Parsing" user
input with regular expressions etc. and then embedding the result
verbatim into the output is going to break. Don't do it. Use the
libraries that have been designed for this.
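
As a rough illustration of that advice, here is a minimal sketch assuming Python's standard xml.etree.ElementTree module. The render_comment helper, the element names, and the sample input are hypothetical and not taken from this thread. The point is only that user text is handed to a serializer as character data instead of being pasted into a markup string:

```python
import xml.etree.ElementTree as ET


def render_comment(author: str, body: str) -> str:
    """Build a comment fragment with an XML serializer, not string concatenation."""
    div = ET.Element("div", {"class": "comment"})
    cite = ET.SubElement(div, "cite")
    cite.text = author
    p = ET.SubElement(div, "p")
    # The body is stored as character data; "<", ">" and "&" are escaped
    # by the serializer when the tree is written out.
    p.text = body
    return ET.tostring(div, encoding="unicode")


# Hypothetical user input containing an unclosed tag and a bare ampersand.
hostile = 'nice post <b>unclosed & "quoted"'
fragment = render_comment("visitor", hostile)

# Re-parsing the output is a cheap well-formedness check before publishing;
# ET.fromstring raises ET.ParseError if the fragment is not well-formed.
roundtrip = ET.fromstring(fragment)
print(fragment)
```

This only covers the case where the comment is treated as plain text. If users are allowed to submit markup, that input would need to go through a real parser and an allow-list filter before being attached to the tree.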

BR, Julian
Received on Wednesday, 11 February 2009 07:42:53 GMT
