
Re: HTML and XML

From: Elliotte Harold <elharo@metalab.unc.edu>
Date: Wed, 11 Feb 2009 06:53:56 -0800
Message-ID: <4992E684.8060602@metalab.unc.edu>
To: Henri Sivonen <hsivonen@iki.fi>
Cc: "Henry S.Thompson" <ht@inf.ed.ac.uk>, Anne van Kesteren <annevk@opera.com>, David Orchard <orchard@pacificspirit.com>, www-tag@w3.org

Henri Sivonen wrote:

> So far Philip Taylor (the author of 
> http://lists.w3.org/Archives/Public/www-archive/2009Feb/0058.html ) has 
> found well-formedness holes in every XML-outputting system he has cared 
> to try.
> 
> He even managed to make Validator.nu produce ill-formed output. The bug 
> was in the Xalan serializer--a widely distributed library written by 
> experts. (Astral characters were serialized as two numeric character 
> references for the corresponding surrogates.)

Perhaps he'd care to take a whack at XOM one of these days?

I do agree that the state of XML serialization is rather pathetic, 
though. XML is more complex than it appears and the amount of bad XML 
generating and escaping code out there is a problem. I tend to think the 
response is better libraries, and perhaps integrating some checks into 
static analysis tools.
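
To make the surrogate bug quoted above concrete, here is a minimal sketch in Python of escaping text for XML 1.0 element content. The function name escape_text and the choice to emit ASCII-only output are assumptions made for illustration; this is not how Xalan or XOM implement serialization. The point it shows is that a character outside the Basic Multilingual Plane must be written as one numeric character reference for its code point, never as two references for its UTF-16 surrogates.

```python
import xml.etree.ElementTree as ET


def escape_text(s: str) -> str:
    """Escape a string for use as XML 1.0 element content (ASCII-only output).

    Hypothetical helper for illustration; raises ValueError for characters
    that XML 1.0 does not allow at all.
    """
    out = []
    for ch in s:  # Python 3 iterates by code point, not by UTF-16 code unit
        cp = ord(ch)
        if ch == "&":
            out.append("&amp;")
        elif ch == "<":
            out.append("&lt;")
        elif ch == ">":
            out.append("&gt;")
        elif cp == 0xD:
            # A literal CR would be normalized to LF by the parser.
            out.append("&#xD;")
        elif cp in (0x9, 0xA) or 0x20 <= cp < 0x7F:
            out.append(ch)
        elif cp < 0x20 or 0xD800 <= cp <= 0xDFFF or cp in (0xFFFE, 0xFFFF):
            # Not a legal XML 1.0 character, even as a character reference.
            raise ValueError(f"U+{cp:04X} cannot appear in XML 1.0")
        else:
            # One reference per code point, including astral characters.
            out.append(f"&#x{cp:X};")
    return "".join(out)


text = "a<b & \U0001D504"
doc = "<r>" + escape_text(text) + "</r>"
# Round-trip through a parser as a basic well-formedness check.
assert ET.fromstring(doc).text == text
```

For U+1D504 the ill-formed output would be the surrogate pair &#xD835;&#xDD04;, which a conforming parser rejects because surrogate code points are not XML characters. The sketch emits &#x1D504; instead.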

-- 
Elliotte Rusty Harold  elharo@metalab.unc.edu
Refactoring HTML Just Published!
http://www.amazon.com/exec/obidos/ISBN=0321503635/ref=nosim/cafeaulaitA
Received on Wednesday, 11 February 2009 14:54:32 GMT
