
Re: HTML and XML

From: Henri Sivonen <hsivonen@iki.fi>
Date: Thu, 19 Feb 2009 10:40:49 +0200
Cc: "Anne van Kesteren" <annevk@opera.com>, "David Orchard" <orchard@pacificspirit.com>, www-tag@w3.org
Message-Id: <81B8A97C-A774-4453-BDE4-D3B42F0D55A3@iki.fi>
To: Henry S.Thompson <ht@inf.ed.ac.uk>
I wrote:

> So far Philip Taylor (the author of
> http://lists.w3.org/Archives/Public/www-archive/2009Feb/0058.html )
> has found well-formedness holes in every XML-outputting system he
> has cared to try.
>
> He even managed to make Validator.nu produce ill-formed output. The  
> bug was in the Xalan serializer--a widely distributed library  
> written by experts. (Astral characters were serialized as two  
> numeric character references for the corresponding surrogates.)
>
> I can brag that Philip hasn't found an ill-formedness-inducing bug  
> in any XML serialization code written entirely by me. However, he  
> has still found *a* bug (not an ill-formedness-inducing one) in my XML  
> serializer, too. (I replaced the Xalan serializer with one that I  
> wrote myself.)
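
To illustrate the astral-character case quoted above: a serializer that
escapes non-ASCII text has to walk the string by code point, not by
UTF-16 code unit, so that a surrogate pair becomes one numeric character
reference. The following is only a minimal sketch of that idea, not the
Xalan code or my replacement serializer; the class and method names are
hypothetical, and it deliberately ignores markup escaping (<, &, etc.).

```java
final class AstralEscape {
    // Hypothetical helper: writes each non-ASCII code point as ONE
    // hexadecimal numeric character reference.
    static String escapeNonAscii(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            // codePointAt joins a valid surrogate pair into one code point.
            int cp = s.codePointAt(i);
            if (cp >= 0xD800 && cp <= 0xDFFF) {
                // A lone surrogate has no legal XML representation.
                throw new IllegalArgumentException("Unpaired surrogate at index " + i);
            }
            if (cp < 0x80) {
                sb.append((char) cp);
            } else {
                sb.append("&#x")
                  .append(Integer.toHexString(cp).toUpperCase())
                  .append(';');
            }
            // Advance by 2 code units for astral characters, 1 otherwise.
            i += Character.charCount(cp);
        }
        return sb.toString();
    }
}
```

Iterating with charAt() instead would emit one reference per surrogate,
which is the ill-formed output described in the quoted text.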

Sure enough, with this incitement, Philip found a sample application I  
had released with my serializer and managed to get it (though not  
Validator.nu itself) to produce ill-formed output. How? I was relying on  
the JAXP-supplied SAX2 parser to honor its end of the SAX2 API  
contract as it applies to XML 1.0 (4th ed. and earlier). However,  
Philip fed my app XML 1.1, which the JAXP-provided parser (Xerces2 in  
this case) failed to reject, thereby allowing bad SAX events to enter  
the pipeline (specifically, a namespace mapping that mapped a prefix  
to the empty string).
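
One way to avoid trusting the parser here is to put a small guard
between the parser and the serializer. The sketch below is an
assumption about how such a guard could look, not what the sample app
or Validator.nu actually does; the class name is hypothetical. It uses
the standard org.xml.sax.helpers.XMLFilterImpl and rejects any event
that maps a non-empty prefix to the empty string, which Namespaces in
XML 1.0 does not allow.

```java
import org.xml.sax.SAXException;
import org.xml.sax.helpers.XMLFilterImpl;

final class PrefixUndeclarationGuard extends XMLFilterImpl {
    @Override
    public void startPrefixMapping(String prefix, String uri) throws SAXException {
        // xmlns="" (empty prefix) is legal; xmlns:p="" is only legal in
        // Namespaces in XML 1.1 and cannot be serialized as XML 1.0.
        if (!prefix.isEmpty() && (uri == null || uri.isEmpty())) {
            throw new SAXException(
                "Prefix \"" + prefix + "\" mapped to the empty string");
        }
        // Pass well-behaved events on to the downstream handler, if any.
        super.startPrefixMapping(prefix, uri);
    }
}
```

A serializer would be installed as this filter's ContentHandler, with
the filter in turn set as the parser's handler, so the bad event fails
fast instead of reaching the output.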

Philip also found a hole in XOM:
http://lists.w3.org/Archives/Public/www-archive/2009Feb/0062.html

-- 
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/
Received on Thursday, 19 February 2009 08:41:35 GMT
