Re: HTML and XML

I wrote:

> So far Philip Taylor (the author of 
>  ) has found well-formedness holes in every XML-outputting system he  
> has cared to try.
> He even managed to make produce ill-formed output. The  
> bug was in the Xalan serializer--a widely distributed library  
> written by experts. (Astral characters were serialized as two  
> numeric character references for the corresponding surrogates.)
> I can brag that Philip hasn't found an ill-formedness-inducing bug  
> in any XML serialization code written entirely by me. However, he  
> has still found *a* bug (not an ill-formedness-inducing one) in my XML  
> serializer, too. (I replaced the Xalan serializer with one that I  
> wrote myself.)
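
To illustrate the astral-character issue mentioned in the quoted text, here is a minimal sketch, not the Xalan code or my serializer. The class and method names are made up for the example. It walks the string by code point, so a surrogate pair comes out as one numeric character reference instead of two. It deliberately leaves out escaping of markup characters such as < and &.

```java
// Hypothetical helper for illustration only.
final class NcrEscaper {
    // Replaces every non-ASCII code point with a hexadecimal numeric
    // character reference. ASCII is copied through unchanged.
    static String escapeNonAscii(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            // codePointAt combines a valid surrogate pair into one code point.
            int cp = s.codePointAt(i);
            if (cp >= 0xD800 && cp <= 0xDFFF) {
                // A lone surrogate cannot be represented in well-formed XML.
                throw new IllegalArgumentException("Unpaired surrogate at index " + i);
            }
            if (cp > 0x7F) {
                sb.append("&#x")
                  .append(Integer.toHexString(cp).toUpperCase())
                  .append(';');
            } else {
                sb.append((char) cp);
            }
            // Advance by one or two UTF-16 code units as appropriate.
            i += Character.charCount(cp);
        }
        return sb.toString();
    }
}
```

Iterating char by char instead would see the two surrogates separately, which is the kind of mistake that leads to two references for one character.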

Sure enough, with this incitement, Philip found a sample application I
had released with my serializer and managed to get it (though not itself)
to produce ill-formed output. How? I was relying on the JAXP-supplied
SAX2 parser to honor its end of the SAX2 API contract as it applies to
XML 1.0 (4th ed. and earlier). However, Philip fed my app XML 1.1, which
the JAXP-provided parser (Xerces2 in this case) failed to reject, thereby
allowing bad SAX events to enter the pipeline (specifically, a namespace
mapping that mapped a prefix to the empty string).
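
One way to guard against this kind of event is a filter placed between the parser and the serializer. The following is a minimal sketch, not the fix that went into my app; the class name is made up for the example. It assumes the pipeline is meant to carry XML 1.0 only, where Namespaces in XML 1.0 does not allow a non-empty prefix to be bound to the empty string (Namespaces in XML 1.1 allows it as a prefix undeclaration).

```java
import org.xml.sax.SAXException;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical guard filter for illustration only.
final class PrefixMappingGuard extends XMLFilterImpl {
    @Override
    public void startPrefixMapping(String prefix, String uri) throws SAXException {
        // An empty prefix with an empty URI undeclares the default
        // namespace, which is fine in XML 1.0. A non-empty prefix mapped
        // to the empty string is not, so stop the pipeline here.
        if (!prefix.isEmpty() && uri.isEmpty()) {
            throw new SAXException(
                "Prefix \"" + prefix + "\" is mapped to the empty string");
        }
        // Otherwise pass the event on to the next handler, if any.
        super.startPrefixMapping(prefix, uri);
    }
}
```

Such a filter would be installed with setParent() on the XMLReader obtained from JAXP, with the serializer set as its content handler. It checks only this one event, so it is no substitute for a parser that rejects XML 1.1 input outright.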

Philip also found a hole in XOM:

Henri Sivonen

Received on Thursday, 19 February 2009 08:41:35 UTC