
Re: XHTML character entity support

From: John Cowan <cowan@ccil.org>
Date: Wed, 25 Nov 2009 10:10:21 -0500
To: Simon Pieters <simonp@opera.com>
Cc: John Cowan <cowan@ccil.org>, Ian Hickson <ian@hixie.ch>, www-archive@w3.org
Message-ID: <20091125151021.GA19821@mercury.ccil.org>

Simon Pieters scripsit:

> Because of things like attributes on stray <html> tags affecting  
> attributes on the root element, a streaming parser sometimes either has to  
> abort, emit non-SAX events or violate HTML5.

TagSoup never aborts (except on I/O errors) and it would be useless
if it produced SAX events that didn't conform to XML.  So, as I say,
it doesn't guarantee adherence to any particular schema.

There is also the fourth "option" of going into an infinite loop.
HTML Tidy used to choose this option quite frequently, apparently because
a pair of fix-up rules were applied repeatedly, changing the tree from
A to B to A to B ....  TagSoup's design makes this particular flavor of
bug impossible.  (Of course there have been, and are, other bugs.)
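
To illustrate the A-to-B-to-A oscillation described above, here is a minimal
Python sketch. It is not taken from HTML Tidy or TagSoup: the two rules, the
string "trees", and the names rule_a_to_b, rule_b_to_a and fix_up are all
hypothetical. It only shows how a pair of mutually undoing fix-up rules never
reaches a fixed point, and how remembering earlier states lets a driver detect
the cycle instead of looping forever.

```python
def rule_a_to_b(tree):
    # Hypothetical fix-up: rewrites shape "A" into shape "B".
    return "B" if tree == "A" else tree


def rule_b_to_a(tree):
    # Hypothetical fix-up that undoes the one above.
    return "A" if tree == "B" else tree


def fix_up(tree, rules, max_passes=100):
    """Apply one rule per pass until nothing changes.

    Raises RuntimeError if a previously seen state comes back,
    which is the A -> B -> A -> B ... case.
    """
    seen = {tree}
    for _ in range(max_passes):
        new_tree = tree
        for rule in rules:
            candidate = rule(new_tree)
            if candidate != new_tree:
                new_tree = candidate
                break  # only the first applicable rule fires in a pass
        if new_tree == tree:
            return tree  # fixed point: no rule applies any more
        if new_tree in seen:
            raise RuntimeError("fix-up rules oscillate at %r" % (new_tree,))
        seen.add(new_tree)
        tree = new_tree
    raise RuntimeError("no fixed point within max_passes")
```

With only rule_a_to_b, fix_up("A", [rule_a_to_b]) settles on "B". With both
rules it raises instead of spinning, because "A" reappears after two passes.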

-- 
John Cowan                                cowan@ccil.org
At times of peril or dubitation,          http://www.ccil.org/~cowan
Perform swift circular ambulation,
With loud and high-pitched ululation.
Received on Wednesday, 25 November 2009 15:10:56 UTC
