Re: Comments on HTML WG face to face meetings in France Oct 08

Elliotte Harold wrote:
> HTML 4 enabled parsers to define their own error recovery. HTML 4 
> requires specific error recovery.

You must have meant HTML 5 for the second occurrence of "HTML 4"?  Error 
recovery for the win!  ;)

But yes, this is precisely the change from HTML 4.  This change makes it 
_easier_ to implement a parser.  This change is one of the major reasons 
the HTML 5 effort exists.

> Greater than zero (or perhaps one--can there be a single interoperable 
> parser?) is not a very high bar to hurdle.

While that's true, the point is that with the HTML 4 approach it's 
nearly impossible to have more than zero interoperable parsers.

> There are reasons for that, mostly due to mistakes the W3C made in the 
> development of HTML. They pushed a syntax change without compensating 
> features to make the syntax changes worthwhile to implementers and 
> users. HTML 5 makes the opposite mistake: it's only pushing features 
> with no syntax changes. This seems likely to cause other problems.

Can you name some?  Realistically, though, syntax changes for the sake 
of syntax changes are bad.  If they can be avoided, so much the better.

> They were split but draconian error handling won.

I should note that what XML has is draconian error recovery (or rather 
lack of it), not draconian error handling.  Error handling in XML is, as 
you repeatedly say, undefined.
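
To make the distinction concrete, here is a minimal sketch using 
Python's standard xml.etree.ElementTree module.  The helper name 
parse_or_report is hypothetical, and returning the error message to the 
caller is just one assumed policy: the parser stops at the first 
well-formedness error (no recovery), while what happens next is left 
entirely to the calling code.

```python
import xml.etree.ElementTree as ET

def parse_or_report(text):
    """Parse XML text; return (root, None) on success or (None, message).

    The parser itself performs no recovery: it raises at the first
    well-formedness error. Turning that exception into a message is this
    caller's choice (an assumed policy for illustration only).
    """
    try:
        return ET.fromstring(text), None
    except ET.ParseError as err:
        return None, str(err)

# Mismatched tags: parsing stops and no tree is produced.
bad_root, bad_error = parse_or_report("<a><b></a>")

# Well-formed input: a tree is produced and there is no error.
good_root, good_error = parse_or_report("<a><b/></a>")
```

Another caller could just as well show an error page, log and move on, 
or retry with a different parser; none of that is decided by the parser.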

> HTML 5 error handling is 
> much harder to implement than draconian error handling that refuses to 
> parse or display malformed documents

Note that XML does not have such error handling.

> Is the additional difficulty worth it?

Yes.  ;)

> Is the HTML 5 spec actually clear and unambiguous 
> enough to achieve that goal?

_That_ is a very good question.  I think it is, simply by virtue of 
being a state machine: it should be possible to automatically verify 
that a transition is defined for every input in every state.  Once 
that's done, the spec is certainly unambiguous.
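
A rough sketch of what such an automatic check might look like, 
assuming the state machine is written down as a transition table.  The 
states, input classes and transitions below are a hypothetical toy 
tokenizer, not taken from the HTML 5 draft, and the function names are 
made up for illustration.

```python
from itertools import product

# Hypothetical toy tokenizer; illustrative only, not the HTML 5 draft.
STATES = ("data", "tag_open", "tag_name")
INPUT_CLASSES = ("<", ">", "letter", "other", "EOF")

# (state, input class) -> next state. Error cases get explicit entries too.
TRANSITIONS = {
    ("data", "<"): "tag_open",
    ("data", ">"): "data",
    ("data", "letter"): "data",
    ("data", "other"): "data",
    ("data", "EOF"): "data",
    ("tag_open", "<"): "tag_open",      # parse error, but still defined
    ("tag_open", ">"): "data",          # parse error, but still defined
    ("tag_open", "letter"): "tag_name",
    ("tag_open", "other"): "data",      # parse error, but still defined
    ("tag_open", "EOF"): "data",        # parse error, but still defined
    ("tag_name", "<"): "tag_open",      # parse error, but still defined
    ("tag_name", ">"): "data",
    ("tag_name", "letter"): "tag_name",
    ("tag_name", "other"): "tag_name",
    ("tag_name", "EOF"): "data",        # parse error, but still defined
}

def undefined_transitions(states, input_classes, transitions):
    """Return every (state, input class) pair with no defined transition."""
    return [pair for pair in product(states, input_classes)
            if pair not in transitions]

def unknown_targets(states, transitions):
    """Return transitions whose target is not a declared state."""
    return {pair: target for pair, target in transitions.items()
            if target not in states}

# Drop one entry to show what the checker reports for an incomplete table.
incomplete = dict(TRANSITIONS)
del incomplete[("tag_open", "EOF")]

gaps_in_full_table = undefined_transitions(STATES, INPUT_CLASSES, TRANSITIONS)
gaps_in_incomplete = undefined_transitions(STATES, INPUT_CLASSES, incomplete)
bad_targets = unknown_targets(STATES, TRANSITIONS)
```

A check like this only shows that every (state, input) pair has exactly 
one defined outcome; whether that outcome is the desirable one is a 
separate question.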

-Boris

Received on Monday, 17 November 2008 15:52:25 UTC