Re: Comments on HTML WG face to face meetings in France Oct 08 from Elliotte Harold on 2008-11-17 (www-tag@w3.org from November 2008)

From: Elliotte Harold <elharo@metalab.unc.edu>
Date: Mon, 17 Nov 2008 07:31:49 -0800
To: Ian Hickson <ian@hixie.ch>
Cc: "Henry S. Thompson" <ht@inf.ed.ac.uk>, noah_mendelsohn@us.ibm.com, public-html <public-html@w3.org>, www-tag@w3.org
Message-ID: <49218E65.70906@metalab.unc.edu>

Ian Hickson wrote:

> How is this different from what HTML4 did? HTML4 said "this is what is 
> valid, and everything else should work too". And the browsers by and large 
> did this, in an interoperable fashion (at great cost, and in a manner that 
> made it very hard to enter the market). How does this differ from HTML5's 
> approach, other than HTML5 making competition easier?

HTML 4 enabled parsers to define their own error recovery. HTML 5 
requires specific error recovery.

>> it very much raises the bar for implementing parsers
> 
> This is demonstrably false, in that there are more interoperable HTML5 
> parsers today, before the spec is even finished, than there have ever been 
> interoperable HTML4 parsers. Even for valid documents of each.

Greater than zero (or perhaps one--can there be a single interoperable 
parser?) is not a very high bar to hurdle.

> Absolutely. XML's approach has utterly failed on the Web (q.v. the 
> universal feed parser for RSS and Atom). It would be amateurish of us to 
> keep following this model after what we have learnt over the past ten 
> years. We have a responsibility to the Web to do better.

There are reasons for that, mostly due to mistakes the W3C made in the 
development of HTML. They pushed a syntax change without compensating 
features to make the syntax changes worthwhile to implementers and 
users. HTML 5 makes the opposite mistake: it's only pushing features 
with no syntax changes. This seems likely to cause other problems.

> Also, I think it's pushing the truth a bit to say that draconian error 
> handling is a core value of XML. The XML working group was quite split on 
> the issue. [1]

They were split, but draconian error handling won.

>> It makes the spec far harder to understand and implement.
> 
> Half of the error handling is almost implicit, in that the algorithm that 
> says what you have to do just handles all cases without needing to be 
> explicit. So that's not harder to understand. The other half might be 
> somewhat more involved than ignoring error cases, but, well, tough. We're 
> not making toast here, we're trying to define one of the most important 
> platforms that humanity has ever used. If it's a little harder to 
> understand, sobeit.

Straw man. I am not suggesting that one ignore error cases. I am simply 
suggesting that one might wish to report them and indicate them as such, 
rather than defining them out of existence. HTML 5 error handling is 
much harder to implement than draconian error handling that refuses to 
parse or display malformed documents. Is the additional difficulty worth 
it? I'm not sure. Is the HTML 5 spec actually clear and unambiguous 
enough to achieve that goal? Maybe, but I've learned to be cautious 
about such ambitious goals.
-- 
Elliotte Rusty Harold  elharo@metalab.unc.edu
Refactoring HTML Just Published!
http://www.amazon.com/exec/obidos/ISBN=0321503635/ref=nosim/cafeaulaitA

Received on Monday, 17 November 2008 15:32:26 UTC