Re: Comments on HTML WG face to face meetings in France Oct 08 from Maciej Stachowiak on 2008-11-18 (www-tag@w3.org from November 2008)

From: Maciej Stachowiak <mjs@apple.com>
Date: Mon, 17 Nov 2008 17:34:08 -0800
To: elharo@metalab.unc.edu
Cc: Jonas Sicking <jonas@sicking.cc>, "Henry S. Thompson" <ht@inf.ed.ac.uk>, noah_mendelsohn@us.ibm.com, Dean Edridge <dean@dean.org.nz>, public-html <public-html@w3.org>, www-tag@w3.org
Message-id: <FDC474C0-A50A-43E2-9889-D7CC0FAEDB79@apple.com>

On Nov 17, 2008, at 7:21 AM, Elliotte Harold wrote:

> Maciej Stachowiak wrote:
>
>> And finally, in my experience, it is not necessarily even true that  
>> the error handling of HTML makes it harder to implement parsing  
>> than for XML. In WebKit, the pieces of code implementing HTML and  
>> XML parsing are close to the same size, and that is not even  
>> including the libxml library that does most of the heavy lifting in  
>> XML parsing.
>
> But is WebKit already fully implementing the HTML 5 parsing  
> algorithm? That is what I suspect is going to complexify parsers  
> beyond plausibility.

We don't fully implement the HTML5 parsing algorithm yet, but I believe
that when we do, it will make our parsing code simpler rather than
more complex. The HTML5 parsing algorithm makes systematic a lot of
things that are messy ad-hoc cases in our current parsing code. As an
example, we already have a largely HTML5-compatible tokenizer for our
preload scanning, and it is a lot cleaner than the main tokenizer.
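
A minimal sketch of what "systematic" can mean here, assuming a
drastically reduced tokenizer: every state is explicit, and malformed
input is handled by a defined transition rather than a special case.
This is not WebKit code. The state names loosely follow the HTML5
tokenizer states, while the function name, the token tuples, the lack
of attribute, comment, and entity handling, and the treatment of a
stray "</" as text are all assumptions made for illustration.

```python
from enum import Enum, auto


class State(Enum):
    DATA = auto()
    TAG_OPEN = auto()
    END_TAG_OPEN = auto()
    TAG_NAME = auto()


def is_ascii_letter(ch):
    return ch.isascii() and ch.isalpha()


def tokenize(text):
    """Return ("text", s), ("start", name) and ("end", name) tuples.

    Hypothetical, simplified tokenizer: tag names only, no attributes.
    """
    tokens = []
    state = State.DATA
    buf = []        # pending character data
    name = []       # tag name being collected
    is_end = False  # True while collecting an end tag

    def flush_text():
        if buf:
            tokens.append(("text", "".join(buf)))
            buf.clear()

    i = 0
    while i < len(text):
        ch = text[i]
        if state is State.DATA:
            if ch == "<":
                state = State.TAG_OPEN
            else:
                buf.append(ch)
            i += 1
        elif state is State.TAG_OPEN:
            if ch == "/":
                state = State.END_TAG_OPEN
                i += 1
            elif is_ascii_letter(ch):
                is_end, name = False, []
                state = State.TAG_NAME  # reconsume ch in the new state
            else:
                buf.append("<")         # error case: "<" becomes text
                state = State.DATA      # reconsume ch in the data state
        elif state is State.END_TAG_OPEN:
            if is_ascii_letter(ch):
                is_end, name = True, []
                state = State.TAG_NAME  # reconsume ch in the new state
            else:
                buf.append("</")        # simplification, see lead-in
                state = State.DATA      # reconsume ch in the data state
        else:  # State.TAG_NAME
            if ch == ">":
                flush_text()
                tokens.append(("end" if is_end else "start", "".join(name)))
                state = State.DATA
            else:
                name.append(ch.lower())
            i += 1

    # End of input: a dangling "<" or "</" becomes text, and an
    # unterminated tag name is dropped.
    if state is State.TAG_OPEN:
        buf.append("<")
    elif state is State.END_TAG_OPEN:
        buf.append("</")
    flush_text()
    return tokens
```

With this sketch, tokenize("1 < 2") yields a single text token rather
than failing, because the tag open state has a defined fallback for a
non-letter.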

Part of my excitement about the HTML5 parsing algorithm, as a browser  
implementor, is that it will simplify our parsing code while also  
making it more compatible with content and more interoperable with  
other browsers. If I thought it would be a lot more complicated than  
our current code, I would probably be opposed to implementing it.

Regards,
Maciej

Received on Tuesday, 18 November 2008 01:34:51 UTC