Re: Comments on HTML WG face to face meetings in France Oct 08

On Nov 17, 2008, at 7:21 AM, Elliotte Harold wrote:

> Maciej Stachowiak wrote:
>
>> And finally, in my experience, it is not necessarily even true that  
>> the error handling of HTML makes it harder to implement parsing  
>> than for XML. In WebKit, the pieces of code implementing HTML and  
>> XML parsing are close to the same size, and that is not even  
>> including the libxml library that does most of the heavy lifting in  
>> XML parsing.
>
> But is WebKit already fully implementing the HTML 5 parsing  
> algorithm? That is what I suspect is going to complexify parsers  
> beyond plausibility.

We don't yet fully implement the HTML5 parsing algorithm, but I believe
that when we do, it will make our parsing code simpler rather than
more complex. The HTML5 parsing algorithm makes systematic a lot of
things that are messy ad-hoc cases in our current parsing code. For
example, we already have a largely HTML5-compatible tokenizer for our
preload scanning, and it is a lot cleaner than the main tokenizer.

Part of my excitement about the HTML5 parsing algorithm, as a browser  
implementor, is that it will simplify our parsing code while also  
making it more compatible with content and more interoperable with  
other browsers. If I thought it would be a lot more complicated than  
our current code, I would probably be opposed to implementing it.

Regards,
Maciej

Received on Tuesday, 18 November 2008 01:34:51 UTC