[whatwg] Another bug in the HTML parsing spec? from Henri Sivonen on 2011-12-20 (public-whatwg-archive@w3.org from December 2011)

From: Henri Sivonen <hsivonen@iki.fi>
Date: Tue, 20 Dec 2011 16:14:25 +0200
Message-ID: <CAJQvAue2UyQ77sRG86Q2ATVdtbMZbnRNyoYkhMP-fK9187JpLQ@mail.gmail.com>

On Tue, Oct 18, 2011 at 3:47 AM, Ian Hickson <ian at hixie.ch> wrote:

>> 2) I can't get all of the parser tests from html5lib to pass with this
>> algorithm as it is currently written. In particular, there are 5 tests in
>> testdata/tree-construction/tests9.dat of this basic form:
>>
>> <!DOCTYPE html><body><table><math><mi>foo</mi></math></table>
>>
>> As the spec is written, the <mi> tag is a text integration point, so the "foo"
>> text token is handled like regular content, not like foreign content.
>
> Oh, my, yeah, that's all kinds of wrong. The text node should be handled
> as if it was in the "in body" mode, not as if it was "in table". I'll have
> to study this closer.
>
> I think this broke when we moved away from using an insertion mode for
> foreign content.
>
> Henri, do you know how Gecko gets this right currently?

The tree builder in Gecko always uses an accumulation buffer that gets
flushed when the tree builder sees an end tag token or a start tag
token.
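
As a rough illustration of that buffering, here is a minimal Python
sketch. It is not Gecko's actual code: the class name, method names and
the "events" list are hypothetical, and it only models flushing on start
and end tag tokens, nothing else the real tree builder does.

```python
class BufferingTreeBuilder:
    """Toy tree builder: character tokens are accumulated and only
    turned into a text event when a start or end tag token arrives."""

    def __init__(self):
        self.buffer = []   # pending character data, not yet inserted
        self.events = []   # what the builder has emitted, in order

    def characters(self, text):
        # Do not insert anything yet; just accumulate.
        self.buffer.append(text)

    def flush_characters(self):
        # Emit the accumulated text as a single run, if there is any.
        if self.buffer:
            self.events.append(("text", "".join(self.buffer)))
            self.buffer.clear()

    def start_tag(self, name):
        self.flush_characters()
        self.events.append(("start", name))

    def end_tag(self, name):
        self.flush_characters()
        self.events.append(("end", name))


builder = BufferingTreeBuilder()
builder.start_tag("mi")
builder.characters("f")
builder.characters("oo")
pending = list(builder.events)  # snapshot taken before </mi> is seen
builder.end_tag("mi")
```

In this sketch the two character tokens are only handled, as one run of
text, at the point where the </mi> end tag token arrives.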

-- 
Henri Sivonen
hsivonen at iki.fi
http://hsivonen.iki.fi/

Received on Tuesday, 20 December 2011 06:14:25 UTC