[whatwg] <!DOCTYPE html><body><table><math><mi>foo</mi></math></table> from Adam Barth on 2011-12-13 (public-whatwg-archive@w3.org from December 2011)

From: Adam Barth <w3c@adambarth.com>
Date: Mon, 12 Dec 2011 21:05:41 -0800
Message-ID: <CAJE5ia-Pjg0f+Zm=RKPdJrSUxOXO3r1-5H2wExsHaTT+Fw2xXQ@mail.gmail.com>

Yes, that's the same issue.  It appears to be fallout from removing
the "in foreign content" insertion mode.

Adam


On Mon, Dec 12, 2011 at 7:36 PM, David Flanagan <dflanagan at mozilla.com> wrote:
> I think this is the same problem I reported here: http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2011-October/033533.html
> See Hixie's response to that message.  I think this is a known problem, though I don't know if a bug has been filed on it.
>
>    David
>
> ----- Original Message -----
> From: "Adam Barth" <w3c at adambarth.com>
> To: "whatwg" <whatwg at lists.whatwg.org>
> Cc: "Henri Sivonen" <hsivonen at iki.fi>
> Sent: Monday, December 12, 2011 6:23:23 PM
> Subject: [whatwg] <!DOCTYPE html><body><table><math><mi>foo</mi></math></table>
>
> I'm trying to understand how the HTML parsing spec handles the following case:
>
> <!DOCTYPE html><body><table><math><mi>foo</mi></math></table>
>
> According to the html5lib test data, we should parse that as follows:
>
> | <!DOCTYPE html>
> | <html>
> |   <head>
> |   <body>
> |     <math math>
> |       <math mi>
> |         "foo"
> | ? ? <table>
>
> However, I'm not sure whether that's what the spec actually does.
>
> Consider the point at which we parse the "f" character token (from "foo").
> The insertion mode will be "in table".  The spec will execute as
> follows:
>
> -> If the current node is a MathML text integration point and the
> token is a character token
>  * Process the token according to the rules given in the section
> corresponding to the current insertion mode in HTML content.
>
> -> A character token
>  * Let the pending table character tokens be an empty list of tokens.
>  * Let the original insertion mode be the current insertion mode.
>  * Switch the insertion mode to "in table text" and reprocess the token.
>
> -> Any other character token
>  * Append the character token to the pending table character tokens list.
>
> ... the "o" and "o" will be processed similarly and end up in the
> pending table character tokens list.
>
> Now, consider the </mi> token.  We're still at a MathML text
> integration point, but the current token is neither a start token
> (with certain names) nor a character token, so we process the token
> according to the rules given in the section for parsing tokens in
> foreign content.
>
> -> Any other end tag
>  * Run these steps:
>    ...
>
> The net result of which is popping the stack of open elements, but not
> flushing out the pending table character tokens list.  The list will
> eventually be flushed when we process the </table> token, resulting
> in these character tokens getting foster parented:
>
> | <!DOCTYPE html>
> | <html>
> |   <head>
> |   <body>
> |     <math math>
> |       <math mi>
> |     "foo"
> | ? ? <table>
>
> Thanks,
> Adam
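
The two trees in the quoted message can be reproduced with a small toy
model.  This is only a sketch of the trace described above, not the
spec's algorithm or html5lib's implementation: the names Node,
insert_text, trace and dump, and the flush_before_end_tag switch, are
all made up for illustration.  It assumes the only difference between
the two outcomes is whether the pending table character tokens are
flushed before </mi> is handled or only once </table> is reached.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    children: list = field(default_factory=list)


def insert_text(stack, text):
    """Insert text at the current node, foster parenting if it is a table."""
    current = stack[-1]
    if current.name == "table":
        # Foster parenting: the text goes just before the table in its parent.
        parent = stack[-2]
        parent.children.insert(parent.children.index(current), text)
    else:
        current.children.append(text)


def trace(flush_before_end_tag):
    """Toy trace of <body><table><math><mi>foo</mi></math></table>."""
    body, table = Node("body"), Node("table")
    math, mi = Node("math math"), Node("math mi")
    # <math> seen "in table" is foster parented before <table>, but it is
    # still pushed onto the stack of open elements above the table.
    body.children = [math, table]
    math.children = [mi]
    stack = [body, table, math, mi]

    pending = list("foo")  # the pending table character tokens

    for end_tag in ("mi", "math", "table"):
        # Assumption: the flush happens either before every end tag
        # (hypothetical fix) or only once </table> is processed.
        if pending and (flush_before_end_tag or end_tag == "table"):
            insert_text(stack, "".join(pending))
            pending = []
        stack.pop()
    return body


def dump(node, depth=0):
    """Render the children of node in html5lib's tree-dump style."""
    lines = []
    for child in node.children:
        indent = "  " * depth
        if isinstance(child, str):
            lines.append(f'| {indent}"{child}"')
        else:
            lines.append(f"| {indent}<{child.name}>")
            lines.extend(dump(child, depth + 1))
    return lines
```

Calling dump(trace(True)) gives the subtree under <body> from the
html5lib test data, with "foo" inside <math mi>.  dump(trace(False))
gives the foster-parented result, with "foo" as a sibling between
<math math> and <table>.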

Received on Monday, 12 December 2011 21:05:41 UTC