- From: Adam Barth <w3c@adambarth.com>
- Date: Mon, 12 Dec 2011 18:23:23 -0800
I'm trying to understand how the HTML parsing spec handles the following case:

<!DOCTYPE html><body><table><math><mi>foo</mi></math></table>

According to the html5lib test data, we should parse that as follows:

| <!DOCTYPE html>
| <html>
|   <head>
|   <body>
|     <math math>
|       <math mi>
|         "foo"
|     <table>

However, I'm not sure whether that's what the spec actually does. Consider the point at which we parse the "f" character token (from "foo"). The insertion mode will be "in table". The spec will execute as follows:

-> If the current node is a MathML text integration point and the token is a character token
   * Process the token according to the rules given in the section corresponding to the current insertion mode in HTML content.

-> A character token
   * Let the pending table character tokens be an empty list of tokens.
   * Let the original insertion mode be the current insertion mode.
   * Switch the insertion mode to "in table text" and reprocess the token.

-> Any other character token
   * Append the character token to the pending table character tokens list.

... the "o" and "o" will be processed similarly and end up in the pending table character tokens list.

Now, consider the </mi> token. We're still at a MathML text integration point, but the current token is neither a start tag (with certain names) nor a character token, so we process the token according to the rules given in the section for parsing tokens in foreign content.

-> Any other end tag
   * Run these steps: ...

The net result is that we pop the stack of open elements but do not flush the pending table character tokens list. The list will eventually be flushed when we process the </table> token, resulting in these character tokens being foster parented:

| <!DOCTYPE html>
| <html>
|   <head>
|   <body>
|     <math math>
|       <math mi>
|     "foo"
|     <table>

Thanks,
Adam
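
For illustration, the reading of the spec traced above can be modelled with a small Python sketch. This is a toy model, not the parsing algorithm: it handles only the tokens in the example, and every name in it (Node, trace, dump, TOKENS) is hypothetical. It assumes, as the trace does, that character tokens are only buffered in the pending list, that </mi> and </math> just pop the stack of open elements, and that only </table> flushes the buffered text.

```python
class Node:
    """Minimal element node; children are Nodes or plain strings (text)."""
    def __init__(self, name):
        self.name = name
        self.children = []


def trace(tokens):
    """Replay the example tokens under the assumptions stated in the lead-in."""
    body = Node("body")
    table = Node("table")
    body.children.append(table)
    stack = [body, table]      # stack of open elements
    pending = []               # pending table character tokens

    for kind, value in tokens:
        if kind == "start":
            node = Node(value)
            if stack[-1] is table:
                # Foster parenting: insert just before the table.
                body.children.insert(body.children.index(table), node)
            else:
                stack[-1].children.append(node)
            stack.append(node)
        elif kind == "char":
            # "in table text": the character is buffered, not inserted.
            pending.append(value)
        elif kind == "end" and value == "table":
            # Only here is the pending list flushed (foster parented).
            if pending:
                text = "".join(pending)
                body.children.insert(body.children.index(table), text)
                pending.clear()
            stack.pop()
        elif kind == "end":
            # Foreign content "any other end tag": pop, no flush.
            stack.pop()
    return body


def dump(node, depth=0):
    """Serialize the children of node as indented lines."""
    lines = []
    for child in node.children:
        if isinstance(child, str):
            lines.append("  " * depth + '"' + child + '"')
        else:
            lines.append("  " * depth + "<" + child.name + ">")
            lines.extend(dump(child, depth + 1))
    return lines


TOKENS = [
    ("start", "math"), ("start", "mi"),
    ("char", "f"), ("char", "o"), ("char", "o"),
    ("end", "mi"), ("end", "math"), ("end", "table"),
]
result = dump(trace(TOKENS))
```

Under these assumptions, the text ends up as a child of body, between the math element and the table, rather than inside mi as the html5lib test data expects.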
Received on Monday, 12 December 2011 18:23:23 UTC