- From: Peter Occil <poccil14@gmail.com>
- Date: Thu, 1 Aug 2013 14:22:23 -0400
- To: "Ian Hickson" <ian@hixie.ch>
- Cc: WHATWG <whatwg@whatwg.org>
Many of these cases occur in the normative portion of the tree
construction stage. Most of them involve checking whether an element
(as opposed to a tag token) has a certain name: Accordingly, these cases
are ambiguous:

* If foster parenting is enabled and target is a table, tbody, tfoot,
  thead, or tr element
* Let last template be the last template element in the stack of open
  elements, if any.
* Let last table be the last table element in the stack of open
  elements, if any.
* If the adjusted insertion location is inside a template element, let
  it instead be inside the template element's template contents
  [first instance only]
* When the steps below require the UA to generate implied end tags,
  then, while the current node is a dd element, a dt element, an li
  element, an option element, an optgroup element, a p element, an rp
  element, or an rt element, the UA must pop the current node off the
  stack of open elements.
* Create an html element whose ownerDocument is the Document object.
  [doesn't mention the namespace]
* If there is no template element on the stack of open elements, then
  this is a parse error; ignore the token.
* If the current node is not a template element, then this is a parse
  error.
* Pop elements from the stack of open elements until a template element
  has been popped from the stack.
* If there is a template element on the stack of open elements, ignore
  the token.
* If the second element on the stack of open elements is not a body
  element, [...] or if there is a template element on the stack of open
  elements, then ignore the token.

And more.

But these cases aren't ambiguous:

* [L]et adjusted insertion location be inside the first element in the
  stack of open elements (the html element) ... [explanatory only]
* [I]t's possible for elements, the table element in this case in
  particular, to have been moved by a script around in the DOM ...
  [appears in a note]
* [A]ssociate the newly created element with the form element pointed
  to by the form element pointer
* Set the head element pointer to the newly created head element.
* If the parser was originally created for the HTML fragment parsing
  algorithm, then mark the script element as "already started".
* Pop the current node (which will be the head element) off the stack
  of open elements. [Appears twice]
* Pop the current node (which will be a noscript element) from the
  stack of open elements; the new current node will be a head element.
  [Appears twice]
* [F]or each attribute on the token, check to see if the attribute is
  already present on the body element (the second element) on the stack
  of open elements

And more.

As you can see, it's really only a few dozen ambiguous cases, not
thousands. Plus they all seem to follow one of these patterns:

* If the node is a so-and-so element
* While the node is a so-and-so element, a such-and-such element, etc.
* The last so-and-so element on the stack of open elements
* If there is a so-and-so element on the stack of open elements
* Until a so-and-so element has been popped from the stack
* If the list of active formatting elements contains a so-and-so element
* Have a so-and-so element in button scope, table scope, etc.

(One exception is "Create an html element whose ownerDocument is the
Document object.")

Moreover, where needed, a shortcut is to use "an HTML so-and-so element"
rather than "a so-and-so element in the HTML namespace". (This can
apply similarly to SVG and MathML.)

--Peter

-----Original Message-----
From: Ian Hickson
Sent: Thursday, August 01, 2013 1:31 PM
To: Peter Occil
Cc: WHATWG
Subject: Re: Namespaces and tag names in the HTML parser

On Wed, 10 Jul 2013, Peter Occil wrote:
> >
> > Short of explicitly putting "in the HTML namespace" at every
> > occurrence of this, I don't know how to fix this. Putting "in the HTML
> > namespace" everywhere is a non-starter, there's something like ten
> > thousand occurrences of element names in the spec. (Literally. Ten
> > thousand.)
>
> I don't mean in the entire HTML spec, I only mean within the tree
> construction section, and then only where it eliminates ambiguity, such
> as "while the current node is not a tr element or an html element", as I
> stated previously. I agree it's silly to include the words "in the HTML
> namespace" everywhere in the spec.

I don't really understand why that case is ambiguous, but thousands of
others aren't. Can you elaborate on what the difference is?

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
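
To illustrate the ambiguity discussed in this message, here is a minimal sketch of the two possible readings of "if there is a template element on the stack of open elements". It is not taken from the spec or from any parser implementation: the `Element` class, both helper functions, and the sample stack are hypothetical, and the sketch assumes a stack that happens to hold a `template` element in the SVG namespace.

```python
from dataclasses import dataclass
from typing import List

HTML_NS = "http://www.w3.org/1999/xhtml"
SVG_NS = "http://www.w3.org/2000/svg"


@dataclass(frozen=True)
class Element:
    """Hypothetical stand-in for a node on the stack of open elements."""
    local_name: str
    namespace: str


def has_element_by_name(stack: List[Element], name: str) -> bool:
    """Loose reading: match on the local name only."""
    return any(el.local_name == name for el in stack)


def has_html_element(stack: List[Element], name: str) -> bool:
    """Strict reading: match on the local name and the HTML namespace."""
    return any(
        el.local_name == name and el.namespace == HTML_NS for el in stack
    )


# Assumed stack for illustration: html > body > svg > template,
# where "template" was created in the SVG namespace.
stack = [
    Element("html", HTML_NS),
    Element("body", HTML_NS),
    Element("svg", SVG_NS),
    Element("template", SVG_NS),
]

print(has_element_by_name(stack, "template"))  # True
print(has_html_element(stack, "template"))     # False
```

The two readings disagree on this stack, which is the kind of case where wording such as "an HTML template element" would settle which check is meant.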
Received on Thursday, 1 August 2013 18:22:54 UTC