- From: Simon Pieters <simonp@opera.com>
- Date: Fri, 04 Oct 2013 13:14:14 +0200
- To: "Michael[tm] Smith" <mike@w3.org>
- Cc: "Henri Sivonen" <hsivonen@iki.fi>, "www-archive@w3.org" <www-archive@w3.org>
On Fri, 04 Oct 2013 08:31:15 +0200, Michael[tm] Smith <mike@w3.org> wrote:

> Hi Simon,
>
> I'd be game for taking a shot at implementing this as an additional
> parser mode, if Henri thinks it's a good idea.

Cool.

I noticed some changes below are not necessary. It seems that v.nu in
streaming mode already violates the spec when it comes to comments after
</body>.

http://qa-dev.w3.org:8888/parsetree/?parser=html5&content=<%21doctype+html><body><%2Fbody><%21---->+&submit=Print+Tree

We can take the same approach with a comment after </head> -- just insert
it in head.

>> Needs more tweaking around frameset if checking frameset documents is
>> desired.

Looks like the missing piece is just handling a comment in "after after
frameset".

>> http://www.whatwg.org/specs/web-apps/current-work/multipage/tree-construction.html#the-after-head-insertion-mode
>>
>>   A character token that is one of U+0009 CHARACTER TABULATION, U+000A
>>   LINE FEED (LF), U+000C FORM FEED (FF), U+000D CARRIAGE RETURN (CR),
>>   or U+0020 SPACE
>> - Insert the character.
>> + Ignore the token.

Revert this (it would insert to head, which is fine).

>>   A comment token
>> - Insert a comment.
>> + Ignore the token.

Revert this (it would insert to head, which is fine).

>>   A start tag whose tag name is "html"
>> - Process the token using the rules for the "in body" insertion mode.
>> + Ignore the token.

Revert this (in body would also ignore the token).

>>   A character token that is one of U+0009 CHARACTER TABULATION, U+000A
>>   LINE FEED (LF), U+000C FORM FEED (FF), U+000D CARRIAGE RETURN (CR), or
>>   U+0020 SPACE
>> - Process the token using the rules for the "in body" insertion mode.
>> + Ignore the token.

Revert this.

>>   A comment token
>> - Insert a comment as the last child of the first element in the stack
>> - of open elements (the html element).
>> + Ignore the token.

Instead:

+ Process the token using the rules for the "in body" insertion mode.
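To illustrate why routing these comments through "in body" suits a
streaming parser, here is a minimal Python sketch. It is not v.nu code: the
class StreamingSketch, its methods, and the event tuples are hypothetical
names made up for this example, and the model assumes only that events are
emitted once and never revised. Since body stays on the stack of open
elements after </body>, a later comment can only be emitted inside the
current node without first closing body.

```python
class StreamingSketch:
    """Hypothetical append-only tree builder (illustration only)."""

    def __init__(self):
        self.open_elements = []  # stack of open elements
        self.events = []         # SAX-like events, never rewritten
        self.mode = "in body"

    def start_tag(self, name):
        self.open_elements.append(name)
        self.events.append(("start", name))

    def end_body(self):
        # </body> only switches the insertion mode; body stays on the stack.
        self.mode = "after body"

    def comment(self, data):
        # Same handling in every mode, mirroring the "in body" rule:
        # the comment is emitted inside the current node, so no
        # already-open ancestor has to be closed early.
        self.events.append(("comment", self.open_elements[-1], data))

    def eof(self):
        # Close whatever is still open, innermost first.
        while self.open_elements:
            self.events.append(("end", self.open_elements.pop()))


def demo():
    # Roughly <!doctype html><body></body><!---->
    builder = StreamingSketch()
    builder.start_tag("html")
    builder.start_tag("body")
    builder.end_body()
    builder.comment("")
    builder.eof()
    return builder.events
```

In this sketch the comment is reported as a child of body rather than of
html, which is the deviation from the spec discussed above.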
>> http://www.whatwg.org/specs/web-apps/current-work/multipage/tree-construction.html#the-after-after-body-insertion-mode
>>
>>   A comment token
>> - Insert a comment as the last child of the Document object.
>> + Ignore the token.

Instead:

+ Process the token using the rules for the "in body" insertion mode.

>>   A DOCTYPE token
>> - A character token that is one of U+0009 CHARACTER TABULATION, U+000A
>> - LINE FEED (LF), U+000C FORM FEED (FF), U+000D CARRIAGE RETURN (CR),
>> - or U+0020 SPACE
>>   A start tag whose tag name is "html"
>>   Process the token using the rules for the "in body" insertion mode.
>>
>> + A character token that is one of U+0009 CHARACTER TABULATION, U+000A
>> + LINE FEED (LF), U+000C FORM FEED (FF), U+000D CARRIAGE RETURN (CR),
>> + or U+0020 SPACE
>> + Ignore the token.

Revert this.

So, new version, doing the above and fixing frameset:

http://www.whatwg.org/specs/web-apps/current-work/multipage/tree-construction.html#parsing-main-inhead

  An end tag whose tag name is "head"
- Pop the current node (which will be the head element) off the stack of
- open elements.

  Anything else
- Pop the current node (which will be the head element) off the stack of
- open elements.

http://www.whatwg.org/specs/web-apps/current-work/multipage/tree-construction.html#the-after-head-insertion-mode

  A start tag whose tag name is "body"
+ Pop the current node (which will be the head element) off the stack of
+ open elements.

  A start tag whose tag name is "frameset"
+ Pop the current node (which will be the head element) off the stack of
+ open elements.

  A start tag whose tag name is one of: "base", "basefont", "bgsound",
  "link", "meta", "noframes", "script", "style", "template", "title"
  Parse error.
- Push the node pointed to by the head element pointer onto the stack of
- open elements.
  Process the token using the rules for the "in head" insertion mode.
- Remove the node pointed to by the head element pointer from the stack
- of open elements.
  (It might not be the current node at this point.)

  Anything else
+ Pop the current node (which will be the head element) off the stack of
+ open elements.

http://www.whatwg.org/specs/web-apps/current-work/multipage/parsing.html#reconstruct-the-active-formatting-elements

- 1. If there are no entries in the list of active formatting elements,
-    then there is nothing to reconstruct; stop this algorithm.
+ 1. Stop this algorithm.

(This isn't necessary for streaming, but is nice for not flooding errors
about a typoed formatting end tag.)

http://www.whatwg.org/specs/web-apps/current-work/multipage/tree-construction.html#parsing-main-inbody

  A start tag whose tag name is "html"
  Parse error.
- If there is a template element on the stack of open elements, then
- ignore the token.
- Otherwise, for each attribute on the token, check to see if the
- attribute is already present on the top element of the stack of open
- elements. If it is not, add the attribute and its corresponding value
- to that element.
+ Ignore the token.

  A start tag whose tag name is "body"
  Parse error.
  If the second element on the stack of open elements is not a body
  element, if the stack of open elements has only one node on it, or if
  there is a template element on the stack of open elements, then ignore
  the token. (fragment case)
- Otherwise, set the frameset-ok flag to "not ok"; then, for each
- attribute on the token, check to see if the attribute is already
- present on the body element (the second element) on the stack of open
- elements, and if it is not, add the attribute and its corresponding
- value to that element.
+ Ignore the token.

  A start tag whose tag name is "frameset"
  Parse error.
- If the stack of open elements has only one node on it, or if the
- second element on the stack of open elements is not a body element,
- then ignore the token. (fragment case)
- If the frameset-ok flag is set to "not ok", ignore the token.
- Otherwise, run the following steps:
- Remove the second element on the stack of open elements from its
- parent node, if it has one.
- Pop all the nodes from the bottom of the stack of open elements, from
- the current node up to, but not including, the root html element.
- Insert an HTML element for the token.
- Switch the insertion mode to "in frameset".
+ Ignore the token.

http://www.whatwg.org/specs/web-apps/current-work/multipage/tree-construction.html#adoption-agency-algorithm

- 2. Let outer loop counter be zero.
+ 2. Stop this algorithm.

http://www.whatwg.org/specs/web-apps/current-work/multipage/tree-construction.html#foster-parent

- 7. Let adjusted insertion location be inside previous element, after
-    its last child (if any).
+ 7. Let adjusted insertion location be inside target, after its last
+    child (if any).

http://www.whatwg.org/specs/web-apps/current-work/multipage/tree-construction.html#parsing-main-afterbody

  A comment token
- Insert a comment as the last child of the first element in the stack
- of open elements (the html element).
+ Process the token using the rules for the "in body" insertion mode.

http://www.whatwg.org/specs/web-apps/current-work/multipage/tree-construction.html#the-after-after-body-insertion-mode

  A comment token
- Insert a comment as the last child of the Document object.
+ Process the token using the rules for the "in body" insertion mode.

http://www.whatwg.org/specs/web-apps/current-work/multipage/tree-construction.html#the-after-after-frameset-insertion-mode

  A comment token
- Insert a comment as the last child of the Document object.
+ Process the token using the rules for the "in body" insertion mode.

-- 
Simon Pieters
Opera Software
Received on Friday, 4 October 2013 11:14:48 UTC