- From: Rafael Weinstein <rafaelw@google.com>
- Date: Wed, 16 May 2012 16:29:28 -0700
- To: Yehuda Katz <wycats@gmail.com>
- Cc: Henri Sivonen <hsivonen@iki.fi>, Webapps WG <public-webapps@w3.org>, "Tab Atkins Jr." <jackalmage@gmail.com>, Scott González <scott.gonzalez@gmail.com>
Ok. I think I'm convinced on all points.

I've uploaded a WebKit patch which implements what we've agreed on here:

https://bugs.webkit.org/show_bug.cgi?id=84646

I'm happy to report that this patch is nicer than the queued-token
approach. Good call, Henri.

On Tue, May 15, 2012 at 9:39 PM, Yehuda Katz <wycats@gmail.com> wrote:
>
> Yehuda Katz
> (ph) 718.877.1325
>
>
> On Tue, May 15, 2012 at 6:46 AM, Henri Sivonen <hsivonen@iki.fi> wrote:
>>
>> On Fri, May 11, 2012 at 10:04 PM, Rafael Weinstein <rafaelw@google.com>
>> wrote:
>> > Issue 1: How to handle tokens which precede the first start tag
>> >
>> > Options:
>> > a) Queue them, and then later run them through tree construction once
>> > the implied context element has been picked
>> >
>> > b) Create a new insertion mode like "waiting for context element", which
>> > probably ignores end tags and doctype and inserts character tokens and
>> > comments. Once the implied context element is picked, reset the
>> > insertion mode appropriately, and proceed normally.
>>
>> I prefer b).
>
>
> I like b as well. I assume it means that the "waiting for context element"
> insertion mode would keep scanning until the ambiguity was resolved, and
> then enter the appropriate insertion mode. Am I misunderstanding?

I think what Yehuda is getting at here is that there are a handful of
tags which are allowed to appear anywhere, so it doesn't make sense to
"resolve the ambiguity" based on their identity. I talked with Tab
about this, and happily, that set seems to be <style>, <script>,
<meta>, & <link>.

Happily, because this means that the new "ImpliedContext" insertion
mode can handle start tags as follows (code from the above patch):

    if (token.name() == styleTag || token.name() == scriptTag
        || token.name() == metaTag || token.name() == linkTag) {
        processStartTagForInHead(token); // "process following the rules for the "in head" insertion mode"
        return;
    }

    m_fragmentContext.setContextTag(getImpliedContextTag(token.name())); // "set the context element"
    resetInsertionModeAppropriately(); // "reset the insertion mode appropriately"
    processStartTag(token); // "reprocess the token"

>
>>
>>
>> I'm assuming the use case for this stuff isn't that authors throw
>> random stuff at the API and then insert the result somewhere. I expect
>> authors to pass string literals or somewhat cooked string literals to
>> the API knowing where they're going to insert the result but not
>> telling the insertion point to the API as a matter of convenience.
>>
>> If you know you are planning to insert stuff as a child of tbody,
>> don't start your string literal with stuff that would tokenize as
>> characters!
>>
>> (Firefox currently does not have the capability to queue tokens.
>> Speculative parsing in Firefox is not based on queuing tokens. See
>> https://developer.mozilla.org/en/Gecko/HTML_parser_threading for the
>> details.)
>>
>> > Issue 2: How to infer a non-HTML implied context element
>> >
>> > Options:
>> > a) By tagName alone. When multiple namespaces match, prefer HTML, and
>> > then either SVG or MathML (possibly on a per-tagName basis)
>> >
>> > b) Also inspect attributes for tagNames which may be in multiple
>> > namespaces
>>
>> AFAICT, the case where this really matters (if my assumptions about
>> use cases are right) is <a>.
>> (Fragment parsing makes scripts useless
>> anyway by setting their "already started" flag, authors probably
>> shouldn't be adding styles by parsing <style>, both HTML and SVG
>> <font> are considered harmful and cross-browser support for Content
>> MathML is far off in the horizon.)
>>
>> So I prefer a) possibly with <a>-specific elaborations if we can come
>> up with some. Generic solutions seem to involve more complexity. For
>> example, if we supported a generic attribute for forcing SVG
>> interpretation, would it put us on a slippery slope to support it when
>> it appears on tokens that aren't the first start tag token in a
>> contextless fragment parse?
>>
>> > Issue 3: What form does the API take
>> >
>> > a) Document.innerHTML
>> >
>> > b) document.parse()
>> >
>> > c) document.createDocumentFragment()
>>
>> I prefer b) because:
>> * It doesn't involve creating the fragment as a separate step.
>> * It doesn't need to be foolishly consistent with the HTML vs. XML
>> design errors of innerHTML.
>> * It's shorter than document.createDocumentFragment().
>> * Unlike innerHTML, it is a method, so we can add more arguments
>> later (or right away) to refine its behavior.
>>
>> --
>> Henri Sivonen
>> hsivonen@iki.fi
>> http://hsivonen.iki.fi/
>
>
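
As a rough illustration of the start-tag rule described in this message, the following standalone C++ sketch scans start tag names, skips the four tags that may appear anywhere, and picks an implied context from the first remaining tag. The function names (isContextNeutral, impliedContextTag, resolveContext), the tag-to-context table, and the "body" fallback are assumptions made for illustration only; they are not taken from the WebKit patch, whose getImpliedContextTag may map tags differently.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Tags the thread treats as allowed anywhere; they never decide the context.
bool isContextNeutral(const std::string& tagName) {
    static const std::unordered_set<std::string> neutral = {
        "style", "script", "meta", "link"};
    return neutral.count(tagName) > 0;
}

// Hypothetical mapping from a start tag to its implied context element.
// The entries and the "body" default are illustrative assumptions.
std::string impliedContextTag(const std::string& tagName) {
    static const std::unordered_map<std::string, std::string> contextFor = {
        {"td", "tr"},         {"th", "tr"},          {"tr", "tbody"},
        {"tbody", "table"},   {"thead", "table"},    {"tfoot", "table"},
        {"caption", "table"}, {"colgroup", "table"}, {"col", "colgroup"}};
    auto it = contextFor.find(tagName);
    if (it == contextFor.end())
        return std::string("body");
    return it->second;
}

// Scan start tag names in document order and return the implied context
// chosen by the first tag that is not context-neutral.
std::string resolveContext(const std::vector<std::string>& startTags) {
    for (const std::string& tagName : startTags) {
        assert(!tagName.empty()); // precondition: tokenizer never emits empty names
        if (isContextNeutral(tagName))
            continue; // would be processed by the "in head" rules instead
        return impliedContextTag(tagName);
    }
    return std::string("body"); // assumed fallback when nothing decides
}
```

Under this hypothetical table, resolveContext({"style", "tr"}) skips <style> and returns "tbody", while a fragment containing only <script> and <meta> falls back to "body".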
Received on Wednesday, 16 May 2012 23:29:58 UTC