Re: Implied Context Parsing (DocumentFragment.innerHTML, or similar) proposal details to be sorted out from Tab Atkins Jr. on 2012-05-15 (public-webapps@w3.org from April to June 2012)

From: Tab Atkins Jr. <jackalmage@gmail.com>
Date: Tue, 15 May 2012 16:42:45 +0200
To: Henri Sivonen <hsivonen@iki.fi>
Cc: Rafael Weinstein <rafaelw@google.com>, Webapps WG <public-webapps@w3.org>, Yehuda Katz <wycats@gmail.com>, Scott González <scott.gonzalez@gmail.com>
Message-ID: <CAAWBYDB6NRz9WVzJQHME9=CMT_RwLP1E1DRAixuh=ZBndEgqFg@mail.gmail.com>

On Tue, May 15, 2012 at 12:46 PM, Henri Sivonen <hsivonen@iki.fi> wrote:
> On Fri, May 11, 2012 at 10:04 PM, Rafael Weinstein <rafaelw@google.com> wrote:
>> Issue 1: How to handle tokens which precede the first start tag
>>
>> Options:
>> a) Queue them, and then later run them through tree construction once
>> the implied context element has been picked
>>
>> b) Create a new insertion mode like "waiting for context element", which
>> probably ignores end tags and doctype and inserts character tokens and
>> comments. Once the implied context element is picked, reset the
>> insertion mode appropriately, and proceed normally.
>
> I prefer b).
>
> I'm assuming the use case for this stuff isn't that authors throw
> random stuff at the API and then insert the result somewhere. I expect
> authors to pass string literals or somewhat cooked string literals to
> the API knowing where they're going to insert the result but not
> telling the insertion point to the API as a matter of convenience.

Exactly correct.  That's precisely the jQuery use-case we're trying
to solve.

I'm totally fine with b) as well.
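
For illustration only, option (b) might be sketched roughly as below. The
Token shapes and the function name waitForContextElement are hypothetical
(nothing in the proposal names them), and a real tokenizer emits richer
tokens than this.

```typescript
// Hypothetical token shapes; assumptions for this sketch, not spec types.
type Token =
  | { kind: "doctype" }
  | { kind: "comment"; data: string }
  | { kind: "chars"; data: string }
  | { kind: "startTag"; name: string }
  | { kind: "endTag"; name: string };

interface PreContextResult {
  inserted: Token[];            // character/comment tokens kept in the fragment
  firstStartTag: string | null; // tag name that would drive context selection
  rest: Token[];                // tokens left for normal tree construction
}

// Sketch of a "waiting for context element" phase: ignore end tags and
// doctypes, insert characters and comments, stop at the first start tag.
function waitForContextElement(tokens: Token[]): PreContextResult {
  const inserted: Token[] = [];
  for (let i = 0; i < tokens.length; i++) {
    const t = tokens[i];
    switch (t.kind) {
      case "chars":
      case "comment":
        inserted.push(t); // inserted directly, no queueing needed
        break;
      case "endTag":
      case "doctype":
        break; // ignored before the context element is known
      case "startTag":
        // The first start tag picks the context; it and everything after
        // it go through tree construction in the reset insertion mode.
        return { inserted, firstStartTag: t.name, rest: tokens.slice(i) };
    }
  }
  return { inserted, firstStartTag: null, rest: [] };
}
```

The point of the sketch is that nothing has to be queued and replayed, which
is the difference from option (a).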


>> Issue 2: How to infer a non-HTML implied context element
>>
>> Options:
>> a) By tagName alone. When multiple namespaces match, prefer HTML, and
>> then either SVG or MathML (possibly on a per-tagName basis)
>>
>> b) Also inspect attributes for tagNames which may be in multiple namespaces
>
> AFAICT, the case where this really matters (if my assumptions about
> use cases are right) is <a>. (Fragment parsing makes scripts useless
> anyway by setting their "already started" flag, authors probably
> shouldn't be adding styles by parsing <style>, both HTML and SVG
> <font> are considered harmful, and cross-browser support for Content
> MathML is far off on the horizon.)

Yup, your assumptions are correct as far as I know.

> So I prefer a) possibly with <a>-specific elaborations if we can come
> up with some. Generic solutions seem to involve more complexity. For
> example, if we supported a generic attribute for forcing SVG
> interpretation, would it put us on a slippery slope to support it when
> it appears on tokens that aren't the first start tag token in a
> contextless fragment parse?

That wouldn't make sense, though.  If you see "<p> foo <a svg><rect
/></a>", you're already in an HTML context, and the svg stuff has to
be wrapped in an <svg> to form the airtight namespace seal.

But still, @svg is kinda a hacky solution.  Shrug.
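
As a rough sketch of option (a) for Issue 2, inference by tagName alone
could look like the following. The tag lists are deliberately incomplete
and purely illustrative, and inferNamespace is a hypothetical name, not
part of any proposal.

```typescript
type Namespace = "html" | "svg" | "mathml";

// Illustrative, incomplete tag lists (assumptions, not spec data).
const SVG_ONLY = new Set(["rect", "circle", "path", "g", "ellipse"]);
const MATHML_ONLY = new Set(["mi", "mo", "mn", "mrow", "mfrac"]);
// Names that exist in both HTML and SVG; option (a) resolves these to HTML.
const HTML_AND_SVG = new Set(["a", "script", "style", "font", "title"]);

// Pick a namespace for the first start tag using its name only.
function inferNamespace(tagName: string): Namespace {
  const name = tagName.toLowerCase();
  if (HTML_AND_SVG.has(name)) return "html"; // prefer HTML when ambiguous
  if (SVG_ONLY.has(name)) return "svg";
  if (MATHML_ONLY.has(name)) return "mathml";
  return "html"; // everything else is treated as HTML
}
```

Under this sketch a leading <a> always parses as HTML, which is exactly the
case where an <a>-specific elaboration or an explicit override would matter.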


>> Issue 3: What form does the API take
>>
>> a) Document.innerHTML
>>
>> b) document.parse()
>>
>> c) document.createDocumentFragment()
>
> I prefer b) because:
>  * It doesn't involve creating the fragment as a separate step.
>  * It doesn't need to be foolishly consistent with the HTML vs. XML
> design errors of innerHTML.
>  * It's shorter than document.createDocumentFragment().
>  * Unlike innerHTML, it is a method, so we can add more arguments
> later (or right away) to refine its behavior.

Possibly a second argument to force parsing in a particular way when
it's ambiguous would be useful.
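
A minimal sketch of what such a second argument might mean, assuming a
hypothetical options bag. ParseOptions, its namespace field, and
resolveAmbiguousTag are all invented here for illustration and do not
come from any spec or implementation.

```typescript
type ForcedNamespace = "html" | "svg";

// Hypothetical options bag for a document.parse()-style method.
interface ParseOptions {
  namespace?: ForcedNamespace; // how to read an ambiguous first start tag
}

// Tag names assumed to exist in both HTML and SVG (illustrative list).
const AMBIGUOUS = new Set(["a", "script", "style", "font", "title"]);

// Returns the namespace for an ambiguous first tag, or null when the tag
// is not ambiguous and the override therefore does not apply.
function resolveAmbiguousTag(
  tagName: string,
  options: ParseOptions = {}
): ForcedNamespace | null {
  if (!AMBIGUOUS.has(tagName.toLowerCase())) return null;
  return options.namespace ?? "html"; // HTML stays the default
}
```

Here the override is only consulted for the first start tag, so it would not
affect tokens that appear later in the fragment.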

~TJ

Received on Tuesday, 15 May 2012 14:43:36 UTC