[whatwg] [webcomponents] Template element parser changes => Proposal for adding DocumentFragment.innerHTML from Ian Hickson on 2012-05-04 (public-whatwg-archive@w3.org from May 2012)

From: Ian Hickson <ian@hixie.ch>
Date: Fri, 4 May 2012 22:26:07 +0000 (UTC)
Message-ID: <Pine.LNX.4.64.1205042214350.9926@ps20323.dreamhostps.com>

On Fri, 4 May 2012, Rafael Weinstein wrote:
> On Fri, May 4, 2012 at 2:46 PM, Ian Hickson <ian at hixie.ch> wrote:
> > On Fri, 4 May 2012, Rafael Weinstein wrote:
> >>
> >> This is the current proposal:
> >>
> >> http://lists.w3.org/Archives/Public/public-webapps/2012AprJun/0334.html
> >
> > I don't really understand the proposal.
> >
> > How does it relate to the template feature?
> 
> The contents of <template> need to parse context-free (or implied 
> context, or whatever). This adds the notion to HTML parsing so that 
> <template> can use it.
> 
> e.g. <template><tr><td>Foo</td></tr></template>

I don't understand how this would work in the parser. The parser doesn't 
have a "context element" concept; that exists only for fragment parsing. If 
you reset the insertion mode in the parser, it uses the stack of open 
elements, whose current node would always be a <template> element in this 
case when you parse the <tr>.
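
To make the objection concrete, here is a rough sketch of the "reset the 
insertion mode" walk over the stack of open elements. It is a simplified 
paraphrase, not the spec text: the function name, the use of plain tag-name 
strings for elements, and the reduced set of cases are all assumptions made 
for illustration.

```javascript
// Simplified sketch of resetting the insertion mode.
// stack: tag names of the open elements, root first (assumed representation).
// contextElement: tag name supplied only for fragment parsing, otherwise null.
function resetInsertionMode(stack, contextElement) {
  const modes = {
    select: 'in select',
    tr: 'in row',
    tbody: 'in table body',
    thead: 'in table body',
    tfoot: 'in table body',
    caption: 'in caption',
    colgroup: 'in column group',
    table: 'in table',
    body: 'in body',
    frameset: 'in frameset',
  };
  // Walk from the current node (end of the stack) back toward the root.
  for (let i = stack.length - 1; i >= 0; i--) {
    let node = stack[i];
    const last = i === 0;
    // Only the fragment case substitutes the context element for the root.
    if (last && contextElement) node = contextElement;
    if ((node === 'td' || node === 'th') && !last) return 'in cell';
    if (Object.prototype.hasOwnProperty.call(modes, node)) return modes[node];
    if (node === 'head') return 'in body'; // fragment case
    if (node === 'html') return 'before head'; // fragment case
    if (last) return 'in body';
  }
  return 'in body';
}

// Normal parsing with <template> as the current node: nothing table-like is
// on the stack, so the walk falls through to <body>.
console.log(resetInsertionMode(['html', 'body', 'template'], null));
// Fragment parsing with an explicit <tbody> context element.
console.log(resetInsertionMode(['html'], 'tbody'));
```

In this sketch the first call yields "in body" and the second yields 
"in table body", which is the gap being described: only the fragment case 
has a context element to consult.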

> > What does it do in the case of:
> >
> >   var frag = document.createDocumentFragment();
> >   frag.innerHTML = 'bla bla .. 1GB of text .. bla <caption> bla' ?
> 
> Queue up pending tokens until you see the first start tag token or the
> end of file. The webkit implementation is here:
> 
> https://bugs.webkit.org/attachment.cgi?id=140125&action=review

So:

   frag.innerHTML = 'bla bla .. 1GB of text .. bla <caption> bla';

...results in a document fragment with one node containing " bla", while:

   frag.innerHTML = 'bla bla .. 1GB of text .. bla <caqtion> bla';

...results in a document fragment with a 1GB text node, an unknown element 
<caqtion>, and another text node?

That seems pretty weird.
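
A minimal sketch of why the two strings diverge, assuming the context is 
picked from the name of the first start tag. The mapping table, the function 
name, and the regular expression (a crude stand-in for the tokeniser that 
would also match inside comments) are hypothetical and not taken from the 
proposal or the WebKit patch.

```javascript
// Hypothetical mapping from the first start tag's name to an implied
// context element. Anything not listed falls back to <body>.
const IMPLIED_CONTEXT = new Map([
  ['caption', 'table'],
  ['colgroup', 'table'],
  ['thead', 'table'],
  ['tbody', 'table'],
  ['tfoot', 'table'],
  ['col', 'colgroup'],
  ['tr', 'tbody'],
  ['td', 'tr'],
  ['th', 'tr'],
]);

function impliedContext(markup) {
  // Find the first thing that looks like a start tag. Everything before it
  // has to be scanned (and, in a real parser, buffered) first.
  const match = /<([a-zA-Z][^\s/>]*)/.exec(markup);
  if (!match) return 'body';
  const name = match[1].toLowerCase();
  return IMPLIED_CONTEXT.get(name) || 'body';
}

console.log(impliedContext('bla bla .. bla <caption> bla'));
console.log(impliedContext('bla bla .. bla <caqtion> bla'));
console.log(impliedContext('<tr></tr><div></div>'));
```

Under these assumptions a single changed letter late in the input switches 
the context for all of the text before it from <table> to <body>.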

> > Why do we imply a tbody if the input is "<tr></tr><div></div>"?
> 
> Because there's nothing better to do.

I think almost anything else would be better. :-)

In particular, I think having the output be a <tr> element and <div> 
element as siblings would be better, as would having the output be just a 
<tr> element or just a <div> element.

> > Since you need the context element to know how to initialise the 
> > tokeniser, how do you find the first tag?
> 
> You always start in the DATA state. Can you think of a case where this 
> won't work?

You describe the change as a "mere addition", but it sounds much more 
invasive than that if you're going to assume a context element and then 
change it later.

It sounds like what you're really proposing is not to change the context 
element but to have the parser start off in some new mode where we just 
wait for the first open tag, and then we do some substitution to get a 
surrogate node, and try to reset based on that surrogate node's name 
instead of the stack of open elements.
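
The mode described above could be sketched roughly as follows. This is an 
illustration of the reading given here, not the proposed algorithm: the 
token objects, the function name, and the chooseSurrogate callback are all 
hypothetical.

```javascript
// Hold every token until the first start tag (or end of file) decides the
// surrogate node, then release the held tokens followed by the rest.
// tokens: array of { type: 'character' | 'startTag' | 'eof', ... } (assumed shape).
// chooseSurrogate: hypothetical callback mapping a tag name (or null) to a
// surrogate element name.
function runWithPendingQueue(tokens, chooseSurrogate) {
  const pending = [];
  const delivered = [];
  let surrogate = null;
  let maxPending = 0;
  for (const token of tokens) {
    if (surrogate !== null) {
      // Decision already made: pass tokens straight through.
      delivered.push(token);
      continue;
    }
    if (token.type !== 'startTag' && token.type !== 'eof') {
      // Nothing to decide on yet, so the token has to wait.
      pending.push(token);
      maxPending = pending.length;
      continue;
    }
    surrogate = chooseSurrogate(token.type === 'startTag' ? token.name : null);
    for (const held of pending) delivered.push(held);
    pending.length = 0;
    delivered.push(token);
  }
  return { surrogate, delivered, maxPending };
}

const result = runWithPendingQueue(
  [
    { type: 'character', data: 'bla bla' },
    { type: 'startTag', name: 'caption' },
    { type: 'character', data: ' bla' },
    { type: 'eof' },
  ],
  (name) => (name === 'caption' ? 'table' : 'body')
);
console.log(result.surrogate, result.delivered.length, result.maxPending);
```

The maxPending count is the part that grows with the amount of text before 
the first tag.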

That seems pretty weird to me, but certainly isn't the weirdest thing 
that's been proposed.

Do we have a page or e-mail somewhere that documents all the cases we're 
trying to support?

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

Received on Friday, 4 May 2012 15:26:07 UTC