Re: Black-box equivalence of parsing fragments directly into context node from Jonas Sicking on 2008-11-26 (public-html@w3.org from November 2008)

From: Jonas Sicking <jonas@sicking.cc>
Date: Wed, 26 Nov 2008 14:54:55 -0800
To: Henri Sivonen <hsivonen@iki.fi>
CC: HTML WG <public-html@w3.org>
Message-ID: <492DD3BF.8050909@sicking.cc>

Henri Sivonen wrote:
> On Nov 26, 2008, at 23:28, Jonas Sicking wrote:
> 
>> Henri Sivonen wrote:
>>> I'm considering implementing HTML5 innerHTML setting in Gecko by 
>>> using the owner document of the context node as the document seen by 
>>> the parser and by sticking the context node as the first node on the 
>>> stack (but masking its name to show "html" to the tree builder in 
>>> order to avoid breaking the fragment algorithm assertions) and by 
>>> then running the fragment parsing algorithm without returning to the 
>>> event loop until done. The context node would be in the tree for the 
>>> entire time. I'd deflect attempts to add more attributes to the root 
>>> node upon stray <html> tag.
>>> Is there a reason why the spec doesn't prescribe this? Why does the 
>>> spec specify parsing into another document first and then moving the 
>>> nodes over? Is what I described above not black-box equivalent to the 
>>> steps that the spec prescribes?
>>
>> What do you mean by "masking its name to show 'html'"?
> 
> I meant creating a parser-internal stack node object that wraps the real 
> context node for passing to concrete tree builder methods but shows the 
> local name for "html" for abstract tree builder-internal comparisons. 
> However, I now see that my fragment case compares don't compare the 
> local name anyway (even though the spec talks about the "html" node) and 
> instead compare for stack position 0, so masking the name is moot.
> 
>> Why couldn't the spec instead say to use the ownerDocument of the 
>> context node (like Henri is suggesting) and parse into a 
>> documentFragment node? I.e. why do we need the new Document node and 
>> the new <html> node?
> 
> Why is there even a need for parsing into a document fragment? Would 
> mutation events or something of that nature go wrong if parsing directly 
> into the context node?

If we ensure that no events fire during the parsing then parsing 
directly into the context node should be fine.

Mutation events are always a pain (and I'm very seriously considering 
dropping support for them in Gecko) but largely undefined. So we can 
simply stay silent on them for now, or put in something informative, but 
talk with the WebApps WG so that, when they define them, they ensure that 
the events fire after all parsing is done and all nodes are inserted. I.e. 
setting innerHTML should be considered a 'compound operation'.
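
To make the 'compound operation' idea concrete, here is a minimal sketch 
in plain JavaScript. It is a toy model, not Gecko code and not a DOM API: 
CompoundOperation, notify, run and setChildren are hypothetical names 
invented for illustration. Notifications raised while the operation is in 
progress are queued and only delivered once every node is in place.

```javascript
// Toy model only: these names are assumptions, not real DOM or Gecko APIs.
class CompoundOperation {
  constructor() {
    this.depth = 0;     // nesting level of compound operations in progress
    this.pending = [];  // notifications held back until the operation ends
  }

  // Deliver immediately outside a compound operation, otherwise queue.
  notify(listener, detail) {
    if (this.depth > 0) {
      this.pending.push(() => listener(detail));
    } else {
      listener(detail);
    }
  }

  // Run `work` with notifications held back, then flush them in order.
  run(work) {
    this.depth++;
    try {
      work();
    } finally {
      this.depth--;
      if (this.depth === 0) {
        const queued = this.pending;
        this.pending = [];
        for (const fire of queued) fire();
      }
    }
  }
}

// Stand-in for an innerHTML setter: inserts all children as one operation.
function setChildren(op, parent, names, listener) {
  op.run(() => {
    for (const name of names) {
      parent.children.push(name);
      op.notify(listener, { inserted: name });
    }
  });
}

const parent = { children: [] };
const log = [];
setChildren(new CompoundOperation(), parent, ["option", "option"], (d) =>
  log.push(`${d.inserted} seen with ${parent.children.length} children`));
console.log(log);
```

In this sketch both listeners run only after the second child has been 
inserted, so each one observes the finished subtree rather than a 
half-built one.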

Another thing that can cause script to execute is, of course, <script> 
elements. However, for legacy reasons, <script> elements do not execute 
when inserted using .innerHTML.

I'm not sure if there are other things that can cause events to fire. 
For example, do 'change' events fire when parsing a <select> with 
multiple options selected? If so we'd need to define that such events 
don't fire until after the innerHTML setting is fully done.

> I did notice Boris' points about <base> and the form pointer in 
> mozilla.dev.platform. However, wouldn't it be feasible to set the form 
> pointer to the nearest form parent of the context node

Does anything need to be set at all? Normally during DOM operations 
form-controls are only associated with ancestor <form> elements. Why 
wouldn't the same thing be done here?

What exactly are the cases we're trying to address? I guess something like:

<form id=outer>
   <div id=target></div>
</form>

and someone setting
target.innerHTML="<table><tr><td><form id='inner'><input id='c1'>" +
                  "</table><input id='c2'>"

Which form should the two <input>s belong to?

Actually, in this case it might make a difference if the nodes are first 
parsed into a separate node and then moved to the context node, or if 
they are parsed directly into the context node. If parsing directly into 
the context node, 'c2' will probably be associated with 'inner'. However, 
I'm not sure if that is still the case if you then move the whole 
fragment out of its parent and insert it into 'target'.
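
The two possible outcomes can be sketched with a hand-built tree in plain 
JavaScript. This is only an illustration under stated assumptions: node, 
ancestorForm and parserForms are hypothetical helpers, the tree is written 
out by hand rather than produced by a real parser, and the "parser" rule 
assumes a form pointer that is set at each <form> start tag and never 
cleared because the markup contains no </form>. Real implementations may 
behave differently.

```javascript
// Toy node shape (an assumption, not a DOM API).
function node(name, id, parent) {
  const n = { name, id, parent, children: [] };
  if (parent) parent.children.push(n);
  return n;
}

// The tree that the example markup is assumed to produce under 'target'.
const outer = node("form", "outer", null);
const target = node("div", "target", outer);
const table = node("table", null, target);
const tbody = node("tbody", null, table);
const tr = node("tr", null, tbody);
const td = node("td", null, tr);
const inner = node("form", "inner", td);
const c1 = node("input", "c1", inner);
const c2 = node("input", "c2", target); // lands after the closed table

// Rule A: DOM-style association with the nearest ancestor <form>.
function ancestorForm(n) {
  for (let p = n.parent; p; p = p.parent) {
    if (p.name === "form") return p;
  }
  return null;
}

// Rule B: parser-style association via a form pointer, walking the nodes
// in the order in which they would have been created.
function parserForms(root) {
  const owners = new Map();
  let formPointer = null;
  (function visit(n) {
    if (n.name === "form") formPointer = n;
    if (n.name === "input") owners.set(n.id, formPointer ? formPointer.id : null);
    n.children.forEach(visit);
  })(root);
  return owners;
}

console.log("ancestor rule:", ancestorForm(c1).id, ancestorForm(c2).id);
console.log("parser rule:", parserForms(target).get("c1"), parserForms(target).get("c2"));
```

Under these assumptions the ancestor rule gives 'c2' to 'outer', while the 
form-pointer rule gives it to 'inner', which is exactly the discrepancy in 
question.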

> and not process 
> <base> in the fragment mode?

Why would you do anything special for <base> at all? I'd think that if 
setting .innerHTML results in one or more <base> elements getting 
inserted, you'd just act the same way as if they had been inserted using 
.appendChild.

> Presumably, XLink autoloads and the load 
> event for SVG fragments would have to be suppressed, but that's not 
> worse than having to mark scripts as already executed.

Yeah. IMHO both those features should be removed from the web platform.

/ Jonas
Received on Wednesday, 26 November 2008 22:53:41 UTC