- From: Jonas Sicking <jonas@sicking.cc>
- Date: Wed, 03 Dec 2008 17:11:52 -0800
- To: Ian Hickson <ian@hixie.ch>
- CC: Henri Sivonen <hsivonen@iki.fi>, Boris Zbarsky <bzbarsky@MIT.EDU>, HTML WG <public-html@w3.org>
Ian Hickson wrote:
>> Is what I described above not black-box equivalent to the steps that the
>> spec prescribes?
>
> I believe it is, though I wouldn't guarantee it.
>
>
> On Wed, 26 Nov 2008, Jonas Sicking wrote:
>> Why couldn't the spec instead say to use the ownerDocument of the
>> context node (like Henri is suggesting) and parse into a
>> documentFragment node? I.e. why do we need the new Document node and the
>> new <html> node?
>
> I guess we could do that, but what would it gain us? Implementations are
> free to optimise this anyway.

See your answer to the previous question :)

I.e. while it is possible to come up with something that is performant,
guaranteeing that it is exactly black-box equivalent to the spec is
hard. It is also quite possible that there are unintended edge cases that
would need either "unnecessary" extra code, or cost unintended perf hits,
just to ensure black-box equivalence with the spec algorithm.

> On Wed, 26 Nov 2008, Henri Sivonen wrote:
>> Why is there even a need for parsing into a document fragment? Would
>> mutation events or something of that nature go wrong if parsing directly
>> into the context node?
>
> We'd quench those anyway.
>
>> I did notice Boris' points about <base> and the form pointer in
>> mozilla.dev.platform. However, wouldn't it be feasible to set the form
>> pointer to the nearest form parent of the context node and not process
>> <base> in the fragment mode? Presumably, XLink autoloads and the load
>> event for SVG fragments would have to be suppressed, but that's not
>> worse than having to mark scripts as already executed.
>
> I think it would be possible; the question is would the benefit outweigh
> the cost. I'm not sure it would, from the spec's point of view. It's a lot
> easier to reason about what the spec means if it is clearly a separate
> document -- none of your questions come up, for example.

How so?
XLink autoloads could be interpreted as replacing the separate doc, and
then that new doc is what is inserted. And onload events need to be
defined if/when they parse anyway. For example, if they are defined to
fire while the new DOM is in a separate doc, then we would in fact be
forced to parse into a separate doc, since that is the DOM that such
event handlers would see. I.e. if I have something like

  foo.innerHTML = "<svg onload='alert(document.getElementsByTagName(\"*\"))'/>"

> On Wed, 26 Nov 2008, Boris Zbarsky wrote:
>> From a spec point of view the only obvious issue I see here is that the
>> mutation event behavior means the parser needs to take pains to produce
>> the same results as would be produced by the currently-specified
>> algorithm even in cases when mutation events rearrange the DOM.
>
> Surely we don't want any mutation events firing during innerHTML.

As others have pointed out, there are currently pages that depend on
mutation events firing when the new fragment is inserted.

Note that when I said that I'm pondering removing support for mutation
events entirely, that is going to be a page-breaking change. It's
something that would have to be rolled out over time, and not until we
have a good replacement for them.

>> <form id=outer>
>>   <div id=target></div>
>> </form>
>>
>> and someone setting
>>   target.innerHTML="<table><tr><td><form id='inner'><input id='c1'>" +
>>                    "</table><input id='c2'>"
>>
>> Which form should the two <input>s belong to.
>
> The inner one, per spec, I believe.

That is not what the current spec produces though. When the innerHTML is
first parsed, c2 is associated with the inner form. However, when the
nodes are then moved out of the temporary document, the form owner on c2
is reset to null. When the element is then inserted into the new
document, the form owner is reset again, this time to the outer form.

This would not be the case if the innerHTML markup were parsed directly
into the context node.
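The three steps described above can be sketched as a toy model in plain
JavaScript. This is only an illustration under simplifying assumptions:
plain objects stand in for DOM nodes, "resetting the form owner" is
modelled as "point at the nearest <form> ancestor, or null", and every
helper name (makeNode, append, detach, nearestForm, resetFormOwners) is
hypothetical rather than taken from the spec or from any browser.

```javascript
// Toy model: plain objects instead of real DOM nodes.
function makeNode(tag, id) {
  return { tag, id, parent: null, children: [], form: null };
}

function append(parent, child) {
  child.parent = parent;
  parent.children.push(child);
  return child;
}

function detach(node) {
  const siblings = node.parent.children;
  siblings.splice(siblings.indexOf(node), 1);
  node.parent = null;
}

// Nearest <form> ancestor, or null if there is none.
function nearestForm(node) {
  for (let p = node.parent; p !== null; p = p.parent) {
    if (p.tag === "form") return p;
  }
  return null;
}

// Simplified "reset the form owner" for every <input> in a subtree.
function resetFormOwners(node) {
  if (node.tag === "input") node.form = nearestForm(node);
  node.children.forEach(resetFormOwners);
}

const trace = {};

// Step 1: parse into a temporary document. The inner <form> is never
// closed, so the parser's form pointer still refers to it when c2 is
// created after </table>.
const tempRoot = makeNode("html", "temp");
const table = append(tempRoot, makeNode("table"));
const td = append(append(table, makeNode("tr")), makeNode("td"));
const inner = append(td, makeNode("form", "inner"));
const c1 = append(inner, makeNode("input", "c1"));
const c2 = append(tempRoot, makeNode("input", "c2"));
c1.form = inner;
c2.form = inner;
trace.afterParse = c2.form.id;

// Step 2: move the parsed nodes out of the temporary document.
for (const node of [table, c2]) {
  detach(node);
  resetFormOwners(node);
}
trace.afterRemoval = c2.form;

// Step 3: insert them under #target, which sits inside the outer form.
const outer = makeNode("form", "outer");
const target = append(outer, makeNode("div", "target"));
for (const node of [table, c2]) {
  append(target, node);
  resetFormOwners(node);
}
trace.afterInsert = c2.form.id;

console.log(trace, "c1 ->", c1.form.id);
```

In this model c1 stays with the inner form throughout, because the inner
form remains its ancestor, while c2 goes from the inner form to no owner
to the outer form.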
For what it's worth, I tried the above example in a few browsers:

Firefox doesn't create the inner <form> at all. The Firefox parser
always ignores a <form> tag inside another form, and since we build the
whole ancestor stack when setting up the context to parse innerHTML, this
applies here too. So both <input>s are associated with the outer form.

IE throws an exception when trying to set innerHTML. It seems to do so
any time you set innerHTML on an element that is inside a <form>, and
the innerHTML string contains a <form>.

Opera and Safari both associate c1 with the inner form and c2 with the
outer. Possibly due to parsing into a separate document or fragment and
then re-associating c2 when moving it from the document/fragment to the
main DOM.

>> I think the document.write()-safe points need to be enumerated. In the
>> other cases (which hopefully form an empty set), document.write() should
>> be a no-op. That is, I think the spec should either specifically make
>> the load event for <svg> a safe point for document.write() or it should
>> make document.write() a no-op if executed at that point. The fewer these
>> document.write()-safe points are, the better.
>
> I don't understand what you mean by "safe point". If you call
> document.write() from <svg>, then you'll blow away the document, since the
> insertion point won't have been defined.

Note that this is not how things work in current browsers. Calling
document.write from events etc. will append to the current document as
long as we're not past the point of having parsed the whole network
stream.

If we make any and all document.writes that happen outside of a <script>
replace the existing document, then I would expect pages to break in a
very severe way (i.e. the whole page disappears). A much safer strategy
would be to make document.writes that happen before we've reached the
end of the network stream, but without there being an explicit insertion
point, be a no-op.

/ Jonas
Received on Thursday, 4 December 2008 01:12:34 UTC