- From: Jonas Sicking <jonas@sicking.cc>
- Date: Thu, 19 Nov 2009 00:59:13 -0800
- To: Geoffrey Sneddon <gsneddon@opera.com>
- Cc: Henri Sivonen <hsivonen@iki.fi>, Ian Hickson <ian@hixie.ch>, public-html@w3.org, pjt47@cam.ac.uk
On Wed, Nov 18, 2009 at 8:41 AM, Geoffrey Sneddon <gsneddon@opera.com> wrote:
> Henri Sivonen wrote:
>>
>> On Nov 13, 2009, at 14:15, Geoffrey Sneddon wrote:
>>
>>> Henri Sivonen wrote:
>>>>
>>>> On Nov 13, 2009, at 12:06, Geoffrey Sneddon wrote:
>>>>>
>>>>> However, I think that such implementations are probably more important
>>>>> in terms of the structure of the DOM created (because they are more likely
>>>>> to support scripting), and as such it seems silly to have anything apart
>>>>> from a single text node in all cases, especially when such implementations
>>>>> can likely have a single text node backed by multiple strings internally.
>>>>
>>>> It's not necessarily silly not to require browsers to coalesce in all
>>>> cases. Would you make parser-inserted text nodes coalesce into
>>>> script-created text nodes or parser-created older-than-previous text nodes
>>>> that a script has moved around?
>>>
>>> No, but I would expect the parser (without executing any script) to
>>> always create a DOM with no adjacent text nodes. If you start manually
>>> manipulating the DOM via scripting I'd expect to end up with the DOM I
>>> created (e.g., if I appendChild a text node I would expect a text node to be
>>> appended, I wouldn't expect, ever, to get a single text node if there was
>>> already a text node as the last child).
>>
>> That wasn't quite the case I was asking about. Concretely, I was asking
>> about the following (illustrated here as document.write but I'm also asking
>> about the case where the document.write boundaries are network buffer
>> boundaries instead):
>> document.write("<div id=thediv>");
>>
>> document.getElementById('thediv').appendChild(document.createTextNode("foo"));
>> document.write("bar");
>>
>> One text node with data "foobar" or two text nodes: "foo" followed by
>> "bar"? Does it matter?
>
> I would expect that to create a single text node. I would intuitively expect
> the parser to append if the last child is a text node.
> If you reversed the last two lines I'd expect to get two.

I think this adds unnecessary complexity and performance cost to the
parser. The intuitive (to me) implementation is for the parser to keep
a reference to the last textnode it has inserted. Whenever more text
data is parsed, append the text to that textnode. Whenever non-text
data is parsed, drop the reference to the textnode. Whenever text data
comes in and the parser doesn't hold a reference to a textnode, create
a new textnode and append it to the end of the current insertion
container.

The only case where this breaks down is if someone mixes DOM insertion
with document.write or network-parser inserted content. Mixing
document.write with DOM insertions seems like a very odd coding
pattern to me, so I don't see a reason to optimize for it. Mixing
network-parser and DOM insertion is inherently racy, so here it's
arguably even beneficial if parser-created nodes and DOM-created nodes
are never merged.

>> document.write("<div id=thediv>");
>> document.write("foo");
>> document.write("bar");
>>
>> One text node with data "foobar" or two text nodes: "foo" followed by
>> "bar"? Does it matter?
>
> One text node.

I agree. Though note that after the second call to document.write the
textnode must exist in the DOM.

> Another interesting example:
>
> document.write("<div id=thediv>1");
> document.getElementById('thediv').appendChild(document.createTextNode("2"));
> document.write("3");
>
> Current behaviour here is interesting:
>
> Firefox gives two text nodes, the first containing "13" and the second
> containing "2" (this seems counter-intuitive, so I hope we can all agree
> that this shouldn't be done).

I disagree; see above.

> WebKit gives two text nodes also, the first containing "2" and the second
> containing "13" (i.e., the same as Firefox but in the opposite order; this
> too seems counter-intuitive).

The reverse order here is surprising to me.
Is there a textnode for the "1" in the DOM after the first
document.write?

>> In foster-parenting cases, that's not enough. Consider:
>> <table><tr>f<td>c</td>f
>>
>> Here, when the second 'f' is foster-parented, the cell content 'c' is the
>> text node the parser inserted last. Now, if foster-parenting examines the
>> DOM to see if the foster parent already has a text node previous sibling (in
>> order to merely extend that text node), the previous sibling could be
>> script-created.

Or even worse:

document.write("<table> ");
document.write("x");
document.write("<td></table>");

After the first document.write, is there a textnode in the DOM for the
whitespace? Where does it appear? Is it foster-parented or not?

After the second document.write, where is the "x" inserted? In its own
textnode or somehow coalesced with the whitespace node?

/ Jonas
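
For illustration only, the "keep a reference to the last textnode the
parser inserted" strategy described above can be sketched as a toy
model in plain JavaScript. This is a rough sketch under stated
assumptions, not any browser's parser: TextNode, ElementNode,
SketchParser and textData are hypothetical names, there is no real
DOM or tokenizer, and foster parenting is not modelled at all.

```javascript
// Toy stand-ins for DOM nodes (hypothetical, not a browser API).
class TextNode {
  constructor(data) { this.data = data; }
}
class ElementNode {
  constructor(name) { this.name = name; this.children = []; }
  appendChild(node) { this.children.push(node); return node; }
}

// Hypothetical parser front end: it is told directly whether the
// incoming data is text or a start tag (no tokenizer is modelled).
class SketchParser {
  constructor(container) {
    this.container = container; // current insertion container
    this.lastText = null;       // last textnode this parser inserted
  }
  // Text data parsed: extend the remembered textnode, or create one
  // at the end of the current insertion container.
  text(data) {
    if (this.lastText === null) {
      this.lastText = this.container.appendChild(new TextNode(data));
    } else {
      this.lastText.data += data;
    }
  }
  // Non-text data parsed: drop the reference, then insert the element.
  startTag(name) {
    this.lastText = null;
    this.container = this.container.appendChild(new ElementNode(name));
    return this.container;
  }
}

// Helper: the data of each child textnode, in document order.
function textData(el) { return el.children.map((c) => c.data); }

// The "1" / "2" / "3" example quoted above.
const body = new ElementNode("body");
const parser = new SketchParser(body);
const div = parser.startTag("div");  // document.write("<div id=thediv>1")
parser.text("1");
div.appendChild(new TextNode("2"));  // script-side appendChild
parser.text("3");                    // document.write("3")
console.log(textData(div));          // two textnodes: "13", then "2"
```

Under these assumptions the parser never inspects the container's
last child, so the script-created "2" is never merged with
parser-created text, and the "3" lands in the textnode the parser
already holds.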
Received on Thursday, 19 November 2009 09:00:10 UTC