- From: Geoffrey Sneddon <gsneddon@opera.com>
- Date: Wed, 18 Nov 2009 17:41:26 +0100
- To: Henri Sivonen <hsivonen@iki.fi>
- CC: Ian Hickson <ian@hixie.ch>, public-html@w3.org, pjt47@cam.ac.uk
Henri Sivonen wrote:
> On Nov 13, 2009, at 14:15, Geoffrey Sneddon wrote:
>
>> Henri Sivonen wrote:
>>> On Nov 13, 2009, at 12:06, Geoffrey Sneddon wrote:
>>>> However, I think that such implementations are probably more important in terms of the structure of the DOM created (because they are more likely to support scripting), and as such it seems silly to have anything apart from a single text node in all cases, especially when such implementations can likely have a single text node backed by multiple strings internally.
>>> It's not necessarily silly not to require browsers to coalesce in all cases. Would you make parser-inserted text nodes coalesce into script-created text nodes or parser-created older-than-previous text nodes that a script has moved around?
>> No, but I would expect the parser (without executing any script) to always create a DOM with no adjacent text nodes. If you start manually manipulating the DOM via scripting I'd expect to end up with the DOM I created (e.g., if I appendChild a text node I would expect a text node to be appended, I wouldn't expect, ever, to get a single text node if there was already a text node as the last child).
>
> That wasn't quite the case I was asking about. Concretely, I was asking about the following (illustrated here as document.write, but I'm also asking about the case where the document.write boundaries are network buffer boundaries instead):
> document.write("<div id=thediv>");
> document.getElementById('thediv').appendChild(document.createTextNode("foo"));
> document.write("bar");
>
> One text node with data "foobar" or two text nodes: "foo" followed by "bar"? Does it matter?
I would expect that to create a single text node. Intuitively, I would
expect the parser to append to the last child if it is a text node. If
you reversed the last two lines I'd expect to get two.
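A rough sketch of that rule, as a toy model rather than any browser's
implementation. The names parserInsertText, scriptAppendText and
textData are made up for illustration, and an "element" here is just
an object with a children array:

```javascript
// Toy DOM: an element is { children: [] }, a text node is { type: "text", data }.
// All function names are hypothetical; none come from a spec or a browser.

// Parser-side insertion: extend the last child if it is a text node,
// no matter who created it; otherwise create a new text node.
function parserInsertText(parent, chars) {
  const last = parent.children[parent.children.length - 1];
  if (last && last.type === "text") {
    last.data += chars;
  } else {
    parent.children.push({ type: "text", data: chars });
  }
}

// Script-side insertion, modelling appendChild(createTextNode(data)):
// always appends a new node and never merges.
function scriptAppendText(parent, data) {
  parent.children.push({ type: "text", data });
}

// Helper: list the data of each child text node in order.
function textData(parent) {
  return parent.children.map((node) => node.data);
}

// Henri's example: the script appends "foo", then the parser sees "bar".
const scriptFirst = { children: [] };
scriptAppendText(scriptFirst, "foo");
parserInsertText(scriptFirst, "bar");

// Last two lines reversed: the parser sees "bar", then the script appends "foo".
const parserFirst = { children: [] };
parserInsertText(parserFirst, "bar");
scriptAppendText(parserFirst, "foo");

console.log(textData(scriptFirst)); // one node: "foobar"
console.log(textData(parserFirst)); // two nodes: "bar", "foo"
```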
> document.write("<div id=thediv>");
> document.write("foo");
> document.write("bar");
>
> One text node with data "foobar" or two text nodes: "foo" followed by "bar"? Does it matter?
One text node. I think making document.write behave differently than
just inserting characters at the current position of the input stream is
a bad idea, _especially_ if you treat network buffer boundaries the same
way (as they /surely/ should have no effect on parsing).
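To illustrate why chunk boundaries need not matter: in a toy model
where the parser always extends a trailing text node, the same
characters give the same result however they are split up. The names
insertChunk and parseChunks are made up, and each chunk stands in for
either a document.write call or a network buffer:

```javascript
// Toy model only: text insertion with coalescing, fed by arbitrary chunks.
// insertChunk and parseChunks are hypothetical names for illustration.

function insertChunk(children, chunk) {
  if (chunk === "") return; // an empty chunk inserts nothing
  const last = children[children.length - 1];
  if (last && last.type === "text") {
    last.data += chunk; // extend the existing text node
  } else {
    children.push({ type: "text", data: chunk });
  }
}

// Feed every chunk in order and return the data of the resulting text nodes.
function parseChunks(chunks) {
  const children = [];
  for (const chunk of chunks) {
    insertChunk(children, chunk);
  }
  return children.map((node) => node.data);
}

const whole = parseChunks(["foobar"]);            // one write / one buffer
const twoWrites = parseChunks(["foo", "bar"]);    // Henri's two writes
const oddSplit = parseChunks(["f", "oob", "ar"]); // arbitrary buffer boundaries

// All three contain a single text node with data "foobar".
console.log(whole, twoWrites, oddSplit);
```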
As for the "does it matter?" question in both of these examples, I
think it comes down to whether we want to make another thing
non-deterministic in the parser (the only thing currently is aborting
parsing on a parse error).
Another interesting example:
document.write("<div id=thediv>1");
document.getElementById('thediv').appendChild(document.createTextNode("2"));
document.write("3");
Current behaviour here is interesting:
Firefox gives two text nodes, the first containing "13" and the second
containing "2" (this seems counter-intuitive, so I hope we can all agree
that this shouldn't be done).
WebKit gives two text nodes also, the first containing "2" and the
second containing "13" (i.e., the same as Firefox but in the opposite
order; this too seems counter-intuitive).
Opera gives three text nodes: "1", "2", and "3". This is the only
behaviour of the three browsers that seems at all sensible.
I think the behaviour of Firefox and WebKit in this case shows the
danger of not coalescing text nodes in the document.write and/or
scripting case: you can very easily end up with quite weird behaviour. I
think the easiest way to try and avoid implementations introducing such
bugs is to just require coalescing in all cases from the parser.
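For comparison, here is what a toy model gives for the 1/2/3 example,
assuming "coalescing from the parser" means the parser extends the
last child whenever it is a text node, even a script-created one. The
function names are made up, and the result is a consequence of that
assumption only, not the behaviour of any of the three browsers above:

```javascript
// Toy model only; parserInsertText and scriptAppendText are hypothetical names.

// Parser: extend the last child if it is a text node, else create one.
function parserInsertText(parent, chars) {
  const last = parent.children[parent.children.length - 1];
  if (last && last.type === "text") {
    last.data += chars;
  } else {
    parent.children.push({ type: "text", data: chars });
  }
}

// Script: appendChild(createTextNode(data)) always adds a separate node.
function scriptAppendText(parent, data) {
  parent.children.push({ type: "text", data });
}

const thediv = { children: [] };
parserInsertText(thediv, "1"); // document.write("<div id=thediv>1")
scriptAppendText(thediv, "2"); // appendChild(createTextNode("2"))
parserInsertText(thediv, "3"); // document.write("3")

const result = thediv.children.map((node) => node.data);
console.log(result); // two nodes: "1", "23"
```

The order of the characters is preserved, which is not the case for
the "13"/"2" and "2"/"13" results above.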
> In foster-parenting cases, that's not enough. Consider: <table><tr>f<td>c</td>f
>
> Here, when the second 'f' is foster-parented, the cell content 'c' is the text node the parser inserted last. Now, if foster-parenting examines the DOM to see if the foster parent already has a text node previous sibling (in order to merely extend that text node), the previous sibling could be script-created.
I don't think that's a real problem (that the previous-sibling could be
script created).
> Does specifying whether foster-parented text coalesces or not really matter for interop? (I believe coalescing all non-foster-parented parser-inserted text does matter for interop.)
For interop? Yes. For web compat? No. It seems silly to leave this as
the one implementation-specific detail (leaving error handling aside
for now).
> Is it really bad for the parser to extend script-created text nodes?
No. I think, as I said above, that always coalescing is a good idea to
avoid weird bugs. I also think that even coalescing in such cases
shouldn't be gratuitously expensive overall.
--
Geoffrey Sneddon — Opera Software
<http://gsnedders.com/>
<http://www.opera.com/>
Received on Wednesday, 18 November 2009 16:42:20 UTC