
Re: Tree construction: Coalescing text nodes

From: Henri Sivonen <hsivonen@iki.fi>
Date: Wed, 18 Nov 2009 11:11:00 +0200
Cc: Geoffrey Sneddon <gsneddon@opera.com>, Ian Hickson <ian@hixie.ch>, public-html@w3.org, pjt47@cam.ac.uk
Message-Id: <382F348D-02E0-4C91-BA73-33B69C974EE8@iki.fi>
To: Jonas Sicking <jonas@sicking.cc>
On Nov 18, 2009, at 04:16, Jonas Sicking wrote:

> On Tue, Nov 17, 2009 at 8:20 AM, Henri Sivonen <hsivonen@iki.fi> wrote:
>> I just realized I had written a bug that relates to coalescing foster-parented text.
>> Consider: document.write("<table>  ");
>> By the time document.write() returns, it's impossible to know whether non-space characters will follow the spaces in the same text node (in which case the spaces get foster-parented) or whether none will follow (in which case the spaces shouldn't get foster-parented).
>> I suggest not flushing the spaces when the document.write() returns and only flushing them lazily--that is, making document.write() flush trailing text only if the element on the stack isn't foster-parenting.
> It seems more consistent to always flush. The only downside is that
> some whitespace might get "misplaced" into a table, but this does not
> seem like a big deal. And not flushing non-whitespace characters in
> the case of  document.write("<table>foo") seems very unexpected.

You are right. It's better to always flush (both when document.write returns and when the parser moves over the boundary of document.written data back into network-originated data). (After I posted, I realized that not flushing would complicate the move back to network-originated data.)

After all, as far as foster-parenting goes, flushing eagerly is equivalent to having a comment at the end of each document.written buffer (except, of course, there's no comment node to be seen in the DOM).
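
To make the difference concrete, here is a minimal sketch of the two flushing policies as a toy model, not a real tree builder. The names ToyTableBuilder, write, flush and demo are hypothetical and come from neither the spec nor any implementation. The model assumes the parser is already inside <table> and that each write() call carries only text.

```javascript
// Toy model only: every name here is hypothetical, not taken from the spec.
const isAllWhitespace = (s) => /^[ \t\n\f\r]*$/.test(s);

class ToyTableBuilder {
  constructor() {
    this.table = { tag: "table", children: [] };
    this.body = [this.table]; // children of <body>, in document order
    this.pending = "";        // buffered character tokens seen inside <table>
  }

  // Models one document.write() call whose argument is text only.
  write(chars, flushOnReturn) {
    this.pending += chars;
    if (flushOnReturn) this.flush();
  }

  flush() {
    if (this.pending === "") return;
    if (isAllWhitespace(this.pending)) {
      // Whitespace-only text stays inside the table.
      this.table.children.push(this.pending);
    } else {
      // Anything else is foster-parented: inserted just before the table.
      this.body.splice(this.body.indexOf(this.table), 0, this.pending);
    }
    this.pending = "";
  }
}

// Simulates document.write("<table>  ") followed by document.write("foo").
function demo(flushOnReturn) {
  const b = new ToyTableBuilder();
  b.write("  ", flushOnReturn);
  b.write("foo", flushOnReturn);
  b.flush(); // end of input always flushes
  return {
    beforeTable: b.body.filter((n) => typeof n === "string"),
    inTable: b.table.children,
  };
}
```

In this model, demo(true) (always flush) leaves the spaces inside the table and foster-parents only "foo", which is the "misplaced" whitespace Jonas mentions. demo(false) (lazy) foster-parents "  foo" as a single text node.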

Currently, the spec doesn't have the concept of the parser being aware when it crosses from document.written data back to network-originated data. Perhaps it should have this concept. It would also be useful for defining that meta charset is ignored if its '>' didn't originate from the network.
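
As a rough illustration of what such a concept could look like (purely an assumption, since the spec has nothing like it today), each input character could carry its origin. OriginTrackingInput, append and metaCharsetAllowed are made-up names.

```javascript
// Hypothetical model: the spec currently has no notion of character origin.
class OriginTrackingInput {
  constructor() {
    this.chars = []; // each entry: { ch, origin }
  }

  // origin is either "network" or "script" (i.e. document.written).
  append(text, origin) {
    for (const ch of text) this.chars.push({ ch, origin });
  }

  // True only if the first '>' in the input came from the network.
  // A real parser would check the '>' that closes the meta tag itself.
  metaCharsetAllowed() {
    const close = this.chars.find((c) => c.ch === ">");
    return close !== undefined && close.origin === "network";
  }
}
```

With this sketch, a meta tag whose start came from the network but whose '>' was supplied by document.write() would be ignored for encoding purposes.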

Henri Sivonen
Received on Wednesday, 18 November 2009 09:11:36 UTC
