- From: Ian Hickson <ian@hixie.ch>
- Date: Wed, 17 Mar 2010 23:22:45 +0000 (UTC)
- To: Anton Muhin <antonm@chromium.org>, Boris Zbarsky <bzbarsky@MIT.EDU>, Maciej Stachowiak <mjs@apple.com>
- Cc: public-webapps@w3.org
- Message-ID: <Pine.LNX.4.64.1003172314250.13402@ps20323.dreamhostps.com>
On Sat, 13 Feb 2010, Boris Zbarsky wrote:
> On 2/13/10 6:18 AM, Ian Hickson wrote:
> > I'm concerned about the GC-sensitivity of such behaviour (we might end
> > up snookering ourselves in a situation where specific GC behaviour
> > actually matters for compatibility).
>
> We haven't gotten there yet, in 8 years of two different implementations
> having this behavior....
>
> > How about the following compromise: these methods return a new object
> > each time, except if they are called with the same argument as the
> > previous invocation of the method? i.e. cache the last returned object
> > only. Would that be acceptable?
>
> As far as I can tell, no. That formulation makes it impossible to drop
> the cache as needed, effectively leading to a memory leak if no one
> calls these methods for a while.
>
> > Alternatively, if we need to cache more than that, how about blowing
> > away the cache with each spin of the event loop, so that anything in a
> > tight loop is cached (and _not_ subject to GC — this could be a
> > problem if the script calls one of these methods with 10000 different
> > arguments and sets properties on each one)
>
> Indeed. Especially because these objects can end up taking up a fair
> amount of memory...
>
> > but not beyond one task? (i.e. don't share objects in calls across
> > setTimeout)
>
> Those two are not quite the same condition. Implementing the latter in
> Gecko would not be too bad. Implementing the former, assuming I
> understand what you mean correctly, would require totally rewriting the
> Gecko event loop.
On Sat, 13 Feb 2010, Maciej Stachowiak wrote:
> On Feb 13, 2010, at 3:18 AM, Ian Hickson wrote:
> >
> > I'm concerned about the GC-sensitivity of such behaviour (we might end
> > up snookering ourselves in a situation where specific GC behaviour
> > actually matters for compatibility).
>
> It's not GC that matters but the degree of caching (e.g. whether cache
> items are ever cleared for reasons other than GC). It's true that this
> is theoretically a hazard, but the only observable effect would be
> whether custom properties set on one NodeList appear on one retrieved
> later. Since it's very uncommon (and indeed unlikely) for authors to set
> custom properties on NodeLists, I think this benefit is purely
> theoretical, not real.
>
>
> > How about the following compromise: these methods return a new object
> > each time, except if they are called with the same argument as the
> > previous invocation of the method? i.e. cache the last returned object
> > only. Would that be acceptable? It gives you a performance win in the
> > case where the author spins a loop using the same call over and over,
> > and is completely predictable.
>
> It's only predictable if that last object is kept alive, even if it were
> otherwise a candidate for garbage collection. Are you suggesting to do
> that? I assume so, because that's the only way it would be "completely
> predictable". If so, then I would object, because it could lead to a
> large long-term memory cost (fully traversing a large NodeList in a loop
> would leave you paying the cost of that memory until you leave the page
> or the author fetches a different NodeList). Imagine the last NodeList
> you accessed was the result of getElementsByTagName("*") and the author
> fully traversed it. Now you've likely pinned memory proportional to the
> size of the DOM.
>
> Even without the memory issue, I would not favor this design, because it
> makes performance fall off a cliff if you use more than one NodeList.
> Changing your loop from fetching one NodeList to two could suddenly make
> it 50x slower. We do not like coding performance hazards like this into
> our implementation.
>
>
> > Alternatively, if we need to cache more than that, how about blowing
> > away the cache with each spin of the event loop, so that anything in a
> > tight loop is cached (and _not_ subject to GC — this could be a
> > problem if the script calls one of these methods with 10000 different
> > arguments and sets properties on each one), but not beyond one task?
> > (i.e. don't share objects in calls across setTimeout)
>
> Pinning a potentially unbounded number of NodeLists in memory would
> definitely be unacceptable from both speed and memory perspectives.
> Especially on mobile devices.
>
>
> I note that if all you care about is ensuring that behavior is
> deterministic, the simplest solution would be to make NodeList objects
> disallow setting of custom properties. Then there is no way to observe
> the side effects of GC behavior. This would be simpler to implement than
> either of your proposed rules, and would not create speed or memory
> hazards. I do not know if we could justify such a change as a mere
> erratum to DOM Level 3 Core, but the same goes for both your proposed
> policies.
>
>
> > For now for the objects in HTML5 I've gone with the first of these
> > suggested compromises.
>
> I don't think we'd be willing to implement that in WebKit. We're more
> likely to copy existing Firefox and IE behavior.
On Sat, 13 Feb 2010, Anton Muhin wrote:
>
> For me performance-wise both approaches seem fine, but to get numbers I
> need to run an experiment.
>
> My main concern would be that rules are overcomplicated imho. And I
> won't be surprised if IE and FF would just ignore them. But you
> understand all those matters better than I do.
>
> BTW, as the proposed change would reduce stress on GC, I don't quite see
> how this could reduce compatibility (but life is often more complicated
> than I can imagine.)
I've changed the HTML5 spec to allow arbitrary caching for the various
objects in HTML5 instead.
--
Ian Hickson U+1047E )\._.,--....,'``. fL
http://ln.hixie.ch/ U+263A /, _.. \ _\ ;`._ ,.
Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
Received on Wednesday, 17 March 2010 23:23:14 UTC