Re: An observation about "live" NodeLists from Ray Whitmer on 1998-10-19 (www-dom@w3.org from October to December 1998)

From: Ray Whitmer <ray@imall.com>
Date: Mon, 19 Oct 1998 10:45:36 -0600
To: www-dom@w3.org
Message-ID: <362B6CB0.4853D76@imall.com>
keshlam@us.ibm.com wrote:

> >Unfortunately code like this might well perform *very*
> >badly for some implementations.

I agree.

> Unfortunately, _any_ deletion is going to have severe performance impacts
> on some implementations of NodeList. Only the cleverest implementations are
> going to even attempt to keep track of precisely which subtrees have
> changed and minimize recalculation... and the work involved in tracking
> that is going to impact performance and/or memory requirements elsewhere in
> the implementation.

I don't disagree that some implementations will be quite bad.  I disagree about
the univeral claims in how bad the implementation has to be, because it is easy
to create a good implementation.

Any implementation which:

1.  Caches seen items in the NodeList in something indexable like an array.
2.  Has a simple hierarchical comparison function that given two nodes can
check their traversal order.

can do a simple binary search on the cached items of the nodelist and remove
any nodes that fall in the range being removed.

With most usage patterns, this will generally be of negligible performance
impact, and no extra memory at all.

On the other hand, any implementation which:

1.  Caches a single node with an index.
2.  Has a simple hierarchical comparison function that given two nodes can
check their traversal order.

can quickly identify removed nodes which require the index of the cached node
to be modified.  This is slightly less efficient, but again probably not
noticably so in most usage patterns, and still involves no extra memory
overhead at all.

Even cases where the current or single node is removed, it is not difficult to
store a nearby neighbor as the place where searching is to resume for the
removed index.  Thus the remove(list.item(0)) loop can be quite efficient.

In every case I can imagine, this performance will be better than computing a
static list.  And, the fix-up issues are not much greater than for an
iterator.  You would have to check the removed hierarchy range against every
active query iterator, just as you would against every active query nodelist,
and the checks are not that different.

The major performance advantage an iterator would present would be not allowing
direct random access -- matching a lowest common denominator implementation
model a little bit better so users wouldn't try to use it in random-access ways
that would be inefficient in implementations that do not cache everything they
have seen.

But the examples we have been discussing exercize the implementations mostly
serially...  The significant advantage of iterators is that they make it easier
for a user to hold a position within a mutating model.

Ray Whitmer
ray@imall.com
Received on Monday, 19 October 1998 12:47:11 UTC