Re: Walking the DOM (was: XML APIs)

John Cowan <cowan@locke.ccil.org> writes:

> Actually I *was* thinking of invalidating all iterators, on the assumption
> that the number of iterators is typically small, even if we can't
> (due to garbage-collection considerations) determine its actual value.

The limiting case is when you are using an iterator to traverse a tree
making some structural modification.  Again, this reduces to an O(log N)
test at each iteration, as the iterator verifies that the part of the tree
it is in has not been changed.  (By `invalidate' I assume you mean
`discovering that a test needs to be performed to determine whether an
exception needs to be thrown'.  If you mean that iterators throw an
exception whenever a change is made to the tree, that implies that iterators
cannot be used for performing structural modifications AT ALL.  All this
does is guarantee that anyone who _does_ want to iterate through a tree and
delete all the tables (for example) will have to implement their own
iterators.)

> For Level 1 purposes, a NodeList doesn't actually get invalidated
> from the user's viewpoint; it just has to rescan the tree from the origin.

That's what I meant by `invalidated' -- its old cache has become invalid and
has to be refreshed.

> > Determining the desired behavior when a subtree that contains an iterator is
> > _moved_ is even more difficult.  One can make a good case either way.
> 
> I think it's simplest to treat moves as remove + insert.

That doesn't say what is supposed to happen to an iterator that was pointing
at a node in the subtree that got moved.  Should it:

 o throw an exception? (One way to do this is to treat moves as remove +
   copy instead of remove + insert.  Another is for the iterator to include
   a snapshot (stack) of the current node's ancestry.  The third, of course,
   is to say that iterators simply can't operate on a tree that is being
   modified, but I believe this to be undesirable.)

or

 o continue blythely on from their new position.

There is a common presumption on the part of most programmers that obtaining
the next item from an array or iterator is O(1).  One of the reasons for
having iterators, in fact, is to make this feasible in cases where the
interface doesn't give enough information for an O(1) iteration step.

Nodelists already violate this presumption, in spades.  I would hate to see
iterators violate it as well; that's just inviting clever programmers to
make their own private iterators that _do_ operate efficiently, while
forcing the less wary to suffer the consequences. 

I think it is possible to accomodate both kinds of programmers by explicitly
specifying all three kinds of iterators: efficient and straightforward,
careful (robust against deletion), and paranoid (throw exception on any
modification).  If a choice has to be made, though, I would opt for
efficient and straightforward on the grounds that those are the ones that
are going to be _used_ anyway, whether or not the DOM specifies them.

-- 
 Stephen R. Savitzky   Chief Software Scientist, Ricoh Silicon Valley, Inc., 
<steve@rsv.ricoh.com>                            California Research Center
 voice: 650.496.5710   fax: 650.854.8740    URL: http://rsv.ricoh.com/~steve/
  home: <steve@starport.com> URL: http://www.starport.com/people/steve/

Received on Friday, 6 November 1998 16:13:32 UTC