Re: Implementing NodeList from Stephen R. Savitzky on 1998-07-28 (www-dom@w3.org from July to September 1998)

From: Stephen R. Savitzky <steve@crc.ricoh.com>
Date: 27 Jul 1998 17:52:10 -0700
To: Mike Champion <mcc@arbortext.com>
Cc: keshlam@us.ibm.com, www-dom@w3.org
Message-ID: <qcn29uajb9.fsf@gelion.crc.ricoh.com>

Mike Champion <mcc@arbortext.com> writes:

> >This means that every call on item() involves a complete traversal of half
> >the tree (on average) to make sure that a new node of the type selected by
> >getElementsByTagName hasn't snuck in behind your back.  In the case of large
> >documents, possibly containing references to external parsed entities, this
> >could take a *very long time*.
> 
> Not in practice; I assure you that I screamed and yelled about this when
> iterator were removed from the spec in favor of NodeLists.  I got a fair
> amount of free advice to the effect that IN GENERAL people will traverse
> node lists in order, so implementation tricks involving cacheing the
> current location and validating that nothing has changed before accessing
> the "next" index in the NodeList can mean that real performance is much
> better than the worst case.

I think that ``in practice'' depends on who is practicing what; in my
opinion the advice you got was worth exactly what you paid for it.  In a
multithreaded environment (which we have in our application) with very large
documents (such as a database or document store) this kind of implementation
trick is unacceptable.  Using ``live'' NodeLists is simply impossible in most
cases -- and in fact the spec does not appear to require it in any case
except for the value of childNodes, where it makes sense anyway and isn't
particularly expensive. 

I'll stand by my original statement.  For example, consider a document
derived in the obvious fashion from a table in a relational database.  A
node with several million children is not unreasonable to expect.  If the
document is ``live'', the implementor might have to query the database and
obtain the entire table at each iteration.

It might not be obvious now, but people are going to be using the DOM for a
lot of things besides hacking flashy effects into a document in Netscape.

> Nevertheless, the real answer -- as I think you said last time -- is to put
> interators back in the spec for Level 2.

You're right there, but even iterators don't help if they are required to be
``live'' and constantly reflecting the current structure of the tree.
Sometimes that interpretation is the correct one, sometimes it isn't, and
either the determination should be left to the implementor, or two different
subclasses of iterators should be specified.

-- 
 Stephen R. Savitzky   Chief Software Scientist, Ricoh Silicon Valley, Inc., 
<steve@rsv.ricoh.com>                            California Research Center
 voice: 650.496.5710   fax: 650.854.8740    URL: http://rsv.ricoh.com/~steve/
  home: <steve@starport.com> URL: http://www.starport.com/people/steve/

Received on Monday, 27 July 1998 20:47:44 UTC