- From: Stephen R. Savitzky <steve@rsv.ricoh.com>
- Date: 23 Sep 1999 14:46:36 -0700
- To: www-dom@w3.org
Congratulations! After seeing the previous version I was very worried about NodeIterator and TreeWalker; the present implementation definitely does the right thing, defining them exactly the way an implementor would expect. Now for the criticisms: 1. The ownerElement method of Attr conflicts with the statement that "attributes are not part of the document tree". Unless I'm missing something, ownerElement is essentially the same as parentNode, and means that Attr's cannot be shared. 2. There doesn't seem to be any way to create an Iterator from a NodeList or NamedNodeMap. In this regard it is also unfortunate that NamedNodeMap is not an extension of NodeList. 3. Ideally, all methods that return NodeList should be deprecated and replaced by methods that return an equivalent Iterator. 4. A Document should be able to import a TreeWalker. 5. The semantics of nextSibling on TreeWalker are unclear if the next sibling exists but does not match the filter criteria. 6. There still doesn't seem to be any way to represent all of the information in a DTD. Among other things, shouldn't there be a "notation" attribute of "Attr"? I believe that not including all of the DTD information violates the claim that "... For XML, this is specified by the W3C XML Information Set [Infoset]. The DOM is simply an API to this information set", since the XML Information Set includes the complete DTD. And finally, one comment: 7. Consider the consequences if Iterator and TreeWalker added the following methods derived from attributes of the reference Node: unsigned short getNodeType() DOMString getNodeName() DOMString getNodeValue() NamedNodeMap getAttributes() NodeIterator getAttributeList() DOMString getNamespaceURI() DOMString getPrefix() DOMString getLocalName() It would then be possible to traverse and process a document of arbitrary size without ever instantiating the whole tree, and indeed without creating any Node objects at all! (Of course, calling getNode() on such an iterator would force creation of a subtree, at least.) This technique also neatly sidesteps all issues of structure sharing, liveness, lazy evaluation, and so on. In effect, the entire document tree is lazily traversed. It also has significant advantages in a system with limited memory (e.g. an embedded application), or in which large documents are accessed in document order over a network. My current document-processing application is based on a somewhat more restricted version of this kind of extended TreeWalker: it can only go forward (i.e. it lacks lastChild, lastSibling, previousSibling, and previousNode), and getNode is guaranteed only to return the equivalent of referenceNode.cloneNode(false). For example, my parser uses this interface. To avoid proliferating interfaces, it would be sufficient to extend the semantics of the backwards-going methods on TreeWalker to allow them to return null if the previous node was simply inaccessible rather than nonexistant. My application also includes a ``virtual tree constructor'' interface that allows arbitrary-sized documents to be generated. in document order, again without necessarily instantiating any nodes -- it simply has all of the node-construction methods of Document, plus methods to start and end the children of the node under construction. This interface is used both for tree construction and for output. Note that these two interfaces between them have exactly the same expressive power as the entire DOM (i.e. they can represent the same document information), but can be considerably simpler to implement. They can also be used in applications (limited memory, large documents, databases, streams (e.g. log files), networks, etc., for which the DOM is unsuited. Obviously I don't expect these interfaces to show up in DOM Level 2 (though I wouldn't object if they did!), or indeed in _any_ future level; they merely represent an alternative view of documents that works better for certain kinds of application. For the curious, the entire application mentioned above is available as open source at <http://RiSource.org/PIA/>. -- Stephen R. Savitzky <steve@rsv.ricoh.com> <http://rsv.ricoh.com/~steve/> Quote of the month: Death is nature's way of telling you to slow down. Chief Software Scientist, Ricoh Silicon Valley, Inc. Calif. Research Center voice: 650.496.5710 front desk: 650.496.5700 fax: 650.854.8740 home: <steve@theStarport.org> URL: http://theStarport.org/people/steve/
Received on Thursday, 23 September 1999 17:47:14 UTC