Re: Traverse and Node types

    Hello,

    Currently, in the way the DOM, as well as TreeWalker and NodeIterator's
are concerned, only nodes directly in the hierarchy of a DOM tree are to be
returned. As you specified, there are three node types that do not live within
the context of the DOM tree's hierarchical structure  (Attributes, Notations,
and Entities).
    I'm not ceretian what benefit there could possibly be in having a
TreeWalker or NodeIterator returning to you those node types. In actuality, it
shouldn't give you those nodes at all, and I'll explain to you why.
    As the hierarchy of the DOM tree is created, the parent/child/sibling
relationships are established between Element/Text/CData/etc. nodes. The
Attribute nodes simply esist in "Attribute space" where no ordering (or any
semantics at all) are defined. When an Element node is created, and it
contains Attributes, a simple relationship is established between those nodes,
but there still is no notion of Hierarchy.
    Thus, when using a NodeWalker or a NodeIterator, every time you get a Node
who's type is ELEMENT_NODE, you should call getAttributes(). That should be
the only way Attributes are accessed.
    As for Entity and Notation nodes, those are simply there for reference if
you come upon entity attributes, and need to resolve an unparsed external
entity. They need not be in order, because the DOM does not define the Node
types for everything else that can be defined in a DTD, thus there is no way
you can preserve the syntax and semantics of what was declared in that DTD.
    As an observation, I can say that (In the XML parsers that I have used)
Attributes within a NodeList are in the same order that they are defined
within an instance. Also, fixed and defaulted attributes are put in the place
where they would be defined in the instance.

    (eg)
    <!ELEMENT a EMPTY>
    <!ATTLIST a
        foo         CDATA    #FIXED "bar"
        foo2       CDATA    #REQUIRED
        foo3       (bar | baz)  'baz'>

    Here's an instance:
        <a foo2="text"/>

With in the Attribute NodeList defined for ths element, it would (at least in
most instances) be ordered like this:
    (foo), (foo2), (foo3)

Hope that helps.

Eric Lawson
Isogen International

Martijn Pieters wrote:

> Hi,
>
> I am trying to find out what Node types are actually iterated over when
> using a NodeIterator or TreeWalker on a DOM tree.
>
> The NodeFilter Interface lists all possible Node types to filter out with
> the whatToShow attribute of the TreeWalker and NodeIterator Interfaces.
> >From this (and other references), one can extrapolate that both these
> Interfaces will normally visit *all* possible Node types in a DOM tree.
>
> However, a naive implementation of TreeWalker or NodeIterator will use
> only the .childNodes, .firstChild, .lastChild, .nextSibling,
> .previousSibling and .parentNode attributes of the Node interface. Using
> these methods, no Attr, Entity or Notation Nodes will ever be encountered,
> nor will any of the Nodes making up the sub-trees of Attr and Entity
> Nodes.
>
> If however an implementation wants to include these 'lost' Node types by
> including DocumentType.entities and .notations, and Node.attributes,
> another problem arises; that of ordering. The Traversal interface
> specifies that the .nextNode() and .previousNode() methods iterate over
> Nodes in document order.
>
> But what is document order for the .attributes Nodes of an Element? The
> NamedNodeMap doesn't preserve ordering. And DocumentType.entities and
> .notations are not only NamedNodeMaps, any document ordering as these
> Nodes are not modeling the actual definitions in the DTD. Should we first
> iterate .entities or .notations?
>
> For .attributes and .childNodes it is clearer, attributes come before
> child Nodes in document order, so TreeWalker.firstChild will then take an
> Attribute Node first if there are any available.
>
> Can anyone shed some light on this issue? Should Attr, Entity and Notation
> Nodes be included in DOM tree traversals? And if so, how are .attributes,
> .entities, .notations and .childNodes collections ordered?
>
> Whatever the answer, I think the Traversal specification could do with an
> explicit explanation on this issue.
>
> --
> Martijn Pieters
> | Software Engineer  mailto:mj@digicool.com
> | Digital Creations  http://www.digicool.com/
> | Creators of Zope   http://www.zope.org/
> ---------------------------------------------

Received on Saturday, 24 February 2001 10:26:40 UTC