Re: Subclassing DOM nodes from Joseph Kesselman/Watson/IBM on 2000-10-20 (www-dom@w3.org from October to December 2000)

From: Joseph Kesselman/Watson/IBM <keshlam@us.ibm.com>
Date: Fri, 20 Oct 2000 11:00:32 -0400
To: "Kevin J" <kevinjz@hotmail.com>
Cc: www-dom@w3.org
Message-ID: <OFE4501592.0F8A8470-ON8525697E.004FBA14@pok.ibm.com>

>Isn't a DOM node object in a tree unique?

It's easy to confuse the DOM -- which is only an API -- with the data
structures behind it, which may not be object-oriented at all. Many of the
currently available DOM implementations do store the data directly in their
Node objects, so folks tend to assume that others will also do so. That is
not necessarily the case.

A _node_ is unique, because it's defined as representing a specific
instance of a specific fragment of the XML document being manipulated. But
there is no promise in the DOM spec that there is a one-to-one mapping
between nodes and DOM Objects.

For example, there may be nodes in the document for which no
programing-language Object yet exists, if the DOM is being clever and
creating Objects only when they're actually needed. (For example,
EntityReference nodes in some DOMs don't actually copy their children from
the Entity definition until the first time the user tries to look inside
the reference.)

Conversely, in some DOM implementations there may be multiple Objects which
represent the same Node. As long as they behave properly -- altering the
node via one of them changes the node as seen by all the others  -- and as
long as there is _SOME_ way to ask "is this the same node as that one",
that can be a fully compliant implementation of the DOM.

We should have standardized how one asks that last question. We simply
forgot to do so. In most (but not all) DOM implementations, it happens to
be true that the language's == operator can be used for that purpose, but
the DOM does not promise this. This oversight will be corrected in DOM
Level 3, when we introduce isSameNode() and (possibly, if we can agree upon
a definition we like) getNodeKey().


(If you're a low-level-code hacker -- C or Assembler -- you might want to
think about this as a double-indirection problem.  In some DOMs, nodes are
really pointers to data stored elsewhere. Comparing the identity of the
pointers will not reliably tell you whether or not they point to the same
thing; you have to ask them what they're pointing to and compare that.
getNodeKey() asks that question; isSameNode() combines asking that question
and comparing the results.)


If it still doesn't make sense... maybe someone else should attempt an
explanation. I'm running out of different ways to describe the problem.


But this issue really  _does_ affect some other DOMs. Even if your
particular DOM allows you to use == and get the right result, isSameNode()
will be preferred for portabilty reasons.

______________________________________
Joe Kesselman  / IBM Research

Received on Friday, 20 October 2000 11:01:23 UTC