Re: Comments on first read of DOM from Mike Champion on 1997-10-17 (www-dom@w3.org from October to December 1997)

From: Mike Champion <mcc@arbortext.com>
Date: Fri, 17 Oct 1997 14:03:26 -0400
To: Ray Whitmer <raywhitmer@itsnet.com>, www-dom@w3.org
Message-Id: <97Oct17.140221edt.18820@thicket.arbortext.com>
At 12:10 AM 10/17/97 -0400, Ray Whitmer wrote:

>I have had these comments reviewed by others knowledgeable in the
>DOM-like architectures, but they could still easily contain silly
>misunderstandings, but here goes:

Don't sell yourself short! If a person with a significant amount of Java
and object model experience misunderstands the draft, it is our problem and
needs to be corrected. There's no such thing as a "silly" misunderstanding
in a standards document, IMHO.


>
>1.  The object descriptions describe a Node method called
>"getFirstChild", which is missing from the IDL and Java specifications.
>There should also probably be a "getLastChild".  In most cases for
>general traversal, these make it easier to traverse sets of siblings
>from either direction than getting a child list and then getting
>children out of the child list.
>

We try to tread a fine line between putting in all the necessary
functionality and not bloating the spec with "convenience functions" unless
there is a very strong case for assuming that those convenience functions
will be used VERY frequently or that a "native" implementation would be
much faster than implementing them on top of the rest of the DOM.  We also
force ourselves to consider whether a reasonable implementation of a
function would scale well to a document with (literally!) millions of
nodes. getLastChild would be a convenient function to have, but it COULD be
implemented by calling getFirst/getNextChild until you get to the end of
the sibling list.  I'm not convinced that there is all that much demand
for it, and running it on a huge document would probably be very slow (even
in a "native" implementation).  So, this is something I'll make sure that
we consider, but I personally wouldn't push very hard for it at this point.
Can you "sell" the idea to us more compellingly?
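
To make the cost argument concrete, here is a minimal Java sketch of a
getLastChild built on top of first-child/next-sibling traversal.  The
SketchNode and LastChildSketch classes are hypothetical stand-ins, not
the draft's interfaces, and the method names are borrowed from the
comments above.  The point is only that the walk is linear in the
number of siblings.

```java
// Hypothetical minimal node type for illustration; not the draft's Node interface.
final class SketchNode {
    final String name;
    private SketchNode firstChild;
    private SketchNode nextSibling;

    SketchNode(String name) { this.name = name; }

    SketchNode getFirstChild() { return firstChild; }
    SketchNode getNextSibling() { return nextSibling; }

    // Appends a child by walking to the end of the sibling chain.
    void appendChild(SketchNode child) {
        if (firstChild == null) {
            firstChild = child;
            return;
        }
        SketchNode cur = firstChild;
        while (cur.nextSibling != null) {
            cur = cur.nextSibling;
        }
        cur.nextSibling = child;
    }
}

final class LastChildSketch {
    // Returns the last child, or null if there are none.
    // Cost is O(n) in the number of children: every sibling is visited once.
    static SketchNode getLastChild(SketchNode parent) {
        SketchNode cur = parent.getFirstChild();
        if (cur == null) {
            return null;
        }
        while (cur.getNextSibling() != null) {
            cur = cur.getNextSibling();
        }
        return cur;
    }
}
```

An implementation that also stored a last-child pointer could answer in
constant time, at the price of one more field per node.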


>2.  It is very convenient to be able to process children without keeping
>track of indices (which could even be hard to predict in a large
>changing set of siblings).  getFirstChild, getLastChild, getNextSibling,
>getPreviousSibling, getParentNode, all make this possible.  But
>unfortunately, the insertion, removal, or replacing of siblings or
>children suddenly forces the inserter or remover to be aware of the
>current index number, which may be unknown.  At the very least, a
>method getSiblingIndex() would make this process much easier.  Even
>better might be a remove/replace call that accepted just the node to
>be removed/replaced instead of its index, and an insertBefore and an
>insertAfter call that allowed you to insert before or after an arbitrary
>node.  I believe this is easily implementable even if the node objects
>are aliases (see 3. below).
>

I agree with most of what you've said.  In the next draft, we'll probably
use "reference nodes" rather than indices in the "Child" functions on nodes
to better abstract the underlying document implementation, which may be
array-based (as implied by the previous API spec), or linked list-based,
which is what most systems that handle huge documents use.
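
As an illustration of what "reference nodes" could look like over a
linked-list representation, here is a small Java sketch.  RefNode and
its fields are hypothetical, and the insertBefore/removeChild signatures
are assumptions for the example rather than anything the draft defines.
Neither operation needs an index.

```java
// Hypothetical doubly linked child list; names and signatures are assumptions.
final class RefNode {
    final String name;
    RefNode parent, firstChild, lastChild, prev, next;

    RefNode(String name) { this.name = name; }

    // Inserts newChild immediately before refChild.
    // A null refChild means "append at the end".
    void insertBefore(RefNode newChild, RefNode refChild) {
        if (refChild != null && refChild.parent != this) {
            throw new IllegalArgumentException("refChild is not a child of this node");
        }
        if (newChild.parent != null) {
            throw new IllegalArgumentException("newChild is already attached");
        }
        newChild.parent = this;
        newChild.next = refChild;
        newChild.prev = (refChild == null) ? lastChild : refChild.prev;
        if (newChild.prev != null) newChild.prev.next = newChild; else firstChild = newChild;
        if (refChild != null) refChild.prev = newChild; else lastChild = newChild;
    }

    // Unlinks child using only the node itself; no index is involved.
    void removeChild(RefNode child) {
        if (child.parent != this) {
            throw new IllegalArgumentException("not a child of this node");
        }
        if (child.prev != null) child.prev.next = child.next; else firstChild = child.next;
        if (child.next != null) child.next.prev = child.prev; else lastChild = child.prev;
        child.parent = null;
        child.prev = null;
        child.next = null;
    }

    // Helper for inspection: comma-separated child names in document order.
    String childNames() {
        StringBuilder sb = new StringBuilder();
        for (RefNode c = firstChild; c != null; c = c.next) {
            if (sb.length() > 0) sb.append(',');
            sb.append(c.name);
        }
        return sb.toString();
    }
}
```

Both operations are constant-time here; an array-based implementation
could offer the same calls by locating the reference node first.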

>3.  The uniqueness of object identity should be made clear.  Since these
>are interfaces that may be aliases created to access the real objects,
>two calls to query the same object could return different objects, i.e. if
>you have a next sibling, this.getNextSibling().getPreviousSibling() might
>not be guaranteed to be == this, but it would be functionally equivalent,
>and perhaps a function would be useful to show that two nodes are actually
>the same node.  If two aliases which represent the same node must always
>be the same object, (that can also be implemented with more difficulty
>because aliases must be tracked and not simply garbage collected) then
>that could be clearly stated in the specification.

Good point.  We can clarify this, I believe.
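
For readers following along, the alias problem can be sketched in a few
lines of Java.  NodeAlias and its isSameNode method are hypothetical
names chosen for this example; the draft does not define them.  Two
wrapper objects created for the same underlying node fail an == test
but can still be compared by the node they wrap.

```java
// Hypothetical alias wrapper; "underlying" stands in for the real document object.
final class NodeAlias {
    private final Object underlying;

    NodeAlias(Object underlying) { this.underlying = underlying; }

    // True when both aliases wrap the very same underlying node,
    // even if the alias objects themselves are distinct.
    boolean isSameNode(NodeAlias other) {
        return other != null && this.underlying == other.underlying;
    }
}
```

With this shape, `new NodeAlias(x) == new NodeAlias(x)` is false while
`isSameNode` is true, which is the distinction the spec would need to
state one way or the other.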

>
>4.  In the description of the Node method called "getChildren", it says
>that on the one hand, if there are no children, a null is returned, but
>otherwise the node list is "live" so it may dynamically reflect changes.
>It seems to me that since it is live, it would be more consistent to
>return an empty list in the case of no children instead of returning
>null, since the list is a live one even if empty.  The "hasChildren"
>method will allow this case to be easily avoided where this live update
>is not important to the application.  The issue of live update should be
>made clear throughout the specification -- especially in enumerations.  I
>suspect that my own implementation of the specification would allow a
>constant document to be gotten for those extended operations that should
>never modify the document and require consistency over an extended
>period of time.
>

Obviously this needs to be clarified in the spec.
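
One possible reading of the "empty but live" suggestion, sketched in
Java under stated assumptions: LiveParent is a hypothetical class, child
nodes are reduced to strings, and the live list is modeled with the
standard Collections.unmodifiableList view.  A caller that fetched the
list while it was empty sees children added later, which a null return
could not offer.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical parent node whose child list is a live, read-only view.
final class LiveParent {
    private final List<String> children = new ArrayList<>();
    // The view reflects later changes to "children"; it is never null.
    private final List<String> liveView = Collections.unmodifiableList(children);

    List<String> getChildren() { return liveView; }

    boolean hasChildren() { return !children.isEmpty(); }

    void appendChild(String name) { children.add(name); }
}
```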

>5.  The issue of case sensitivity is first raised, and then the burden
>is placed on the application writer, but in the case of
>getElementsByTagName or any future query mechanism, the DOM
>implementation must be aware of case sensitivity or insensitivity.

Unfortunately, the DOM charter calls on us to support XML and HTML, and
these two languages have different rules for case sensitivity.  The only
solution that we could come up with was to place the burden on the script
writer to do case-insensitive compares in HTML and case-sensitive compares
in XML.  So, we're "sensitive" to your concerns, but are pretty much stuck
with the current solution to the problem.
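
In practice the burden on the script writer amounts to choosing the
comparison to match the document type.  A minimal Java sketch follows;
TagNameCompare and tagNamesMatch are hypothetical helpers, not part of
the DOM.

```java
// Hypothetical helper: the caller states which rule applies to the document.
final class TagNameCompare {
    // caseSensitive = true for XML-style names, false for HTML-style names.
    static boolean tagNamesMatch(String a, String b, boolean caseSensitive) {
        return caseSensitive ? a.equals(b) : a.equalsIgnoreCase(b);
    }
}
```

So "TABLE" and "table" match under the HTML rule and differ under the
XML rule.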

>6.  I would assume from the specification, but it is not clearly stated,
>that white space which may be ignorable which is adjacent to
>non-whitespace would create distinct adjacent text nodes.  This seems a
>bit confused between different descriptions in the specification whether
>the whitespace flag says ignore any whitespace that happens to be
>leading/trailing here, or this whole text may be ignored.


Another frequent point of confusion!  This will be clarified in the next
draft.

>7.  It might be useful to clarify the expected behavior of traversal
>and other methods of a node and its children which has been removed from
>the document root, but not reinserted.
>

OK.

>8.  Instead of making explicit setters/getters on Document for the root
>node and the type node, it might be more general to have a simple
>getNamedNode/setNamedNode with standard names for the default type
>root, the default document root (using a NamedNodeList underneath).
>Of course, the more general routine could always be added later when it
>becomes clear (as it already is to me) that multiple arbitrary named
>roots are very useful without hurting the existing more-explicit
>accessors.

I'll have to think about this ... 

>I can hardly wait for the specification to become the accepted standard!

That's what we're working toward ... Thanks for your very useful suggestions.

Mike Champion
Received on Friday, 17 October 1997 14:03:01 UTC