- From: David Brownell <David.Brownell@Eng.Sun.COM>
- Date: Fri, 1 May 1998 18:45:26 -0700
- To: www-dom@w3.org
Hi,
I would like to give the following feedback and suggestions to the DOM
working group. I'm not sure whether to apologize for not splitting it
into lots of little messages like most other posts here!
There are some problems that one notices when trying to implement to this
specification. For example, the core and XML sections don't declare the
exceptions that are thrown, and use illegal exception signature syntax.
Module and package names are undefined (here and in the HTML section), so
developers need to make up their own. I hope all these get fixed soon.
Iterators are quite troublesome, and evidently don't support viable
implementations. I notice that there's quite a lot of archived traffic
on the iterators ... like, 50% of the April traffic! This strongly
indicates it's a problem spot. My own radical solution is to delete
iterators: they're not necessary. (An application writer who can't hack
out a short filter method using Node methods is in trouble already; and
they can surely take advantage of DTD knowledge to get better speed.)
Plus, I'm concerned that the HTML section doesn't seem to take the same
approach as the rest of the document. I think it needs to be presented in
the same way: generic object model described in OMG-IDL, with the Java
and ECMA Script bindings derived automatically from that. That'll prevent
mismatches like event handling stuff existing at all (it's not in scope
until a level 2 spec happens) and NamedNodeList having no API except in
the IDL (with no explanation). The document will make a lot more sense
that way, too: it'll seem like it's really a single document.
Appended, please find more detailed comments. I've tried to organize
these according to their location in the 16 April draft, and make specific
suggestions for changes rather than make comments which would need more
thought to transform to action. I've also tried to delete those points
which a quick scan of April's archive shows are addressed.
- David Brownell
Section 2.6.5 Range
-------------------
Please delete this, or else specify it well enough to be useful in
a real application (with interface spec, etc).
Section 2.8
-----------
General IDL comments:
- Define the name of the module: "module org::w3c::dom { ... }" for
the core, "module org::w3c::dom::xml { ... }" for XML, etc.; I
forget the detailed syntax, sorry!
- Please be consistent about exposing data as attributes rather than
as get/set methods. I'd prefer using attributes in all cases (also
including readonly attributes!) unless there's a need to report a
special exception. For example, siblings and children.
- The syntax for exception signatures uses "raises", and requires the
exception to be declared.
DOM ...
Please delete the text about an expectation that these be static
methods; far better to say that language mappings of this include ways
to get instances of particular kinds of DOM objects. For example, the
natural idiom would be to get an "HTML 3.2 DOM" object, or 4.0, or a
generic XML DOM object, or one constrained a given XML DTD, or "my
customized DOM". Moreover, the two language mappings included don't
have these as static methods!! (Is the intent that the "type" string
specify the DTD? It's underspecified.)
Please delete the "hasFeature" method, or define the managed namepace
of features!
DocumentContext ...
Please delete; as with Range, if it's going to do so little, leave
it out. This adds no value that I can discern, and if a level 2 spec
adds value with this notion, let it define the interface fully.
Deleting this will call for deleting this attribute from Document,
as well as a factory method on Document.
DocumentFragment ... quite underutilized.
See the comment for Node for one possible change which will not only
make this more useful, but address another structural problem. In
any case I think this notion is under-described.
Please make "masterDoc" a readonly attribute, or else specify where
in the sequence of the master document's children this new child
should go.
Document ...
The "documentType" attribute clearly belongs in an XmlDocument
interface, not here where it's specified to be meaningless
except in the context of XML.
These factory methods don't work very cleanly:
* They don't let implementors throw exceptions like "this kind
of document doesn't permit that kind of element". Such an
"InvalidElementException" should be defined, and be throwable
by createElement.
* Similarly for attributes, and createAttribute: permit the
factory methods to enforce validity, and throw an exception
such as "InvalidAttributeException".
* The methods don't automatically assign a parent; suggested
suggested fix is on Node.
The "createTreeIterator" method should be renamed so it doesn't
look like a factory method. "getTreeIterator"? However, see the
comments on NodeIterator and TreeIterator -- not needed, dangerous,
please delete.
Node ...
Parentless nodes seem to come up in several cases: child nodes that
have been removed or replaced, nodes that have just been constructed
and not assigned a parent. But only document objects are allowed
to return null as a parent. getParentNode() should be able to report
a "NoParentException" for nodes without parents.
As one notices when implementing it, Node is rather odd since it's got
children but most of its subtypes (Text, PI, Comment) don't. In fact,
the only things that need children are Document and Element. (Plus some
XML attribute values, which I think should not be Node objects.) A
much cleaner approach would be:
- Element would extend the DocumentFragment interface;
- DocumentFragment would have the child related methods:
hasChildNodes, getFirstChild,
insertbefore, removeChild, replacechild
- (And to get rid of iterators, add DocumentFragment.getLastChild,
and DocumentFragment.getNumberOfChildren; maybe getNthChild.
See NodeIterator comments.)
The kind of node that supports insert/replace/remove child methods
should support policies that only valid Document objects get built.
This means that whenever a node is added or removed, it should be
able to throw something like an InvalidNodeException in cases
where that operation would violate the DTD.
Shouldn't there be a deleteChild(in Node oldChild) method?
General comment: I don't see specifications for how changes to
parents, siblings, and children affect other nodes. At some point
before this gets more stable, that should be carefully specified.
Please delete getChildNodes and the NodeIterator type. See comments
on NodeIterator for more information.
The NodeType constants (and getNodeType) are not needed, and constrain
things more than seems wise. This interface would need revising to add
new kinds of node to the system ... strongly to be avoided! In OMG-IDL,
one normally relies on the run time typing provided by the ORB as part
of the language mapping. For example, it can use Java or C++ run time
typing; C systems add their own runtime typing (ORB runtimes, COM,
what have you). Where is the spec for the ECMAScript language mapping
which is being used? Seems like that's where stuff like NodeType info
should be showing up -- and as a generic feature, not just for nodes.
NodeIterator ... broken semantics, also overly complex
The notion of "current position" being _between_ nodes is a huge
problem; it should instead be _on_ a specific node, if this API
remains in the DOM spec. If that node is deleted, the iterator
might become invalid, but other edit operations have very clear
semantics. The price of the current spec is high:
- Tracking each iterator ever created, probably in the
document object);
- Adding "position info" to those iterators;
- Potentially, updating each iterator's "position info" whenever
the tree changes.
- Needing to add an operation to say that the application is
finished with the iterator; garbage collection alone doesn't
work. That violates the normal usage model for the two
languages whose bindings are in the spec!
- It makes it very hard to describe the exact semantics of
iteration in the case of the tree changing.
The indexing should to be deleted ... it assumes that the set of
nodes is fixed. But, it's not. You can't save an index, modify a
tree, and expect things to work! The indexing is superfluous; just
save the node in question, rather than the index. (Or, put it
into a separate interface, that doesn't deal with "live update".)
COMMENT: These iterators are well past the complexity line of what
I like to see in specifications ... they're "icing" suited for one
kind of cake, which isn't the kind I happen to bake. It's simple
enough to do tree and child enumeration using the Node API, with
none of the complexity of this Iterator. Only the "enumerate by
tag" functionalty merits even a helper function in most code.
Preferably, delete this from the spec and make "Node" just a bit
simpler to use for this sort of purpose -- add getLastChild, and
maybe getNumberOfChildren. (Perhaps to DocumentFragment instead;
see above comments re Node.) Most useful would be adding a method
to Node automating prefix inorder tree traversal, so it's about
two lines of code to write an in-line filter.
TreeIterator ... all the NodeIterator comments apply! Delete this.
Given a "current node" notion, most of these methods are not
needed: toParent, toFirstChild, toPreviousSibling, toNextSibling.
The numPreviousSiblings and numNextSibling methods apply to the
dangerous notion of indexing (in live data!) which I believe
should be deleted. Similarly, toNthChild.
Attribute ...
No particular comments, except that I'm glad it's not a node!
AttributeList ...
I appreciate the simplicity of this interface compared to the
NodeIterator and TreeIterator ones!
Rather than the numeric indexed mode implied by item(), it'd
be safer to just return an array of strings that identify the
attributes.
It might be good to have a readonly attribute giving access to
the element the AttributeList is tied to.
Element ...
getAttributeNode, getAttributeName ... what's this "name" type? I
translated it as "wstring". Is this for namespace support?
setAttribute ... the IDL should say "wstring", not "string".
getAttributes () ... should return AttributeList, not NodeIterator
NodeIterator is way too complex, it's overused. Moreover, it's
not possible (this way) to have
getElementsByTagName ... returns no value. I'd argue to delete this,
on the grounds that all the Iterator support should be deleted. Also,
this sort of thing can be implemented easily enough in an application.
Please move the normalize method to the Text object, so apps can
have control over when they incur such costs.
Please delete the AttributeNode representations here; there's no
point in having both string and AttributeNode representations in
the core. Anyway, the AttributeNode stuff can come from an
AttributeList object anyway (see above).
Might be good to add a boolean hasAttributes() method.
Text
The editing stuff seems overly complex for most usage, and probably
isn't complex enough for heavy duty editor work. I think it'd
be better if insert(), replace(), and splice() weren't there.
I'd like to be able to access text as something other than a string;
at least, I'd like to know how many characters are there, and be
able to copy characters out directly without converting to strings.
(It's been a while since I hacked the IDL; the latter may be painful
to express!)
I found the splice() method description confusing. Please try something
more complete, perhaps like this for the brief explanation, but with
complete description of how all four (!!!) possible outcomes change the
parent, child, and sibling relationships. Or better yet, don't put
this method in ... four different outcomes is troublesome:
Modifies this text node by deleting characters beginning at "offset"
and turning "element" into the next sibling. Any characters after
"offset + count" become a new Text object that is the next sibling
of "element". If nonzero "count" is specified, it identifies how
many of characters beginning at "offset" are turned into a new Text
object which becomes the last child of "element".
See above comment re Element.normalize ... I think that functionality
belongs on text nodes, offering control over how (and when) costs are
incurred. (Also, which fragments get preserved!) For example, a
boolean merge() method that returns true if this text node was able to
merge with the next one.
Comment ... I'd prefer to see them unable to have children!
PI ... I'd prefer to see them unable to have children!
I'm told HTML has no PIs (really??), so it's not clear to me why this
is in the core object model. Particularly since one of the properties
of the PI is defined as meaninglist for HTML...
Section 3.2
-----------
XMLNode
Please make this inherit from Node ... it's confusing otherwise
Please make an XMLElement interface, inheriting from both XML Node
and Element. Ditto XMLText (Text + XMLNode), XMLPI (if PI is still
in core), etc.
See comment above re duplicating the run time type information; is
this not something that deserves a "NodeType" of its own, given the
API framework that's been set up? (Of course, I think that duplicate
run time typing support should be deleted.)
Return types should be XMLNode (etc), not Node
Building on the comments earlier, the XML child operations should
perhaps be part of an XMLDocumentFragment interface.
DocumentType
Please change the return type to a "DTDNode" not a "Node; it's
a different kind of information than application data, and should
not be forced to be polymorphic with application data. If some
implementation chooses to be polymorphic, so be it -- but that's
not what the spec should be requiring.
And in several cases, "Entity" matches the spec better without even
defining a new DTDNode interface!
ElementDefinition ... please extend "DTDNode"
Please delete inclusions and exlusions, since XML doesn't have them
and this isn't for HTML or SGML.
AttributeDefinition ... please extend "DTDNode"
XMLText ... doesn't exist
Please remove the CDATASection and PCDATAToken empty interfaces,
and replace them with predicates on XMLText. Similarly, add a
predicate for whether the text is ignorable whitespace. (I just
hate NOP interfaces, particularly when they don't even share the
same naming convention!)
Hmm, would a PCDATA and a CDATA node ever merge with normalize()?
I suspect they shouldn't; regardless, specify the behaviour.
Chapter 4
---------
Am I confused, or is the deal here that the body more or less is ECMA
script text that needs to move to "Appendix C", and a new body must be
created which holds (a) what's now in "Appendix A", and (b) a language
neutral description of the semantics? That is, I sure hope this part
of the spec starts to look a lot more like the core and HTML parts.
Event handling methods like blur(), focus(), select(), and click() should
all get deleted (event handling is a non-goal anyway, and fights against
general requirement #6 of not requiring a UI). At most they should be
in a non-normative appendix.
I happened to notice that NamedNodeList is used a lot in the ECMAScript
in the main body, but isn't defined until the IDL appendix. I suspect
that happens with other things, too. Fixing this to have the IDL as the
core of this chapter, with Java and ECMA Script bindings generatd from
it, will help fix this problem.
There seems to be no way to set attributes of a cookie other than its
value ... such as DOMAIN, PATH, EXPIRY, and so on.
Received on Friday, 1 May 1998 21:45:35 UTC