- From: David Brownell <David.Brownell@Eng.Sun.COM>
- Date: Fri, 1 May 1998 18:45:26 -0700
- To: www-dom@w3.org
Hi, I would like to give the following feedback and suggestions to the DOM working group. I'm not sure whether to apologize for not splitting it into lots of little messages like most other posts here! There are some problems that one notices when trying to implement to this specification. For example, the core and XML sections don't declare the exceptions that are thrown, and use illegal exception signature syntax. Module and package names are undefined (here and in the HTML section), so developers need to make up their own. I hope all these get fixed soon. Iterators are quite troublesome, and evidently don't support viable implementations. I notice that there's quite a lot of archived traffic on the iterators ... like, 50% of the April traffic! This strongly indicates it's a problem spot. My own radical solution is to delete iterators: they're not necessary. (An application writer who can't hack out a short filter method using Node methods is in trouble already; and they can surely take advantage of DTD knowledge to get better speed.) Plus, I'm concerned that the HTML section doesn't seem to take the same approach as the rest of the document. I think it needs to be presented in the same way: generic object model described in OMG-IDL, with the Java and ECMA Script bindings derived automatically from that. That'll prevent mismatches like event handling stuff existing at all (it's not in scope until a level 2 spec happens) and NamedNodeList having no API except in the IDL (with no explanation). The document will make a lot more sense that way, too: it'll seem like it's really a single document. Appended, please find more detailed comments. I've tried to organize these according to their location in the 16 April draft, and make specific suggestions for changes rather than make comments which would need more thought to transform to action. I've also tried to delete those points which a quick scan of April's archive shows are addressed. - David Brownell Section 2.6.5 Range ------------------- Please delete this, or else specify it well enough to be useful in a real application (with interface spec, etc). Section 2.8 ----------- General IDL comments: - Define the name of the module: "module org::w3c::dom { ... }" for the core, "module org::w3c::dom::xml { ... }" for XML, etc.; I forget the detailed syntax, sorry! - Please be consistent about exposing data as attributes rather than as get/set methods. I'd prefer using attributes in all cases (also including readonly attributes!) unless there's a need to report a special exception. For example, siblings and children. - The syntax for exception signatures uses "raises", and requires the exception to be declared. DOM ... Please delete the text about an expectation that these be static methods; far better to say that language mappings of this include ways to get instances of particular kinds of DOM objects. For example, the natural idiom would be to get an "HTML 3.2 DOM" object, or 4.0, or a generic XML DOM object, or one constrained a given XML DTD, or "my customized DOM". Moreover, the two language mappings included don't have these as static methods!! (Is the intent that the "type" string specify the DTD? It's underspecified.) Please delete the "hasFeature" method, or define the managed namepace of features! DocumentContext ... Please delete; as with Range, if it's going to do so little, leave it out. This adds no value that I can discern, and if a level 2 spec adds value with this notion, let it define the interface fully. Deleting this will call for deleting this attribute from Document, as well as a factory method on Document. DocumentFragment ... quite underutilized. See the comment for Node for one possible change which will not only make this more useful, but address another structural problem. In any case I think this notion is under-described. Please make "masterDoc" a readonly attribute, or else specify where in the sequence of the master document's children this new child should go. Document ... The "documentType" attribute clearly belongs in an XmlDocument interface, not here where it's specified to be meaningless except in the context of XML. These factory methods don't work very cleanly: * They don't let implementors throw exceptions like "this kind of document doesn't permit that kind of element". Such an "InvalidElementException" should be defined, and be throwable by createElement. * Similarly for attributes, and createAttribute: permit the factory methods to enforce validity, and throw an exception such as "InvalidAttributeException". * The methods don't automatically assign a parent; suggested suggested fix is on Node. The "createTreeIterator" method should be renamed so it doesn't look like a factory method. "getTreeIterator"? However, see the comments on NodeIterator and TreeIterator -- not needed, dangerous, please delete. Node ... Parentless nodes seem to come up in several cases: child nodes that have been removed or replaced, nodes that have just been constructed and not assigned a parent. But only document objects are allowed to return null as a parent. getParentNode() should be able to report a "NoParentException" for nodes without parents. As one notices when implementing it, Node is rather odd since it's got children but most of its subtypes (Text, PI, Comment) don't. In fact, the only things that need children are Document and Element. (Plus some XML attribute values, which I think should not be Node objects.) A much cleaner approach would be: - Element would extend the DocumentFragment interface; - DocumentFragment would have the child related methods: hasChildNodes, getFirstChild, insertbefore, removeChild, replacechild - (And to get rid of iterators, add DocumentFragment.getLastChild, and DocumentFragment.getNumberOfChildren; maybe getNthChild. See NodeIterator comments.) The kind of node that supports insert/replace/remove child methods should support policies that only valid Document objects get built. This means that whenever a node is added or removed, it should be able to throw something like an InvalidNodeException in cases where that operation would violate the DTD. Shouldn't there be a deleteChild(in Node oldChild) method? General comment: I don't see specifications for how changes to parents, siblings, and children affect other nodes. At some point before this gets more stable, that should be carefully specified. Please delete getChildNodes and the NodeIterator type. See comments on NodeIterator for more information. The NodeType constants (and getNodeType) are not needed, and constrain things more than seems wise. This interface would need revising to add new kinds of node to the system ... strongly to be avoided! In OMG-IDL, one normally relies on the run time typing provided by the ORB as part of the language mapping. For example, it can use Java or C++ run time typing; C systems add their own runtime typing (ORB runtimes, COM, what have you). Where is the spec for the ECMAScript language mapping which is being used? Seems like that's where stuff like NodeType info should be showing up -- and as a generic feature, not just for nodes. NodeIterator ... broken semantics, also overly complex The notion of "current position" being _between_ nodes is a huge problem; it should instead be _on_ a specific node, if this API remains in the DOM spec. If that node is deleted, the iterator might become invalid, but other edit operations have very clear semantics. The price of the current spec is high: - Tracking each iterator ever created, probably in the document object); - Adding "position info" to those iterators; - Potentially, updating each iterator's "position info" whenever the tree changes. - Needing to add an operation to say that the application is finished with the iterator; garbage collection alone doesn't work. That violates the normal usage model for the two languages whose bindings are in the spec! - It makes it very hard to describe the exact semantics of iteration in the case of the tree changing. The indexing should to be deleted ... it assumes that the set of nodes is fixed. But, it's not. You can't save an index, modify a tree, and expect things to work! The indexing is superfluous; just save the node in question, rather than the index. (Or, put it into a separate interface, that doesn't deal with "live update".) COMMENT: These iterators are well past the complexity line of what I like to see in specifications ... they're "icing" suited for one kind of cake, which isn't the kind I happen to bake. It's simple enough to do tree and child enumeration using the Node API, with none of the complexity of this Iterator. Only the "enumerate by tag" functionalty merits even a helper function in most code. Preferably, delete this from the spec and make "Node" just a bit simpler to use for this sort of purpose -- add getLastChild, and maybe getNumberOfChildren. (Perhaps to DocumentFragment instead; see above comments re Node.) Most useful would be adding a method to Node automating prefix inorder tree traversal, so it's about two lines of code to write an in-line filter. TreeIterator ... all the NodeIterator comments apply! Delete this. Given a "current node" notion, most of these methods are not needed: toParent, toFirstChild, toPreviousSibling, toNextSibling. The numPreviousSiblings and numNextSibling methods apply to the dangerous notion of indexing (in live data!) which I believe should be deleted. Similarly, toNthChild. Attribute ... No particular comments, except that I'm glad it's not a node! AttributeList ... I appreciate the simplicity of this interface compared to the NodeIterator and TreeIterator ones! Rather than the numeric indexed mode implied by item(), it'd be safer to just return an array of strings that identify the attributes. It might be good to have a readonly attribute giving access to the element the AttributeList is tied to. Element ... getAttributeNode, getAttributeName ... what's this "name" type? I translated it as "wstring". Is this for namespace support? setAttribute ... the IDL should say "wstring", not "string". getAttributes () ... should return AttributeList, not NodeIterator NodeIterator is way too complex, it's overused. Moreover, it's not possible (this way) to have getElementsByTagName ... returns no value. I'd argue to delete this, on the grounds that all the Iterator support should be deleted. Also, this sort of thing can be implemented easily enough in an application. Please move the normalize method to the Text object, so apps can have control over when they incur such costs. Please delete the AttributeNode representations here; there's no point in having both string and AttributeNode representations in the core. Anyway, the AttributeNode stuff can come from an AttributeList object anyway (see above). Might be good to add a boolean hasAttributes() method. Text The editing stuff seems overly complex for most usage, and probably isn't complex enough for heavy duty editor work. I think it'd be better if insert(), replace(), and splice() weren't there. I'd like to be able to access text as something other than a string; at least, I'd like to know how many characters are there, and be able to copy characters out directly without converting to strings. (It's been a while since I hacked the IDL; the latter may be painful to express!) I found the splice() method description confusing. Please try something more complete, perhaps like this for the brief explanation, but with complete description of how all four (!!!) possible outcomes change the parent, child, and sibling relationships. Or better yet, don't put this method in ... four different outcomes is troublesome: Modifies this text node by deleting characters beginning at "offset" and turning "element" into the next sibling. Any characters after "offset + count" become a new Text object that is the next sibling of "element". If nonzero "count" is specified, it identifies how many of characters beginning at "offset" are turned into a new Text object which becomes the last child of "element". See above comment re Element.normalize ... I think that functionality belongs on text nodes, offering control over how (and when) costs are incurred. (Also, which fragments get preserved!) For example, a boolean merge() method that returns true if this text node was able to merge with the next one. Comment ... I'd prefer to see them unable to have children! PI ... I'd prefer to see them unable to have children! I'm told HTML has no PIs (really??), so it's not clear to me why this is in the core object model. Particularly since one of the properties of the PI is defined as meaninglist for HTML... Section 3.2 ----------- XMLNode Please make this inherit from Node ... it's confusing otherwise Please make an XMLElement interface, inheriting from both XML Node and Element. Ditto XMLText (Text + XMLNode), XMLPI (if PI is still in core), etc. See comment above re duplicating the run time type information; is this not something that deserves a "NodeType" of its own, given the API framework that's been set up? (Of course, I think that duplicate run time typing support should be deleted.) Return types should be XMLNode (etc), not Node Building on the comments earlier, the XML child operations should perhaps be part of an XMLDocumentFragment interface. DocumentType Please change the return type to a "DTDNode" not a "Node; it's a different kind of information than application data, and should not be forced to be polymorphic with application data. If some implementation chooses to be polymorphic, so be it -- but that's not what the spec should be requiring. And in several cases, "Entity" matches the spec better without even defining a new DTDNode interface! ElementDefinition ... please extend "DTDNode" Please delete inclusions and exlusions, since XML doesn't have them and this isn't for HTML or SGML. AttributeDefinition ... please extend "DTDNode" XMLText ... doesn't exist Please remove the CDATASection and PCDATAToken empty interfaces, and replace them with predicates on XMLText. Similarly, add a predicate for whether the text is ignorable whitespace. (I just hate NOP interfaces, particularly when they don't even share the same naming convention!) Hmm, would a PCDATA and a CDATA node ever merge with normalize()? I suspect they shouldn't; regardless, specify the behaviour. Chapter 4 --------- Am I confused, or is the deal here that the body more or less is ECMA script text that needs to move to "Appendix C", and a new body must be created which holds (a) what's now in "Appendix A", and (b) a language neutral description of the semantics? That is, I sure hope this part of the spec starts to look a lot more like the core and HTML parts. Event handling methods like blur(), focus(), select(), and click() should all get deleted (event handling is a non-goal anyway, and fights against general requirement #6 of not requiring a UI). At most they should be in a non-normative appendix. I happened to notice that NamedNodeList is used a lot in the ECMAScript in the main body, but isn't defined until the IDL appendix. I suspect that happens with other things, too. Fixing this to have the IDL as the core of this chapter, with Java and ECMA Script bindings generatd from it, will help fix this problem. There seems to be no way to set attributes of a cookie other than its value ... such as DOMAIN, PATH, EXPIRY, and so on.
Received on Friday, 1 May 1998 21:45:35 UTC