- From: Stephen R. Savitzky <steve@rsv.ricoh.com>
- Date: 05 Oct 1999 13:55:32 -0700
- To: "Michael Champion" <michael_champion@ameritech.net>
- Cc: <www-dom@w3.org>
"Michael Champion" <michael_champion@ameritech.net> writes: > > THE DOM IS NOT AN OBJECT MODEL! It is a specification (API) for a class > > library. Specifically it is the API for the class library of Javascript. > > The Infoset is much closer to being a real object model, in that it > > specifies the necessary and sufficient set of interfaces that _any_ > > implementation of documents must, somehow, provide. > > That's quite true. On the other hand, the Infoset activity came into being > partly because the DOM working group struggled with the fact -- which was > not clear to us a couple of years ago -- that XML itself did not define a > specific object model (in your sense, the result of an analysis phase that > defines the basic set of objects and relationships, but not the API). The problem is that all of the DOM literature, including the introduction, suggests that the DOM is a great deal more universal than it actually is. It is no longer a basic set of objects and relationships, it's just an API. > > For the near term I will continue > > to base my application on my partial implementation of the DOM, and > because > > of its architecture it will always be able to manipulate DOM trees, but > > eventually my internal representation will cease to look anything like the > > DOM. > > I think this is more or less what EVERYONE who's implementing the DOM on top > of an existing application does. That's not what I was doing. What I was trying to do was take an existing document-processing application and make it use the DOM as its API for documents. This included changing my internal representation to one matching the DOM's expectations, and changing all my interfaces to an extension of the DOM. It turns out that this represented a disastrous mismatch. My goal is to totally revamp my internal interfaces so that that nobody will mistake what I have for an implementation of the DOM, which it never really was. I will be able to read and create DOM documents, but I won't ever have a representation for them, and I won't be trying to maintain a set of classes that attempt to implement the DOM interfaces. > There is no assumption, or even suggestion, that the DOM interfaces should > look anything like the underlying data structures. That's true in theory, but not in practice. The DOM is specified in a way that makes the choice of underlying data structures quite narrow if you want anything approaching efficiency. In some cases it's probably overconstrained to the point where no efficient implementation exists. > That's what distinguishes one DOM implementation from another -- the > suitability of the internal representation and manipulation of the XML > data for some specific type of task, be it browsing, editing, workflow, > database, etc. It makes sense to me to have a common API for common > operations in disparate applications, but it makes no sense to me to try > to find a common INTERNAL representation that meets such a range of > requirements. The DOM's API is insufficient for many of these tasks, and overspecified for the rest. Just to take one example, an editor needs access to the DTD, and it needs to be able to modify entity definitions. It needs to be able to redefine the default value of an attribute. It may need to let the author specify whether to leave off a redundant end tag in HTML. And on and on. Sure, I can do all of these things using the DOM _interfaces_, and I do, but my implementation does not conform to the DOM _specification_, and there's no way that it can or ever will. > > The > > reference set of applications should be specified -- applications outside > > this set _might_ be able to use the DOM, but if their requirements differ > > from those of the reference set their needs will simply not be considered. > > It should be made _very_ clear that extensions of any sort are not > > encouraged, perhaps not even permitted, and that implementors with a > > different set of requirements should seek elsewhere. > > Here I pretty much totally disagree. How could one usefully specify a > "reference set of applications" in an industry that's evolving as rapidly as > the Internet? Applications, and even the terms "e-commerce" and "portal" > were not in common use when work on the DOM began, but I submit that the DOM > is very relevant in both these areas. That's not what I mean by the reference set. I mean the types of programs for which the API is suitable: scripting languages, mostly in browsers, operating on a limited number of small documents all of which use the same well-known DTD. The DOM contains many components, features, and behavioral constraints that are not merely useless but totally wrong for applications that involve databases, large documents, small memory, streaming, editing, and so on. > Virtually all DOM implementors have added extensions (and I respectfully > disagree with Arnaud that this is "useless"), although this obviously > limits interoperability. It's Arnaud's view that seems to have prevailed. The DOM is pervaded by its need to be the specification for a library. Interoperability is a great thing, but it limits the range of applications. Even extensions don't work, because most of the things I want to do require _removing_ functionality and constraints from the DOM, not extending it by adding more. I don't want to add something like ownerElement to Attr, I want to allow structure sharing wherever it makes sense. I don't want magic children on EntityReference, I want firstChild to return null so that I can look up the Entity myself. I don't want the insane machinery that goes along with live nodelists. I want to be able to use a NodeIterator to iterate through the rules of a style sheet. And so on. > As many people have noted, one must use proprietary extensions to the DOM > to load and save data! I assure everyone that it was explicitly assumed > that implementors would extend the DOM to provide access to operations -- > such as DTD navigation -- that the DOM doesn't (yet) support. I have seen no evidence of this, and the DOM has consistently evolved to make this kind of thing less and less feasible. Putting it another way, the DOM evolves by _adding_ features and constraints. Each new addition further constrains the set of feasible implementations. <aside> This needs proof, of course, or at least support. OK: Adding ownerElement precludes structure sharing for Attr nodes, constraining the implementation. It is impossible to _add_ anything that _relaxes_ an existing constraint, so the set of constraints must be monotonically increasing. </aside> > All I suggest is that implementors who have functionality that overlaps > the DOM functionality provide DOM interfaces, add proprietary APIs where > their functionality exceeds the DOM's, and work with the W3C to > standardize the APIs as the base of overlapping functionality grows. The problem is that overlap isn't good enough. Every time some new constraint gets added to the DOM, some piece of your implementation is going to become non-compliant. Then you have to explain to whoever is maintaining the application that, no, the interfaces are DOM level 1 with a couple of extensions, but don't use getElementsByTagName because it's hopelessly inefficient. The DOM is OK if you are building an application from scratch that can live within the DOM's constraints, and you have an existing DOM implementation to start with. If you need to relax any of the DOM's constraints it's a mistake to base your implementation on the DOM's interfaces (as I did), because it creates the expectation that you will eventually come up with a full implementation. -- Stephen R. Savitzky <steve@rsv.ricoh.com> <http://rsv.ricoh.com/~steve/> Platform for Information Applications: <http://RiSource.org/PIA/> Chief Software Scientist, Ricoh Silicon Valley, Inc. Calif. Research Center voice: 650.496.5710 front desk: 650.496.5700 fax: 650.854.8740 home: <steve@theStarport.org> URL: http://theStarport.org/people/steve/
Received on Tuesday, 5 October 1999 16:56:06 UTC