- From: Stephen R. Savitzky <steve@rsv.ricoh.com>
- Date: 05 Oct 1999 11:50:58 -0700
- To: www-dom@w3.org
I am finally beginning to understand what Arnaud Le Hors <lehors@w3.org> meant when he wrote:

> Jeff Mackay wrote:
> >
> > Are implementors allowed to extend the NodeType and Exception lists?
>
> Implementors can do whatever they want. However, the whole purpose of
> the DOM is to provide users with an interoperable API. Implementing
> and/or using any extension makes this pretty useless.

THE DOM IS NOT AN OBJECT MODEL!

It is a specification (API) for a class library. Specifically it is the API for the class library of Javascript. The Infoset is much closer to being a real object model, in that it specifies the necessary and sufficient set of interfaces that _any_ implementation of documents must, somehow, provide.

An object model is the product of the analysis phase of an OO project; an API is the product of the design phase. An API is the specification for a library, which is the product of the implementation phase.

In a library's specification, like the Java class library or the DOM, it is vitally important to specify a complete set of functions and to nail down the implementation as much as possible, so that application-writers have a rich set of operations with semantics they can count on. That's what an API (the programmer's view of a library) is all about. Extensions and experimentation are out of place in this context. The typical application-writer is using a canned library supplied by a language vendor, and expects a consistent environment on every platform of interest. That's a good thing.

In a _real_ object model like a GUI toolkit or the Infoset, on the other hand, it is important to provide only the _minimum_ interface, and to constrain the implementation as _little_ as possible, so as to provide for the widest possible range of applications. A real object model is essentially the basis for a framework; extensions are allowed for and indeed expected.
With a framework, application writers are expected to get their hands dirty and at least _look_ at the code, if not modify it; usually the application and the classes that implement the object model are written by the same person or group. The object model's main function is to ensure that no important details are left out of the implementation. That's a good thing, too, but it's a _different_ thing.

An object model is a specification, just as an API is, but at a different level. It is further removed from the implementation, and is not directly usable by an application-writer. A good object model simply defines the set of interfaces that are necessary and sufficient to _represent_ the data (documents, in this case) being modeled -- the objects and attributes that any implementation must provide, and that any application can count on having available. The object model, in other words, specifies the objects' attributes, very little about their behavior, and as little as possible about their implementation.

An object model should make no claims about whether nodelists are ``live'' or static, about whether or not nodes can be freely moved between documents, about whether documents may be traversed in any particular order, about whether structure can be shared, or whether a node remains accessible after an application has abandoned all references to it. It should simply ensure that, if you are looking at a node in a document, you can tell what sort of node it is and determine _all_ of the relevant information about it.

The DOM, by contrast, makes no real attempt to be a complete object model for documents. Converting a document to a DOM tree loses information; it is no longer possible to recover the original document. It is impossible to create an arbitrary XML or HTML document, say inside of an editor, and write it out as its author intended. There may be some documents that cannot be represented at all, perhaps due to their size or to their dynamic nature.
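
The point about lost information can be illustrated with a small sketch. It is only an illustration, not anything taken from a particular DOM implementation under discussion on this list: it assumes Python's standard xml.dom.minidom module as a stand-in DOM implementation, and the sample document is made up. The source text uses a single-quoted attribute, extra whitespace inside the tag, and a numeric character reference; none of these survive a trip through the tree.

```python
from xml.dom import minidom

# Hypothetical sample document: its surface syntax carries choices an
# author might care about (quote style, spacing, a character reference).
source = "<a  b='1'>&#65;</a>"

# Build a DOM tree, then serialize the root element back to text.
doc = minidom.parseString(source)
round_tripped = doc.documentElement.toxml()

print(source)         # <a  b='1'>&#65;</a>
print(round_tripped)  # <a b="1">A</a>
```

The two strings describe the same tree, but an editor built on that tree could not write the document out as its author typed it.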
There are many plausible representations for documents that do not conform to the DOM but are nevertheless useful, and which would benefit from a unifying standard to guide their implementors. In fact, the DOM itself would have benefitted greatly from such a standard, not to mention a vigorous application of Occam's Razor.

All of this suggests that, for my own sanity and for the sake of my application, I should probably abandon any hope or pretense of using the DOM. I need DTDs, I need SGML, I need late-bound entities and entity references without content, I need application-specific, strongly-typed metadata, I need the ability to stream large documents through a document processor with limited memory, and so on. For the near term I will continue to base my application on my partial implementation of the DOM, and because of its architecture it will always be able to manipulate DOM trees, but eventually my internal representation will cease to look anything like the DOM. As the official Javascript class library for browsers, the DOM is simply irrelevant for an XML-based extensible server.

Most of my comments in this list over the last year or so have been based on the mistaken belief that the DOM was an object model. I think a great deal of confusion could have been avoided if the introduction clearly stated that, although the DOM may be moderately language-neutral, it is far from implementation-neutral, and that the primary goal is to provide a stable class library optimized for a certain specific class of applications. The reference set of applications should be specified -- applications outside this set _might_ be able to use the DOM, but if their requirements differ from those of the reference set their needs will simply not be considered. It should be made _very_ clear that extensions of any sort are not encouraged, perhaps not even permitted, and that implementors with a different set of requirements should seek elsewhere.
It is far too late to rename DOM -> Browser Scripting Document API, but it would have been far more accurate.

I think that a document object model (note the indefinite article and lower-case letters) would be a good idea, and I will gladly support and contribute to an effort to construct one.

--
Stephen R. Savitzky <steve@rsv.ricoh.com> <http://rsv.ricoh.com/~steve/>
Platform for Information Applications: <http://RiSource.org/PIA/>
Chief Software Scientist, Ricoh Silicon Valley, Inc. Calif. Research Center
voice: 650.496.5710  front desk: 650.496.5700  fax: 650.854.8740
home: <steve@theStarport.org> URL: http://theStarport.org/people/steve/
Received on Tuesday, 5 October 1999 14:51:35 UTC