- From: Stephen R. Savitzky <steve@rsv.ricoh.com>
- Date: 06 Oct 1999 08:38:44 -0700
- To: Philippe Le Hegaret <plh@w3.org>
- Cc: WWW DOM <www-dom@w3.org>

Philippe Le Hegaret <plh@w3.org> writes:

> Stephen R. Savitzky wrote:
> > THE DOM IS NOT AN OBJECT MODEL! It is a specification (API) for a class
> > library.
>
> In http://www.w3.org/TR/REC-DOM-Level-1/introduction.html
> "The Document Object Model (DOM) is an application programming interface
> (API) for HTML and XML documents."

QED.

> > The Infoset is much closer to being a real object model, in that it
> > specifies the necessary and sufficient set of interfaces that _any_
> > implementation of documents must, somehow, provide.
>
> In http://www.w3.org/TR/xml-infoset#intro
>
> "This document specifies an abstract data set called the XML information set
> (Infoset), a description of the information available in a well-formed XML
> document"
>
> So, it's not _any_ implementation of documents, but _any_ implementation
> of XML documents.

Point taken. It's still a lot closer to a general object model for
documents than the DOM is.

> > It is impossible to create an arbitrary XML or HTML document, say inside of
> > an editor, and write it out as its author intended.
> Do you have an example ?

Sure. As a web author, I might want to attempt to foil spammers by
representing my e-mail address as <steve@rsv.ricoh.com> -- note also the
symmetric use of < and >. A conforming DOM implementation will render
this as <steve@rsv.ricoh.com>, defeating my intentions.

To take another example, I may want HTML lists to be output in the
``traditional'' format with omitted end tags:

    <ol>
      <li>
      <li>
    </ol>

The DOM has no way to represent the fact that the end tags have been
omitted. For various reasons I may wish to omit end tags in one place,
but keep them in another (perhaps as a flag to some string-based Perl
script that modifies the file in some way). This is a perfectly
legitimate thing to do in a text editor, but it's impossible in an
editor based on the DOM.
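
A minimal sketch of the first point, using Python's xml.dom.minidom as a stand-in for a conforming DOM implementation. It assumes the obfuscation relies on numeric character references such as &#64; (an assumption; the exact markup is not shown above), and the address and the round_trip helper are hypothetical names chosen for illustration.

```python
from xml.dom import minidom


def round_trip(source):
    """Parse XML into a DOM tree and serialize it again.

    Returns the text content the DOM actually stores and the
    re-serialized markup, so the two can be compared with the input.
    """
    doc = minidom.parseString(source)
    root = doc.documentElement
    # The tree holds only the decoded characters; nothing records which
    # of them were written as character references in the source.
    text = "".join(node.data for node in root.childNodes
                   if node.nodeType == node.TEXT_NODE)
    return text, root.toxml()


if __name__ == "__main__":
    # Hypothetical obfuscated address, written with numeric references.
    source = "<p>&#60;user&#64;example.org&#62;</p>"
    text, out = round_trip(source)
    print("input:     ", source)
    print("DOM text:  ", text)
    print("serialized:", out)
```

The serializer has to choose its own escaping for the stored characters, so the author's original choice of references cannot survive the round trip.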

Similarly, for stylistic reasons I may wish to distinguish between XML
elements that are declared as empty, and those that are not but are
simply empty ``by accident''. I would represent the first as <foo/> and
the second as <bar></bar>. This has no effect on the semantics of the
document, of course, but as an author using an editor it serves as an
invaluable reminder of which empty elements it is permissible to fill in
later.

> > There may be some documents that cannot be represented at all, perhaps
> > due to their size or to their dynamic nature.
> If you mean a document which is not XML or HTML, you're right. It's out
> of the scope of the DOM.

No. In the first case I mean a document which is too large for its tree
to fit in memory. It may even be effectively infinite; for example, the
output of a process such as a web crawler.

In the second case, I mean a document in which external entities may
have their value changed because of the actions of some other process.
Possibly the simplest example of this is &time;, which I might want to
reflect the exact time when the entity is expanded. Another example
might involve an external file.

> But, if you really want to add your <% script %> node in the DOM,
> write an extension, it's very easy to do :
>
> interface StephenNode : Node {
>   readonly attribute unsigned short stephenType;
> }
>
> const unsigned short SCRIPT_NODE = 0;
> // ...
>
> interface ScriptNode : StephenNode {
>   // whatever you want
> }
>
> I don't see a statement in the DOM about "you should not create your own
> extension based on the DOM core".

Then I would have to rewrite my application to cast all nodes as
StephenNode and test stephenType instead of nodeType. It's ugly.

> > I need DTD's,
>
> It's in our requirements.
> See http://www.w3.org/TR/WD-DOM/requirements#ID-1072425801

But if I can't, within the specification, define new node types I can't
write experimental code that won't have to be rewritten if you finally
get around to fulfilling those requirements. Also, a frozen set of node
types will influence you to use a brand-new interface class that doesn't
descend from Node (which has already been done for CSSRule and
CSSRuleList). Why aren't these descended from Node?

> > I need SGML
>
> It's out of our scope.
> See http://www.w3.org/TR/REC-DOM-Level-1/introduction.html

Exactly. The scope of the DOM is too limited; I need an object model
that can be extended to handle other situations and still be compliant
with its specification.

> > I need late-bound entities and entity references without content, I need
> > application-specific, strongly-typed metadata
> Once again, it's in our list. But how can we address strongly-typed
> metadata without a recommendation ? The XML Schema datatype is not yet
> a recommendation :
> http://www.w3.org/TR/xmlschema-2/

If you had the ability to define new application-specific node types,
you could simply add

    attribute NodeList metadata;

to Node and let the application take care of it.

> > I need the ability to stream large documents through a document processor
> > with limited memory, and so on.
>
> In http://www.w3.org/TR/REC-DOM-Level-1/introduction.html
>
> "One important objective for the Document Object Model is to provide a
> standard programming interface that can be used in a wide variety of
> environments and applications."
> Our main goal is interoperability, not memory. But if we can have both,
> it's better.

This is exactly my point. It's no longer possible to have both -- the
DOM has taken a memory-intensive path in order to provide a rich
interface. There needs to be an alternative for those of us who want to
make different design decisions without giving up compliance with _some_
non-DOM standard.
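
To make the streaming case concrete, here is a rough sketch of an event-based processor that never builds a tree. It uses Python's xml.sax incremental parser purely as an illustration of the alternative design; it is not something the DOM specifies, and the element names (feed, item) and helper names are hypothetical.

```python
import xml.sax


class ItemCounter(xml.sax.ContentHandler):
    """Counts <item> elements as they stream past; keeps no tree."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def startElement(self, name, attrs):
        if name == "item":
            self.count += 1


def chunks(n_items):
    """Stand-in for a large or open-ended source, e.g. crawler output."""
    yield "<feed>"
    for i in range(n_items):
        yield f"<item id='{i}'/>"
    yield "</feed>"


def count_items(stream):
    """Feed the document to the parser piece by piece.

    Memory use depends on the chunk size, not on the document size.
    """
    handler = ItemCounter()
    parser = xml.sax.make_parser()
    parser.setContentHandler(handler)
    for chunk in stream:
        parser.feed(chunk)
    parser.close()
    return handler.count


if __name__ == "__main__":
    print(count_items(chunks(100000)))
```

The trade-off is the one described above: the handler sees each element once and cannot navigate back to it, which is the price of not holding the whole document in memory.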

> > It is far too late to rename DOM -> Browser Scripting Document API, but it
> > would have been far more accurate.
>
> Browsers represent 10% in the number of participants in the DOM WG. There
> are several implementations of DOM in Java, C++, Delphi, Perl, Python, C. The
> DOM is definitively not only a Browser Scripting Document API.
> Browser scripting is one of our goals, but not the only one.

It is the _reference_ goal. Whatever the DOM becomes, one of its
ironclad requirements is that it has to remain the document-processing
API for Javascript. Everything else may be subject to reconsideration,
but not that.

That's as it should be: Javascript needs an API for documents, the DOM
is it, and if any other application finds it useful, that's great. But
don't expect _every_ application to find it a good match.

-- 
Stephen R. Savitzky <steve@rsv.ricoh.com> <http://rsv.ricoh.com/~steve/>
Platform for Information Applications: <http://RiSource.org/PIA/>
Chief Software Scientist, Ricoh Silicon Valley, Inc. Calif. Research Center
voice: 650.496.5710  front desk: 650.496.5700  fax: 650.854.8740
home: <steve@theStarport.org> URL: http://theStarport.org/people/steve/

Received on Wednesday, 6 October 1999 11:39:21 UTC