- From: Ray D. Whitmer <rayw@imall.com>
- Date: Mon, 25 Oct 1999 14:24:47 -0600 (MDT)
- To: "Box, Don" <dbox@develop.com>
- cc: "'www-dom@w3.org'" <www-dom@w3.org>, Philippe Le Hegaret <plh@w3.org>
On Sun, 24 Oct 1999, Box, Don wrote:

>This makes it extremely hard to write Java code that:
>
>a) Allows users to link in their own XML parser that implements org.w3c.*

I believe the Java spec has a very robust class activation mechanism. Even if a class was compiled against a Level 2 interface, it can run against a Level 1 implementation as long as it calls no Level 2 methods. Users may still easily compile against Level 1 interface classes if they desire, and run against either Level 1 or Level 2 implementations. Java does not resort to linking.

>b) Mixes Level 1 and Level 2 XML parsers in the same VM.

I believe the Java spec allows free intermixing of implementations and applications where the interface against which they were compiled does not necessarily match the one found in the VM where they run. The worst that happens if a user attempts to call a method that an implementation did not supply is that an exception is thrown at that point, as long as the Java spec is correctly implemented. Level 1 applications will simply always work against the Level 2 implementation, and Level 2 applications can easily perform one-time checks to make certain the desired functionality is there. The highest-level version of the APIs just needs to appear first in the class path.

JDK 1.1 from Sun had some problems -- not just this one :-) -- but JDK 1.2 from Sun is correctly implemented, and I have proven to myself experimentally that you can freely intermix implementations and applications of Level 1 and Level 2 DOM in the same VM in all cases. If you can prove otherwise, please supply the example.

>Yes, I know about DOMImplementation.supports, but I am sorry, this is a
>hack. I strongly suggest you hoist the new Level 2 methods into new
>interfaces that extend the Level 1 versions. This could be done as simply
>as spawning off another package name, as follows:

Certain optional extensions to the DOM have been defined as new interfaces in Java. I feel those cases are different from the common, basic ones you are objecting to, which are handled by simple extension of the existing interfaces. Java, JavaScript, and certain other languages go to great lengths to give the user a more unified, non-fragile object model. There are other language bindings where new interfaces are the best way to go, but even those languages should encourage users to use the supports method to get a less fragmented and more timely picture of the underlying support supplied by the implementation.

I like Java for the dynamic object model that C++ never provided. You don't help a Java API by fragmenting it. Lots of interfaces on a single object produce a decaying vtable, and the more you have, the more you destroy rather than help type safety, approaching a model where you look up every method before you call it. Over time there will be more levels of the DOM, I believe, which would cause further fragmentation if that model were used -- and it would be bad enough just for Level 2.

Take, for example, the new namespace functionality, which will be used in literally every loop and traversal of the hierarchy (at least in all my use cases). Do you really want to be forced to test and typecast every node returned by a Level 1 traversal method just to check the namespace? Or to force implementations to separately support Level 1 and Level 2 interfaces with all the methods duplicated? The alternatives to simply extending the existing interfaces are very ugly and would make me doubt whether I want to use the result in Java.
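Here is a rough sketch of the kind of loop I mean. It assumes only the getNamespaceURI accessor that Level 2 adds to Node; the class name and namespace URI are invented for illustration:

import org.w3c.dom.Node;

public class NamespaceWalk
{
    // Invented namespace URI, for illustration only.
    private static final String EXAMPLE_NS = "http://example.com/ns";

    // Counts the elements in EXAMPLE_NS using only methods declared
    // on Node itself.  Because Level 2 adds getNamespaceURI directly
    // to Node, the loop needs no instanceof test and no typecast.
    public static int countInNamespace(Node node)
    {
        int count = 0;
        if (node.getNodeType() == Node.ELEMENT_NODE
            && EXAMPLE_NS.equals(node.getNamespaceURI()))
        {
            count++;
        }
        for (Node child = node.getFirstChild();
             child != null;
             child = child.getNextSibling())
        {
            count += countInNamespace(child);
        }
        return count;
    }
}

With fragmented interfaces, the test and the cast would sit inside the loop body instead, once per node visited.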
Why bother, when Java solves the problem much more elegantly? You find similar cases in JDK 1.2, where new methods are simply added to an existing 1.1 base and it hurts nothing.

Yet with all the problems it would cause, the fragmented interfaces still cannot even guarantee, for example, that Level 2 nodes will only have Level 2 nodes as children, or that a tree has the functionality its creator needed. It would be quite frustrating to execute a lengthy set of manipulations on a document only to discover, on the last manipulation, that an object which may not have existed until that point didn't support a critical interface that was essential to complete the processing.

The "supports" concept, with a contract guaranteed by the spec rather than by a fragmented collection of interfaces, seems superior because it defines a more usable notion of an implementation's support level -- one that happened by design in the spec rather than emerging from a set of unclear relationships between interfaces. If an application waits until it needs to call a method to find out whether it happens, through random good fortune, to exist, then a failure to find the right interface is often no more catastrophic than the exception that Java would throw. There is a supports method on every node that you can use immediately before you try to call a method. But there is a better technique, which can tell you in advance and save lots of typecasting and checking later.

>This allows me to do the more Java-esque test for extended functionality:
>
>public void f(org.w3c.dom.Attr attr)
>{
>    if (attr instanceof org.w3c.dom.level2.Attr) {
>        // use level 2 functionality here
>    }
>    else
>    {
>        // fall back on Level 1 functionality here
>    }
>}

This example is not what I would call good practice, if it occurs commonly. Any place dynamic type checking and typecasting occur in my code, I am suspicious that the object model is broken. I find this example a strong case against using new interfaces everywhere, because it appears to be such a common case. While the DOM does occasionally rely on typecasting, by and large Node has enough methods to avoid typecasting in the common cases.

Also, neither Level 1 nor Level 2 of the DOM guarantees that you know what type of object you have by asking instanceof. That is why the nodeType getter is available on the node. That you find instanceof more Java-esque doesn't mean it will be reliable, since there is nothing in the spec that says a particular object might not, for example, implement CDATASection, Text, and Comment all simultaneously. Implementations are free to do that, and have been free to do that since Level 1.

Using supports, an application can make a single test once and know, for the whole rest of the document, whether the hierarchy has Level 2 or Level 1 nodes in it, and then simply call the methods without typecasting.
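That one-time check looks roughly like this, written against the current Level 2 draft, where the method on Node is named supports (the same question can be asked of a DOMImplementation via hasFeature); the feature and version strings below are illustrative:

import org.w3c.dom.Document;
import org.w3c.dom.Node;

public class OneTimeCheck
{
    public static void process(Document doc)
    {
        // One test up front.  After this, no per-node instanceof
        // tests and no casts are needed anywhere in the traversal.
        boolean level2 = doc.supports("Core", "2.0");

        Node root = doc.getDocumentElement();
        if (level2)
        {
            // Safe to call Level 2 methods on any node in the tree.
            System.out.println("Root namespace: " + root.getNamespaceURI());
        }
        else
        {
            // Stick to Level 1 methods for the whole document.
            System.out.println("Root name: " + root.getNodeName());
        }
    }
}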
>For someone implementing a Level 2 parser, one can simply ignore the Level 1
>names as follows:
>
>import org.w3c.dom.level2.Element;
>public class MyElement implements Element
>{
>    // implement union of Level 1 and Level 2 methods
>}
>
>I really really feel this is an important issue. I have personally written
>code that loads multiple XML parsers and in the face of Level 1 and Level 2
>parsers, I see trouble. It is also not that far fetched to come up with
>scenarios where this can happen (a web server that loads independently
>developed servlets that each use a different XML parser).

Those on the working group also find this a very important issue; we spent a lot of time exploring alternatives before deciding to extend the APIs in the way they have been extended, and we are anxious to receive feedback.

The alternative you seem to suggest here, replicating the interfaces and thereby making each object implement two versions of every Level 1 routine, was thought to be too heavyweight. Actually, you haven't made clear in this last example which variant you intend: inheriting the Level 2 interface from the Level 1 interface, which forces typecasting everywhere a Level 1 interface is returned by a Level 1 method; forcing a completely separate Level 2 implementation, which forces lots of redundant code and a mangling of names so that two versions of a method taking all the same arguments can return either a Level 1 or a Level 2 version of an object; or avoiding that problem by breaking typecasting between implementations.
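To make the name mangling concrete (the package and method names below are invented): if a Level 2 Node interface extended the Level 1 one, it could not redeclare getParentNode to return the Level 2 type, because Java does not allow overriding on return type alone, so every traversal method would need a second, differently named variant:

package org.w3c.dom.level2;

// Invented sketch of a Level 2 Node extending the Level 1 one.
public interface Node extends org.w3c.dom.Node
{
    // Illegal in Java: redeclaring getParentNode() here with the
    // return type org.w3c.dom.level2.Node would conflict with the
    // inherited method, so a mangled name is forced:
    Node getParentNode2();

    // ...and likewise for getFirstChild, getNextSibling, and every
    // other inherited method that returns a Node.
    Node getFirstChild2();
}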
I find your one example does not support the apparent premise that using supports is somehow a hack and that typecasting for every new function is the preferred way to go. I suspect I'd start using Smalltalk or JavaScript instead of Java before I'd personally accept that premise, because I find having to cast on nearly every line of an application useless and broken.

Any language binding for which the multiple-interfaces approach is the best way to extend the DOM is perfectly correct to take that approach. Perhaps you can give more compelling examples of why you feel that Java should take that approach, or clarify something?

Ray Whitmer
ray@imall.com

Received on Monday, 25 October 1999 16:25:08 UTC