- From: David Brownell <david-b@pacbell.net>
- Date: Fri, 1 Sep 2000 09:45:04 -0700
- To: <lauren@sqwest.bc.ca>, <www-dom@w3.org>
Lauren, since DOM doesn't include parsing APIs, I don't know how you can claim "you'll get the same trees" ... all APIs to turn *ML text into a DOM tree are by definition proprietary at this time (or at least must be promulgated by some non-W3C group).

Admittedly there are some assertions floating around in the text, but since there is no DOM API for parsing to which they could apply, there doesn't seem to be anything that could be tested for purposes of saying "yes, this conforms to the DOM L2 spec" ... or "no, this doesn't, because it doesn't return the right results from this specific API".

Agreed that malformed HTML is a horrific problem, which was a significant motivator for XML! :-)

XML was specified so that conformant "processors" can be defined; but DOM (L1 or L2) can't satisfy such requirements without a suitable API. Given such a parsing API, including controls over which of the various DOM options are desired in the output tree (entities and refs: just say no!), THEN one ought to expect isomorphic DOM trees to get produced by different implementations ... for DOM implementations that hook up to XML processors.

But without such an API, it can't ever happen, and the answer to Mark's question is that he's expecting too much out of DOM (L1 or L2; HTML or XML).

- Dave

p.s. As you know, this has long been one of my pet issues: the DOM API is incomplete for what the marketplace perceives its role to be.

----- Original Message -----
From: "Lauren Wood" <lauren@sqwest.bc.ca>
To: <www-dom@w3.org>
Sent: Friday, 1 September, 2000 9:19 AM
Subject: Re: Difference in DOM's

> On 1 Sep 2000, at 11:28, Mark Brennan wrote:
>
> > Say we have two different implementations of the org.w3c.dom one being
> > openXML's and another being some other package out there. Now when we parse
> > an HTML file and produce a DOM out of it we are not assured that the two DOM
> > representations will have the same structured tree because of differences in
> > parsing and malformed html. Wasn't the whole idea of the DOM to keep things
> > structured. shouldn't these two parsers create the exact same DOM?
>
> If the HTML is valid, you will get the same trees. Given the number
> of ways HTML can be invalid, there's no way of specifying the DOM
> to make all processors give you the same trees under any
> circumstances. So you'll have to clean up the HTML first, e.g., by
> running it through an HTML clean-up script.
>
> Lauren
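To make the "isomorphic trees" point concrete, here is a rough sketch of what such a comparison might look like in Java. Everything in it sits outside DOM L1/L2 and is an assumption on my part: the parser is obtained through JAXP's DocumentBuilderFactory (a Java platform API, not a W3C one), the comparison relies on Node.isEqualNode (which only arrived with DOM Level 3), and the class name TreeCompare and its two methods are made up for illustration. The factory settings stand in for the "controls over which DOM options are desired" mentioned above.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper: not part of any DOM specification.
public class TreeCompare {

    // Parse XML text into a DOM tree with the tree-shaping options pinned down.
    // The parsing entry point is JAXP, which is outside the W3C DOM API.
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.setExpandEntityReferences(true); // no EntityReference nodes in the tree
            factory.setCoalescing(true);             // CDATA sections become plain text nodes
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            doc.normalize();                         // merge adjacent text nodes
            return doc;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    // True when both inputs yield structurally equal trees, as judged by
    // Node.isEqualNode (DOM Level 3): same node types, names, values,
    // attributes, and children in the same order.
    static boolean sameTree(String xmlA, String xmlB) {
        return parse(xmlA).isEqualNode(parse(xmlB));
    }
}
```

With both sides parsed under the same settings, sameTree treats attribute order as irrelevant but still sees whitespace-only text nodes as a difference. Comparing trees from two different implementations would mean building each side with its own parser, and that is exactly the step the DOM itself does not specify.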
Received on Friday, 1 September 2000 12:45:22 UTC