- From: Ray Whitmer <RAY@corel.com>
- Date: Wed, 06 Aug 1997 10:29:28 -0600
- To: www-dom@w3.org
- Cc: HANKD@corel.com, RODS@corel.com, VERNON@corel.com
It is not clear to me what your option 1 would do -- whether it continues parsing as though no error occurred or aborts, and if it continues, whether the result is made good by ignoring the bad information or the resulting DOM is badly formed. I am strongly against anything that produces a poorly formed (overlapping) object model in the DOM.

FWIW, the example fix-up did not change the parent. It eliminated one parent in a case where a tag had multiple parents (one at the start and one at the end). I am also against deprecating any fixup layer, which just increases the unpredictability.

The alternatives I see are as follows:

1. Strip the bad tags out entirely, sending a strong message that bad HTML will not be tolerated. But much HTML may be outside the control of the one using it.

2. Reject the entire HTML document once an error is discovered, again sending a strong message. A separate optional utility should be available to fix up broken HTML.

3. Allow the implementation to incorporate the nice fixup capabilities.

I think 3 is good, and does not encourage the creation of broken HTML. A DOM- or DTD-based HTML/XML editor should never save out broken HTML, so someone working in that environment should never have a problem. HTML from other sources is generally outside the control of the one using the DOM, so it will have to be fixed up at some point, and the fixup should be as painless as possible.

Ray Whitmer
ray@corel.com

>>> Lauren Wood <lauren@sqwest.bc.ca> 08/05/97 04:45pm >>>
One of the big problems in trying to come up with a reasonable specification for the DOM is trying to figure out how much we should do to cope with broken HTML documents. Obviously, seriously broken documents will cause so many problems that we just don't want to get into them, but there are some classes of common mistakes that we can maybe allow.
One of these classes of mistakes is overlapping elements, of the form

    <P><B>This is <EM> not </B> a good idea</EM></P>

We are thinking of defining nodes that would effectively change the above example into

    <P><B>This is <EM> not </EM></B><EM> a good idea</EM></P>

This does have effects on style sheets and other operations that refer to the parent element, since the first EM element has a different parent in the two examples. Since we don't really want to encourage people to write broken documents, there is also the problem of whether we should do anything for overlapping elements at all.

The choices are:

1) don't do anything for overlapping elements
2) do something and deprecate it immediately, so it will be in level one but not level two
3) put it in without deprecating.

The DOM WG would like feedback on this issue. Which option do you think the best?

thanks,

Lauren
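
For illustration only, the rewrite of the overlapping example can be sketched in Python. This is not the algorithm the DOM WG was considering; the function name fix_overlaps and the simplified tokenizer (no attributes, comments, or case folding of tag names) are assumptions made for this sketch. When an end tag closes an element that is not the innermost open one, the sketch closes the inner elements first, closes the target, and then reopens the inner elements.

```python
import re

# Hypothetical, simplified tokenizer: a bare start tag, a bare end tag,
# or a run of text. Attributes, comments and stray "<" are not handled.
TOKEN = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)>|([^<]+)")


def fix_overlaps(html: str) -> str:
    """Return html with overlapping elements rewritten as nested ones."""
    out = []        # output fragments
    open_tags = []  # stack of currently open element names
    for closing, name, text in TOKEN.findall(html):
        if text:
            out.append(text)
        elif not closing:
            open_tags.append(name)
            out.append(f"<{name}>")
        elif name in open_tags:
            # Position of the most recently opened element with this name.
            idx = len(open_tags) - 1 - open_tags[::-1].index(name)
            inner = open_tags[idx + 1:]
            # Close everything opened inside the target, innermost first.
            for tag in reversed(inner):
                out.append(f"</{tag}>")
            out.append(f"</{name}>")
            del open_tags[idx:]
            # Reopen the inner elements so the following content keeps them.
            for tag in inner:
                open_tags.append(tag)
                out.append(f"<{tag}>")
        # An end tag with no matching start tag is silently dropped.
    # Close anything still open at the end of the input.
    for tag in reversed(open_tags):
        out.append(f"</{tag}>")
    return "".join(out)


print(fix_overlaps("<P><B>This is <EM> not </B> a good idea</EM></P>"))
```

Input that is already well nested passes through unchanged, so a fixup layer of this kind only alters documents that were broken to begin with.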
Received on Wednesday, 6 August 1997 17:53:40 UTC