Re: coping with overlapping elements in the DOM

> One of the big problems in trying to come up with a reasonable 
> specification for the DOM is trying to figure out how much we
> should do to cope with broken HTML documents. Obviously
> seriously broken documents will cause so many problems
> that we just don't want to get into, but there are some 
> classes of common mistakes that we can maybe allow.

I would vote for indicating an error to the user/author.  Broken HTML should
be fixed in the first place.

My browser copes with broken HTML like this:

> One of these classes of mistakes is overlapping elements, 
> of the form
> <P><B>This is <EM> not </B> a good idea</EM></P>
                         ^
                         |

At this point, you have an end tag which does not match the current element,
but that element's start tag is on the stack (this is a stack-based parser).
The parser closes every open element, innermost first, until the element named
by the end tag has been closed.  Extraneous close tags (those with no matching
open element on the stack) are ignored.  Hence my solution would be
equivalent to:

<P><B>This is <EM> not </EM></B> a good idea</P>

This method has the advantage of not requiring the parser to look ahead.
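
As a rough illustration (not the browser's actual code), the rule above can
be sketched in Python.  The function name repair_overlaps and the regular
expression used to find tags are assumptions made for this example; the
sketch ignores comments, empty elements such as <BR>, and optional end tags,
and it does not close elements still open at the end of the input.

```python
import re

# Assumed, simplified tag pattern: start or end tag, attributes skipped.
TAG_RE = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)[^>]*>")

def repair_overlaps(html):
    """Rewrite html so that overlapping elements are closed in stack order."""
    stack = []  # names of currently open elements, innermost last
    out = []
    pos = 0
    for m in TAG_RE.finditer(html):
        out.append(html[pos:m.start()])  # text between tags passes through
        pos = m.end()
        is_end_tag = m.group(1) == "/"
        name = m.group(2).upper()  # element names compared case-insensitively
        if not is_end_tag:
            stack.append(name)
            out.append(m.group(0))  # start tag is copied unchanged
        elif name in stack:
            # Close every open element down to and including the match.
            while True:
                top = stack.pop()
                out.append("</%s>" % top)
                if top == name:
                    break
        # Otherwise the end tag is extraneous and is dropped.
    out.append(html[pos:])
    return "".join(out)

print(repair_overlaps("<P><B>This is <EM> not </B> a good idea</EM></P>"))
# prints: <P><B>This is <EM> not </EM></B> a good idea</P>
```

Only the stack of open elements is consulted when an end tag arrives, so the
input is processed in a single pass with no lookahead.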

To an author who is testing their document, hopefully this strategy would
reveal that there is a problem with the document (because something they
thought would be italicised isn't).

Hope that helps,
Steve Ball

Received on Wednesday, 6 August 1997 19:48:13 UTC