Re: HTML normalization question wrt DOM

Actually, the rule is that the document must be valid HTML, and valid HTML
is always properly nested. If it isn't, I believe it is an error (XML very
clearly must nest correctly, so there's no problem there). If the markup
doesn't nest right, the browser is at fault for interpreting it
incorrectly. Netscape and Microsoft don't worry about nesting order, but the
spec does.
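
To make "properly nested" concrete, here is a rough sketch of a nesting
check, assuming Python's standard html.parser module. The names
NestingChecker and find_misnesting are made up for illustration, and it
ignores HTML's optional end tags (an unclosed <p> or <li> would be reported
too), so treat it as a demonstration rather than a validator.

```python
from html.parser import HTMLParser

# Elements that never take an end tag, so they are not tracked on the stack.
# (Illustrative subset, not the full list from the HTML spec.)
VOID_ELEMENTS = {"br", "hr", "img", "input", "meta", "link"}


class NestingChecker(HTMLParser):
    """Records end tags that do not close the most recently opened element."""

    def __init__(self):
        super().__init__()
        self.open_elements = []  # stack of currently open tag names
        self.problems = []       # (end_tag_seen, tag_that_was_expected)

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_ELEMENTS:
            self.open_elements.append(tag)

    def handle_startendtag(self, tag, attrs):
        # Self-closing syntax such as <br/> opens and closes nothing.
        pass

    def handle_endtag(self, tag):
        expected = self.open_elements[-1] if self.open_elements else None
        if tag == expected:
            self.open_elements.pop()
            return
        self.problems.append((tag, expected))
        # Recover so that one mistake is reported only once: if the tag is
        # open somewhere deeper in the stack, drop its innermost occurrence.
        if tag in self.open_elements:
            reversed_pos = self.open_elements[::-1].index(tag)
            del self.open_elements[len(self.open_elements) - 1 - reversed_pos]


def find_misnesting(markup):
    """Return a list of (end_tag, expected_tag) pairs for bad nesting."""
    checker = NestingChecker()
    checker.feed(markup)
    checker.close()
    return checker.problems


if __name__ == "__main__":
    print(find_misnesting("<p><b>one <i>two </b>three </i> four</p>"))
```

Run against the example from the quoted message below, this reports a
single problem: the </b> arrives while <i> is still the innermost open
element. What a client should do after detecting that is a separate
question, and this sketch takes no position on it.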

David Mott wrote:

> Vidur,
>
> I have not seen any document describe a common way to normalize HTML
> that is poorly formed w.r.t. the DOM. This seems important if all DHTML
> clients are to respond to JavaScript in the same way.
>
> For instance,
>
> <p><b>one <i>two </b>three </i> four</p>
>
> does not produce a valid DOM tree. I can see two ways of representing
> this:
>               <p>
>                |
>    -------------------------
>    |           |           |
>   <b>         <i>        four
>    |           |
>  -----       three
>  |   |
> one <i>
>      |
>     two
>
> This gives proper style inheritance, but JavaScript access to <i> will
> not be correct unless <i> remembers it is in multiple parts of the tree.
>
> On the other hand:
>
>               <p>
>                |
>    -------------------------
>    |                       |
>   <b>                    four
>    |
>  ---------
>  |       |
> one     <i>
>          |
>   ---------------
>   |      |      |
>  two    </b>  three
>
> Gives proper style inheritance and proper JavaScript access, but results
> in nodes under <b> that aren't really bold, and introduces end-tags to
> the hierarchy, as well as bounding box calculation complexities.
>
> Is this a question for the DOM working group? Do all clients need to
> build a normalized DOM tree the same way? Or should clients do whatever
> they think makes most sense, as long as the JavaScript behavior is the
> same? That is, getting the inner/outer text works as expected, changing
> the text color works as expected, etc.
>
> David
>
> --
> David Mott, Network Computer Inc.
> mott@nc.com    http://www.nc.com

Received on Tuesday, 13 January 1998 15:38:29 UTC