
HTML normalization question wrt DOM

From: David Mott <mott@nc.com>
Date: Thu, 8 Jan 1998 16:50:56 -0500 (EST)
Message-ID: <34B5494A.8503BA38@nc.com>
To: www-dom@w3.org
CC: "mott@nc.com" <mott@nc.com>, Alan Kaiser <alan@navio.com>

I have not seen any document describe a common way to normalize HTML
that is poorly formed w.r.t. the DOM. This seems important if all DHTML
clients are to respond to JavaScript in the same way.

For instance,

<p><b>one <i>two </b>three </i> four</p>

does not produce a valid DOM tree. I can see two ways of representing
it. The first splits <i> into two nodes:

              <p>
               |
   -------------------------
   |           |           |
  <b>         <i>        four
   |           |
 -----       three
 |   |
one <i>
     |
    two

This gives proper style inheritance, but JavaScript access to <i> will
not be correct unless <i> remembers it is in multiple parts of the tree.
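
As a rough illustration of this first representation, here is a minimal
Python sketch built on the standard html.parser module. The names Node,
SplitNormalizer, and normalize are hypothetical, and the rule it applies
(close every element still open inside the one being ended, then reopen
those elements as new nodes) is only one assumed way a client might do
the split, not an agreed normalization.

```python
from html.parser import HTMLParser


class Node:
    """Element node (tag set) or text node (tag is None)."""

    def __init__(self, tag=None, text=None):
        self.tag = tag
        self.text = text
        self.children = []

    def dump(self):
        # Compact one-line rendering, e.g. b['one ', i['two ']]
        if self.tag is None:
            return repr(self.text)
        inner = ", ".join(child.dump() for child in self.children)
        return f"{self.tag}[{inner}]"


class SplitNormalizer(HTMLParser):
    """Builds a tree, splitting elements that are closed out of order."""

    def __init__(self):
        super().__init__()
        self.root = Node("root")
        self.stack = [self.root]  # currently open elements

    def handle_starttag(self, tag, attrs):
        node = Node(tag)
        self.stack[-1].children.append(node)
        self.stack.append(node)

    def handle_data(self, data):
        self.stack[-1].children.append(Node(text=data))

    def handle_endtag(self, tag):
        open_tags = [node.tag for node in self.stack[1:]]
        if tag not in open_tags:
            return  # stray end tag with no open element: ignore it
        # Close everything opened inside the element being ended...
        closed_early = []
        while self.stack[-1].tag != tag:
            closed_early.append(self.stack.pop().tag)
        self.stack.pop()
        # ...then reopen those elements as fresh nodes (the "split").
        for name in reversed(closed_early):
            self.handle_starttag(name, [])


def normalize(html):
    parser = SplitNormalizer()
    parser.feed(html)
    parser.close()
    return parser.root


tree = normalize("<p><b>one <i>two </b>three </i> four</p>")
print(tree.dump())
```

In this sketch the two <i> nodes are unrelated objects, which is exactly
the script-access problem described above: nothing records that they came
from a single <i> element in the source.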

On the other hand:

              <p>
               |
   -------------------------
   |                       |
  <b>                    four
   |
 ---------
 |       |
one     <i>
         |
  ---------------
  |      |      |
 two    </b>  three

Gives proper style inheritance and proper JavaScript access, but results
in nodes under <b> that aren't really bold, and introduces end-tags to
the hierarchy, as well as bounding box calculation complexities.

Is this a question for the DOM working group? Do all clients need to
build a normalized DOM tree the same way? Or should clients do whatever
they think makes most sense, as long as the JavaScript behavior is the
same? That is, getting the inner/outer text works as expected, changing
the text color works as expected, etc.
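
To make the "same behavior" idea concrete, here is a small Python sketch
that writes both trees out by hand and checks that an inner-text query
returns the same string for each. The tuple encoding, the "/b" marker
node, and the inner_text helper are assumptions made for this
illustration only; they are not part of any DOM API.

```python
# Each element is (tag, children); plain strings are text nodes.

# First representation: <i> is split into two nodes.
split_tree = (
    "p",
    [("b", ["one ", ("i", ["two "])]), ("i", ["three "]), " four"],
)

# Second representation: one <i> under <b>, with the </b> end tag kept
# in the hierarchy as a marker node that carries no text.
nested_tree = (
    "p",
    [("b", ["one ", ("i", ["two ", ("/b", []), "three "])]), " four"],
)


def inner_text(node):
    """Concatenate all text beneath a node, in document order."""
    if isinstance(node, str):
        return node
    _tag, children = node
    return "".join(inner_text(child) for child in children)


print(inner_text(split_tree) == inner_text(nested_tree))
```

Text queries agree here even though the tree shapes differ. A query such
as "is this text bold?" would not agree without extra bookkeeping, since
"three" sits under <b> in the second tree only.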


David Mott, Network Computer Inc.
mott@nc.com    http://www.nc.com
Received on Tuesday, 13 January 1998 14:55:01 UTC
