W3C home > Mailing lists > Public > www-dom@w3.org > January to March 1998

HTML normalization question wrt DOM

From: David Mott <mott@nc.com>
Date: Thu, 8 Jan 1998 16:50:56 -0500 (EST)
Message-ID: <34B5494A.8503BA38@nc.com>
To: www-dom@w3.org
CC: "mott@nc.com" <mott@nc.com>, Alan Kaiser <alan@navio.com>
Vidur,

I have not seen any document describe a common way to normalize HTML
that is poorly formed w.r.t. the DOM. This seems important if all DHTML
clients are to respond to JavaScript in the same way.

For instance,

<p><b>one <i>two </b>three </i> four</p>

does not produce a valid DOM tree. I can see two ways of representing
this:
              <p>
               |
   -------------------------
   |           |           |
  <b>         <i>        four
   |           |
 -----       three
 |   |
one <i>
     |
    two


This gives proper style inheritance, but JavaScript access to <i> will
not be correct unless <i> remembers it is in multiple parts of the tree.

On the other hand:

              <p>
               |
   -------------------------
   |                       |
  <b>                    four
   |            
 ---------       
 |       |
one     <i>
         |
  ---------------
  |      |      |
 two    </b>  three


Gives proper style inheritance and proper JavaScript access, but results
in nodes under <b> that aren't really bold, and introduces end-tags to
the hierarchy, as well as bounding box calculation complexities.

Is this a question for the DOM working group? Do all clients need to
build a normalized DOM tree the same way? Or should clients do whatever
they think makes most sense, as long as the JavaScript behavior is the
same? Thst is, getting the inner/outer text works as expected, changing
the text color works as expected, etc.

David

-- 
David Mott, Network Computer Inc.
mott@nc.com    http://www.nc.com
Received on Tuesday, 13 January 1998 14:55:01 GMT

This archive was generated by hypermail 2.2.0+W3C-0.50 : Friday, 22 June 2012 06:13:44 GMT