Re: DOM idiosyncracies and made-up tag names from Philip Taylor on 2009-02-19 (public-html@w3.org from February 2009)

From: Philip Taylor <pjt47@cam.ac.uk>
Date: Thu, 19 Feb 2009 15:05:36 +0000
To: "Dailey, David P." <david.dailey@sru.edu>
CC: public-html@w3.org
Message-ID: <499D7540.6090004@cam.ac.uk>

Dailey, David P. wrote:
> [...] 
> 
> I cobbled together the little experiment [2] with code sort of
> like this:
> 
> <mySelect>
> <myO>One</myO>
> <myO>Two</myO>
> <myO>Three</myO>
> <myO>Four</myO>
> </mySelect>
> 
>  All browsers seem to be able to "correctly" find var
> MS=document.getElementsByTagName("mySelect").
> 
> All but IE(7) seem to be able to find the four <myO>'s inside the
> <mySelect>.

That's because IE parses unknown tags as empty elements. Your markup is 
parsed into a "MYSELECT" element with no children, followed by a sibling 
element named "MYO", then a sibling text node "One", then a sibling 
element named "/MYO" (the slash is part of the element name), etc.
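
To make that concrete, here is a toy model of the flat tree described above. It is only a sketch: parseFlatLikeOldIE is a made-up name, the regex tokenizer is nothing like IE's real parser, and it assumes attribute-free tags with no whitespace between them (IE's whitespace handling is a separate topic).

```javascript
// Toy model only: mimics the flat tree described above, not IE's real parser.
// parseFlatLikeOldIE is a hypothetical helper name, used for illustration.
function parseFlatLikeOldIE(markup) {
  // Split the input into tags ("<...>") and the runs of text between them.
  const tokens = markup.match(/<[^>]+>|[^<]+/g) || [];
  return tokens.map((token) => {
    if (token.startsWith("<")) {
      // Every tag, including an end tag, becomes an empty element, and the
      // slash of an end tag stays in the element name.
      return { type: "element", name: token.slice(1, -1).toUpperCase(), children: [] };
    }
    return { type: "text", text: token };
  });
}

const flat = parseFlatLikeOldIE("<mySelect><myO>One</myO></mySelect>");
const flatNames = flat.map((n) => (n.type === "element" ? n.name : "#text"));
console.log(flatNames.join(" "));
// prints "MYSELECT MYO #text /MYO /MYSELECT"
```

Every node is a sibling and no element has children, which is why a search inside the MYSELECT element comes back empty.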

Compare the 'DOM view' of 
http://software.hixie.ch/utilities/js/live-dom-viewer/saved/14 in IE vs 
any other browser.

(The exception to this is that if you write 
<script>document.createElement('mySelect')</script> before you use 
<mySelect>, then IE will parse it like a proper element with nested 
children, which is closer to the behaviour of other browsers. The other 
exception is that if you write <body xmlns:foo> before you use 
<foo:whatever>, then that's also parsed like a proper element with nested 
children. And then there are a few weirder exceptions.)
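
A minimal sketch of the first workaround, assuming you want it for several made-up names at once. The helper name registerUnknownElements and its doc parameter are hypothetical, and registering myO as well as mySelect is my assumption about what the test page would need. It is written in old-style JavaScript on purpose, since it has to run in IE.

```javascript
// Sketch of the createElement workaround mentioned above. The helper name
// registerUnknownElements and the doc parameter are hypothetical.
function registerUnknownElements(doc, tagNames) {
  for (var i = 0; i < tagNames.length; i++) {
    // The created element is thrown away; per the description above, the
    // call itself is what changes how IE later parses that tag name.
    doc.createElement(tagNames[i]);
  }
  return tagNames.length;
}

// In a page, this would sit in a <script> placed before the first <mySelect>.
// The guard lets the file load outside a browser without throwing.
if (typeof document !== "undefined") {
  registerUnknownElements(document, ["mySelect", "myO"]);
}
```

Passing the document in as a parameter is only there so the helper can be exercised with a stand-in object outside a browser.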

> All the others seem to agree (on odd numbers, however) for the number of
> children inside the artificial tag:  9.

Why do you consider that odd (other than in the non-zero-modulo-two 
sense of the word)? The <mySelect> contains four element children, plus 
five whitespace text children surrounding the <myO>s.
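
The arithmetic can be checked with a toy nested parser. This is a sketch under stated assumptions: parseNested is a made-up helper, it handles only well-formed tags without attributes, and it is not the HTML5 parsing algorithm. It does keep the newlines between tags as text nodes, which is the part that matters here.

```javascript
// Toy model only: a tiny tag/text tokenizer, not the HTML5 parsing algorithm.
// parseNested is a hypothetical helper; it assumes well-formed, attribute-free tags.
function parseNested(markup) {
  const root = { name: "#root", children: [] };
  const stack = [root];
  const tokens = markup.match(/<[^>]+>|[^<]+/g) || [];
  for (const token of tokens) {
    const parent = stack[stack.length - 1];
    if (token.startsWith("</")) {
      // An end tag closes the currently open element.
      if (stack.length > 1) stack.pop();
    } else if (token.startsWith("<")) {
      // A start tag opens a new element nested inside the current one.
      const element = { name: token.slice(1, -1), children: [] };
      parent.children.push(element);
      stack.push(element);
    } else {
      // Anything else is text, including whitespace-only runs.
      parent.children.push({ name: "#text", text: token });
    }
  }
  return root;
}

// The markup from the quoted message, one tag pair per line.
const markup = [
  "<mySelect>",
  "<myO>One</myO>",
  "<myO>Two</myO>",
  "<myO>Three</myO>",
  "<myO>Four</myO>",
  "</mySelect>",
].join("\n");

const mySelect = parseNested(markup).children[0];
const elementCount = mySelect.children.filter((c) => c.name !== "#text").length;
const whitespaceCount = mySelect.children.filter(
  (c) => c.name === "#text" && c.text.trim() === ""
).length;
console.log(mySelect.children.length, elementCount, whitespaceCount);
// prints "9 4 5"
```

In a browser the same numbers should come from childNodes.length (9) versus counting only the element children (4).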

> It is reassuring to me to see that the enumeration of objects in the
> document by the other browsers for [1] however still remains wildly
> divergent. Opera finds 134; IE 94; FF 148;  Chrome and Safari refuse
> even to sit at that particular table.
> 
> So I guess I might now repose the original question: is any of this an
> issue or not?

HTML5 currently defines precisely how to convert any sequence of 
characters into a DOM, including the handling of whitespace (even in odd 
locations like after the </html>). The parser test cases (currently at 
<http://code.google.com/p/html5lib/source/browse/trunk/testdata/tree-construction/>) 
should be testing all of those details. The idea is that UAs will 
eventually implement the parsing algorithm and pass all the tests, and 
then the differences you see between browsers should go away. So I don't 
see any issues that aren't already addressed.

> Cheers
> 
> David

-- 
Philip Taylor
pjt47@cam.ac.uk

Received on Thursday, 19 February 2009 15:06:11 UTC