Re: getElementsByTagName() and case-sensitivity for HTML documents

Thanks for all the replies to this query.

Tom and Bob: I do understand that XML documents are always
case-sensitive, but I'm interested in HTML documents here.  I said this
in my subject line, but forgot to mention it in the body of my message.
Sorry for the confusion.

David Brownell wrote:

> In an HTML dom, nothing is case sensitive.
> In an XML dom, everything is.
> 
> It says so somewhere, I forget where ... :)

The DOM Level 2 core document says this:

    1.1.7. String comparisons in the DOM The DOM has many interfaces that
	   imply string matching. HTML processors generally assume an
	   uppercase (less often, lowercase) normalization of names for such
	   things as elements [p.98] , while XML is explicitly case
	   sensitive. For the purposes of the DOM, string matching is
	   performed purely by binary comparison [p.99] of the 16-bit units
	   [p.97] of the DOMString [p.17] . In addition, the DOM assumes
	   that any case normalizations take place in the processor, before
	   the DOM structures are built.

But this isn't enough.  It just says that the processor normalizes all
HTML element names to be uppercase or lowercase.  This doesn't help me
know what I should pass to getElementsByTagName().  The description of
that method doesn't say that it normalizes tag names, and the there is
no way to find out if my processor normalizes to uppercase or lowercase,
so what do I do?  

In practice, I know that for HTML documents I can use tag names with any
capitalization and the getElementsByTagName() methods will work okay.
But I don't think the spec actually guarantees this anywhere.  (At least
it doesn't do it in the obvious place, which would be in the desrciption
of the method!)

   David  Flanagan

Received on Monday, 30 July 2001 14:03:53 UTC