DOM2 HTML: HTML WG Last Call Remarks

Steven Pemberton, for the HTML WG

Please note that this is a group response, and not a personal one, so please address all replies to the group.


In general the HTML WG is unhappy with the idea of a special DOM for XHTML. It would rather use generic XML mechanisms wherever possible.

The working group recognises the advantages of a DOM for a markup language, in that it offers strong type checking for the structures being manipulated, and recognises that some extra DOM functionality beyond the pure XML DOM is occasionally desirable, for instance for manipulating computed values. However, the group has striven from the beginning to utilise generic XML technologies as much as possible, and would prefer to see the W3C moving towards more generic solutions.

To this end, we would like to see the text altered to make it clear that the DOM2 HTML only applies to HTML 4 and XHTML 1.0. For instance in 1.6.3 it mentions "XHTML 1.0 or above" -- this should just read "XHTML 1.0"; in 1.6.1 it mentions "future versions of XHTML". Another example is possibly in 1.5, for method 'Open' of interface HTMLDocument, where it says "The following methods may be deprecated at some point", which suggests future versions.

XHTML is an extensible family of languages, and so we find it difficult to see how the DOM2 HTML can be applied to future versions of XHTML without using a different approach. In particular, most of the HTML DOM is just the schema (or DTD) written in a different way. We would encourage work on linking the schema and the DOM, so that convenience functions can simply be inferred from the schema.
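To illustrate what inferring convenience functions from the schema could look like, here is a minimal sketch in Python using the standard xml.dom.minidom module. The SCHEMA table is a hypothetical, hand-written stand-in for attribute declarations that would in practice be read from a DTD or schema, and the names make_wrapper and wrap are ours, not part of any DOM specification.

```python
from xml.dom.minidom import parseString

# Hypothetical stand-in for attribute declarations that would normally be
# read from a DTD or schema: element name -> declared attribute names.
SCHEMA = {
    "img": ["src", "alt"],
    "a": ["href"],
}


def make_wrapper(tag, attributes):
    """Build a wrapper class whose properties mirror the declared attributes."""

    def make_property(attr):
        def getter(self):
            return self.node.getAttribute(attr)

        def setter(self, value):
            self.node.setAttribute(attr, value)

        return property(getter, setter)

    # Each wrapper only stores the generic DOM node it decorates.
    namespace = {"__init__": lambda self, node: setattr(self, "node", node)}
    for attr in attributes:
        namespace[attr] = make_property(attr)
    return type(tag.capitalize() + "Element", (object,), namespace)


# One wrapper class per element declared in the schema table.
WRAPPERS = {tag: make_wrapper(tag, attrs) for tag, attrs in SCHEMA.items()}


def wrap(node):
    """Wrap a generic DOM element in the class inferred for its tag name."""
    return WRAPPERS[node.tagName](node)


doc = parseString('<p><img src="a.png" alt="A"/></p>')
img = wrap(doc.getElementsByTagName("img")[0])
img.alt = "Logo"  # writes through to the underlying generic DOM node
```

Under this approach a new member of the XHTML family would only need a new schema table; no per-language interface definitions would have to be written by hand.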


We would like some text explaining the relationship between the use of the DOM and the relevant DTD for the document in question, and what the processing consequences are when generating elements that are not valid for the current document. In particular we would like to see some explanation of "The text is parsed into the document's structure model" in HTMLDocument.write and writeln.

Technical issues

Like HTML 4, XHTML 1.0 has three DTDs; section 1.1 seems to suggest otherwise ("the XHTML 1.0 DTD").

Please refer to HTML 4 (as a generic) or HTML 4.01 (as a particular); HTML 4.0 has been superseded by HTML 4.01. Please use the HTML 4.01 recommendation as the reference.

Mixture of semantics: name and id. The 'name' attribute has zero semantics in XHTML. So HTMLCollection.namedItem should only search for id attributes in XHTML, and ignore 'name' attributes. For XHTML, HTMLDocument.getElementsByName should only return form controls with matching name.
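The behaviour requested above can be sketched as follows, again in Python with xml.dom.minidom. This is only an illustration of our reading, not specification text: the is_xhtml flag, the function names and the FORM_CONTROLS set are assumptions made for the example, and the HTML 4 branch is simplified in that it does not restrict the name lookup to particular element types.

```python
from xml.dom.minidom import parseString

# Assumed set of form-control element names, for illustration only.
FORM_CONTROLS = {"input", "select", "textarea", "button"}


def named_item(doc, key, is_xhtml):
    """Return the first element matching key, or None.

    id attributes are always searched first; name attributes are
    consulted only when the document is not XHTML.
    """
    elements = doc.getElementsByTagName("*")  # all elements, document order
    for el in elements:
        if el.getAttribute("id") == key:
            return el
    if not is_xhtml:
        for el in elements:
            if el.getAttribute("name") == key:
                return el
    return None


def get_elements_by_name(doc, name, is_xhtml):
    """Return elements whose name attribute matches.

    For XHTML only form controls are returned.
    """
    matches = [el for el in doc.getElementsByTagName("*")
               if el.getAttribute("name") == name]
    if is_xhtml:
        matches = [el for el in matches if el.tagName in FORM_CONTROLS]
    return matches


doc = parseString(
    '<body><a name="top"/><input name="top"/><p id="intro"/></body>'
)
```

With this sample document, named_item(doc, "top", is_xhtml=True) finds nothing, because "top" appears only in name attributes, while get_elements_by_name(doc, "top", is_xhtml=True) returns just the input element.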

Doubtful Convenience

We are not convinced that there is any convenience in certain methods:

HTMLDocument.anchors: all elements with an id are anchors in HTML 4 and XHTML; what is the convenience of only returning the <a> elements? Furthermore, since the name attribute has no semantics in XHTML, the returned set should always be empty for XHTML documents.

Since <object> is the recommended method for including images in a document, what is the convenience of HTMLDocument.images only returning <img> elements?
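To make the two points above concrete, here is a small Python sketch using xml.dom.minidom. It contrasts the collections as we understand them (anchors as <a> elements carrying a name attribute, images as <img> elements only) with the broader sets the group has in mind. All function names are ours and hypothetical, and treating every <object> as an image is a deliberate simplification.

```python
from xml.dom.minidom import parseString


def all_elements(doc):
    """Every element in the document, in document order."""
    return doc.getElementsByTagName("*")


def anchors(doc):
    """<a> elements with a name attribute only."""
    return [el for el in all_elements(doc)
            if el.tagName == "a" and el.hasAttribute("name")]


def link_targets(doc):
    """Every element that can be addressed through its id."""
    return [el for el in all_elements(doc) if el.hasAttribute("id")]


def images(doc):
    """<img> elements only."""
    return [el for el in all_elements(doc) if el.tagName == "img"]


def images_and_objects(doc):
    """<img> and <object> elements; assumes each <object> embeds an image."""
    return [el for el in all_elements(doc) if el.tagName in ("img", "object")]


doc = parseString(
    '<body><a name="top"/><h1 id="title"/>'
    '<img src="a.png"/><object data="b.png"/></body>'
)
```

Each collection is a one-line filter over the generic element list, which is the reason we doubt that the dedicated attributes add much convenience.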

Textual issue

1.6.2 suggests that there is some general naming technique applied, and yet it seems only to apply to htmlFor, and not, for instance, to Element.className, which according to 1.6.2 should be called htmlClass.