Re: HTML WG Last Call Remarks to DOM 2 HTML

On Fri, 2002-01-18 at 10:59, Steven Pemberton wrote:
> General
> In general the HTML WG is unhappy with the idea of a special DOM for
> XHTML. It would rather use generic XML mechanisms wherever possible.
>
> The working group recognises the advantages of a DOM for a markup
> language, in that it offers strong type checking for the structures
> being manipulated, and recognises the desirability of some occasional
> extra DOM functionality than the pure XML DOM, for instance for
> manipulating computed values. However, the group has striven from the
> beginning to utilise generic XML technologies as much as possible, and
> would prefer to see the W3C moving towards more generic solutions.

XHTML 1.0 does not always use pure XML technologies such as XLink,
XForms, or XML Base. Unfortunately, this forces us to include some
"special" cases in the DOM API, including the DOM Level 3 Core
itself. The DOM Working Group agrees that the future of a DOM API for
XHTML is not inside the DOM Level 2 HTML API. On the other hand, XHTML
1.0 is still close to HTML 4.01. To facilitate the
transition between HTML 4.01 and the XHTML world, users should be
able to address both kinds of document using the same API. Also, some of
the functionality provided by DOM Level 2 HTML is not accessible through
the DOM Core API because it is specific to HTML/XHTML. Examples are the
methods HTMLFormElement.submit(), or HTMLAnchorElement.focus(). Exposing
the XHTML document through the DOM Core does not give access to the
dynamic information such as HTMLInputElement.value either. We hope that
W3C Working Groups will base their work on technologies such as XLink
and/or XForms in the future. That will certainly make it easier to
integrate XML into end-user applications.
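
A minimal sketch of the Core-versus-HTML distinction described above,
assuming Python's xml.dom.minidom as a stand-in for a Core-only DOM
implementation. The markup and variable names are invented for
illustration and do not come from the DOM Level 2 HTML draft:

```python
from xml.dom.minidom import parseString

# Hypothetical XHTML fragment, invented for this illustration.
XHTML = (
    '<html xmlns="http://www.w3.org/1999/xhtml"><body>'
    '<form action="/search">'
    '<input type="text" name="q" value="initial"/>'
    '</form>'
    '</body></html>'
)

doc = parseString(XHTML)
form = doc.getElementsByTagName("form")[0]
field = doc.getElementsByTagName("input")[0]

# DOM Core exposes the markup only: the attribute as written in the
# source, not whatever a user has typed into the control since then.
initial_value = field.getAttribute("value")

# A Core-only implementation has no notion of submitting a form or
# focusing a control; those operations belong to the HTML-specific
# interfaces (HTMLFormElement.submit(), HTMLAnchorElement.focus()).
has_submit = hasattr(form, "submit")
has_focus = hasattr(field, "focus")

print(initial_value, has_submit, has_focus)
```

In a browser-style HTML DOM, the same input element would additionally
carry a live value property, which is the dynamic information the Core
API cannot reach.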

> To this end, we would like to see the text altered to make it clear
> that the DOM2 HTML only applies to HTML 4 and XHTML 1.0. For instance
> in 1.6.3 it mentions "XHTML 1.0 or above" -- this should just read
> "XHTML 1.0"; in 1.6.1 it mentions "future versions of XHTML". Another
> example is possibly in 1.5, for method '|Open|' of interface
> |HTMLDocument|, where it says "The following methods may be deprecated
> at some point", which suggests future versions.

Changes have been made in sections 1.5, 1.6.1, and 1.6.3 to make clear
that we are only based on HTML 4.01 and XHTML 1.0.

> XHTML is an extensible family of languages, and so we find it
> difficult to see how the DOM2 HTML can be applied to future versions
> of XHTML without using a different approach. In particular, most of
> the HTML DOM is just the schema (or DTD) written in a different
> way. We would encourage work to investigate somehow linking the schema
> and the DOM so that convenience functions can just be inferred from
> the schema.

The XML Schema Working Group requested in August 2000 that the DOM be
extensible in order to expose the PSVI. There are discussions between the
XML Schema and DOM WGs to address this issue. A significant number of
W3C Members would like this work to happen but, unfortunately, it is not
clear whether they are willing to provide enough resources to do it. It is
still an open issue within the DOM WG. Being able to infer convenience
functions from the XML Schema is a more advanced topic. As of today, the
WG is not planning to work on it and it is not clear if a technical
solution is possible and in the scope of the W3C.

> Clarifications
> We would like some text explaining the relationship between the use of
> the DOM and the relevant DTD for the document in question, and what
> the processing consequences are when generating elements that are not
> valid for the current document. In particular we would like to see
> some explanation of "The text is parsed into the document's structure
> model" in |HTMLDocument.write| and |writeln|.

The relations between the DTD of the document and its DOM representation
are not defined in the DOM API. In other words, neither the DOM Core API
nor the DOM HTML API is defined to be a validating API, and therefore
both permit any kind of child insertion at any point. Validation is in the
scope of the DOM Level 3 Abstract Schemas specification. The sentence
"The text is parsed into the document's structure model" has been
removed from the document. We will encourage users to use the DOM Level
3 Load and Save API in the future.
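
To illustrate the non-validating behaviour, here is a small sketch that
again assumes Python's xml.dom.minidom as a Core-only DOM. The document
content is made up for the example. The XHTML 1.0 DTDs do not allow a
td element inside a p element, yet the insertion goes through:

```python
from xml.dom.minidom import parseString

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical document, invented for this illustration.
doc = parseString(
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    '<head><title>t</title></head>'
    '<body><p>text</p></body>'
    '</html>'
)

para = doc.getElementsByTagName("p")[0]

# Build a table cell and attach it directly to the paragraph. No DTD is
# consulted, so the DOM accepts a structure that a validator would reject.
cell = doc.createElementNS(XHTML_NS, "td")
cell.appendChild(doc.createTextNode("stray cell"))
para.appendChild(cell)

serialized = para.toxml()
print(serialized)
```

Any validity checking has to happen outside the DOM calls themselves,
for example by running the serialized result through a validating
parser.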

> Technical issues
> XHTML 1.0 has 3 DTDs too: section 1.1 seems to suggest otherwise ("the
> XHTML 1.0 DTD").

Fixed.
 
> Please refer to HTML 4 (as a generic) or HTML 4.01 (as a particular);
> HTML 4.0 has been superceded by HTML 4.01. Please use the HTML 4.01
> recommendation <http://www.w3.org/TR/html401/> as the reference.

Fixed, we are now using HTML 4.01.
 
> Mixture of semantics: |name| and |id|. The '|name|' attribute has zero
> semantics in XHTML. So |HTMLCollection.namedItem| should /only/ search
> for |id| attributes in XHTML, and ignore '|name|' attributes. For
> XHTML, |HTMLDocument.getElementsByName| should only return form
> controls with matching |name|.

Fixed. We added wording to mark the difference between HTML and XHTML
for those methods.
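
The difference the HTML WG asked for can be sketched as follows. This
is only an illustration of the requested lookup order, not the wording
added to the specification: named_item and its is_xhtml flag are
hypothetical, and xml.dom.minidom is assumed as the underlying DOM.

```python
from xml.dom.minidom import parseString

def named_item(elements, key, is_xhtml):
    """Hypothetical helper sketching an id-then-name lookup."""
    # Both HTML and XHTML: look for a matching id first.
    for el in elements:
        if el.getAttribute("id") == key:
            return el
    # XHTML: the name attribute carries no semantics, so stop here.
    if is_xhtml:
        return None
    # HTML: fall back to a matching name attribute.
    for el in elements:
        if el.getAttribute("name") == key:
            return el
    return None

# Hypothetical markup, invented for this illustration.
doc = parseString('<body><a name="top">x</a><a id="end">y</a></body>')
anchors = doc.getElementsByTagName("a")

html_hit = named_item(anchors, "top", is_xhtml=False)
xhtml_hit = named_item(anchors, "top", is_xhtml=True)
id_hit = named_item(anchors, "end", is_xhtml=True)
```

With this sketch, the anchor that only has a name is found in HTML mode
and ignored in XHTML mode, while the anchor with an id is found in both.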
 
> Doubtful Convenience
> We are not convinced that there is any convenience in certain methods:
> |HTMLDocument.anchors|: /all/ elements with an |id| are anchors in
> HTML 4 and XHTML; what is the convenience of only returning the |<a>|
> elements? Furthermore, since the |name| attribute has no semantics in
> XHTML, the returned set should /always/ be empty for XHTML documents.

Even if the name attribute has no semantics in XHTML 1.0, it is still
part of the language. HTMLDocument.anchors is kept in the DOM Level 2
HTML API for backward compatibility reasons. We added the following
sentence:

"The attribute name was deprecated in XHTML 1.0, therefore it is
recommended not to use this attribute for XHTML 1.0 documents. Users
should prefer the iterator mechanisms provided by DOM Level 2 Traversal
and Range instead."

> Since |<object>| is the recommended method for including images in a
> document, what is the convenience of |HTMLDocument.images| only
> returning |<img>| elements?

The reason is backward compatibility, since this attribute is supported
by both IE3.0 and NS3.0. However, in response to the point raised by the
HTML WG, the following was added to the description:

"Note: As suggested by [HTML4.0], to include images, authors may use the
OBJECT element or the IMG element. Therefore, it is recommended not to
use this attribute to find the images in the document but
getElementsByTagName with HTML 4.0 or getElementsByTagNameNS with XHTML
1.0."
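
A short sketch of the approach recommended in that note, assuming
Python's xml.dom.minidom as the DOM implementation. The find_images
helper and the sample markup are hypothetical:

```python
from xml.dom.minidom import parseString

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical XHTML document, invented for this illustration.
doc = parseString(
    '<html xmlns="http://www.w3.org/1999/xhtml"><body>'
    '<img src="a.png" alt="a"/>'
    '<object data="b.png" type="image/png">b</object>'
    '</body></html>'
)

def find_images(document):
    """Collect img and object elements in the XHTML namespace."""
    found = []
    for local_name in ("img", "object"):
        found.extend(document.getElementsByTagNameNS(XHTML_NS, local_name))
    return found

# img carries its location in src, object carries it in data.
sources = [
    el.getAttribute("src") or el.getAttribute("data")
    for el in find_images(doc)
]
print(sources)
```

Since object can embed content other than images, a real caller would
probably also inspect the type attribute before treating a match as an
image. That filtering is left out of this sketch.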
 
> Textual issue
> 1.6.2 suggests that there is some general naming technique applied,
> and yet it seems only to apply to |htmlFor|, and not, for instance, to
> |Element.className|, which according to 1.6.2 should be called
> |htmlClass|.

Again, for backward compatibility with IE3 and NS3, the attribute cannot
be renamed.


Regards,
Philippe,
for the DOM WG.

Received on Monday, 1 April 2002 15:10:02 UTC