Re: XHTML Considered Harmful

On Sun, 24 Jun 2001, Ian Hickson (that's me) wrote:
> On Mon, 25 Jun 2001, Arjun Ray wrote:
>>
>> You're forgetting the conformance requirements.
> [snip some ignorant comments]

Actually, I take that back. The XHTML1 Conformance Requirements are
pathetic and IMHO inappropriate.

# 3.2 User Agent Conformance
#
# A conforming user agent must meet all of the following criteria:
#
# 1. In order to be consistent with the XML 1.0 Recommendation [XML],
# the user agent must parse and evaluate an XHTML document for
# well-formedness. If the user agent claims to be a validating user
# agent, it must also validate documents against their referenced DTDs
# according to [XML].

Fair enough.

# 2. When the user agent claims to support facilities defined within
# this specification or required by this specification through
# normative reference, it must do so in ways consistent with the
# facilities' definition.

Sensible.

# 3. When a user agent processes an XHTML document as generic XML, it
# shall only recognize attributes of type ID (e.g. the id attribute on
# most XHTML elements) as fragment identifiers.

Seems logical.

# 4. If a user agent encounters an element it does not recognize, it
# must render the element's content.

Whoa there, buster. XHTML should not mention rendering rules. If CSS,
to pick an example almost at random, said "display: none", then the
unknown element had better not be rendered.
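
To make the clash concrete, here is a minimal sketch that models the
two layers as separate functions. Everything in it is hypothetical
(the function names, the tiny set of "known" elements, and the idea of
passing in a single computed 'display' value); it is not how any real
user agent is structured.

```python
def xhtml1_rule4(name, known):
    """Rule 4 as quoted: an unrecognized element's content must be rendered.

    Returns True for unknown elements and None when the rule is silent.
    """
    return True if name not in known else None


def css_display(display):
    """Style layer: content is rendered unless 'display' computes to 'none'."""
    return display != "none"


def layers_conflict(name, known, display):
    """True when rule 4 demands rendering that the style layer forbids."""
    demanded = xhtml1_rule4(name, known)
    return demanded is not None and demanded != css_display(display)


KNOWN = {"p", "div", "span"}  # illustrative only, not the XHTML1 element set
```

Calling layers_conflict("foo", KNOWN, "none") reports a conflict,
while the same unknown element with "block" does not.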


# 5. If a user agent encounters an attribute it does not recognize, it
# must ignore the entire attribute specification (i.e., the attribute
# and its value).

That's ok (what else would you do?).


# 6. If a user agent encounters an attribute value it doesn't
# recognize, it must use the default attribute value.

Fair enough. (If you get here you are non-validating, and anyway,
there are very few cases where this causes a problem since most
attributes in XHTML1 Strict are CDATA attributes.)


# 7. If it encounters an entity reference (other than one of the
# predefined entities) for which the User Agent has processed no
# declaration (which could happen if the declaration is in the
# external subset which the User Agent hasn't read), the entity
# reference should be rendered as the characters (starting with the
# ampersand and ending with the semi-colon) that make up the entity
# reference.

Hmm. That should be left up to either XML or the rendering specs.
XHTML is not the right protocol level to determine this.
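
As a rough illustration that the XML layer already has opinions here,
the sketch below feeds an undeclared entity reference to Python's
standard (expat-based, non-validating) parser. Note the assumption:
this document has no DOCTYPE at all, which is a simpler case than the
unread external subset that rule 7 talks about. The helper name
parse_outcome is made up for the example.

```python
import xml.etree.ElementTree as ET


def parse_outcome(source):
    """Return "ok" if the parser accepts the document, else "error"."""
    try:
        ET.fromstring(source)
    except ET.ParseError:
        # With no DTD, a reference to an undeclared entity is a
        # well-formedness error, so the parser never hands it onward.
        return "error"
    return "ok"


predefined = parse_outcome("<p>fish &amp; chips</p>")
undeclared = parse_outcome("<p>fish&nbsp;chips</p>")
```

In this case the parser rejects the document outright, so there is
nothing left for a vocabulary spec to render as literal characters.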


# 8. When rendering content, User Agents that encounter characters or
# character entity references that are recognized but not renderable
# should display the document in such a way that it is obvious to the
# user that normal rendering has not taken place.

Again, rendering rules should not be discussed by a vocabulary spec.


# 9. [...] In elements where the 'xml:space' attribute is set to
# 'preserve', the user agent must leave all whitespace characters
# intact (with the exception of leading and trailing whitespace
# characters, which should be removed). Otherwise, whitespace is
# handled according to the following rules:
#    * All whitespace surrounding block elements should be removed.

This clashes with CSS' rendering rules and should be out of scope for
a vocabulary spec.


#    * Comments are removed entirely and do not affect whitespace
# handling. One whitespace character on either side of a comment is
# treated as two white space characters.

Comments had better not be removed -- what about the DOM?
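
A quick sketch of why this matters to the DOM, using Python's
xml.dom.minidom as a stand-in for whatever DOM a user agent exposes.
The helper name child_kinds and the sample markup are invented for the
example.

```python
from xml.dom import Node, minidom

KIND_NAMES = {
    Node.ELEMENT_NODE: "element",
    Node.TEXT_NODE: "text",
    Node.COMMENT_NODE: "comment",
}


def child_kinds(source):
    """List the node kinds of the root element's children, in order."""
    doc = minidom.parseString(source)
    return [KIND_NAMES.get(child.nodeType, "other")
            for child in doc.documentElement.childNodes]


kinds = child_kinds("<p>before <!-- a note --> after</p>")
```

The comment shows up as its own node between two text nodes. A user
agent that removed comments before building the tree would hand
scripts a different document.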


#    * Leading and trailing whitespace inside a block element must be
# removed.

That should be up to the DOM and the rendering spec.


#    * Line feed characters within a block element must be converted
# into a space (except when the 'xml:space' attribute is set to 'preserve').

That should be up to the DOM and the rendering spec or the XML spec.


#    * A sequence of white space characters must be reduced to a
# single space character (except when the 'xml:space' attribute is set
# to 'preserve').

This clashes with CSS' rendering rules and should be out of scope for
a vocabulary spec.
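
For illustration, here is the quoted collapsing rule as a small
function. The function name and the sample strings are assumptions
made for the sketch; the character class is XML's whitespace set
(space, tab, CR, LF).

```python
import re

XML_WHITESPACE_RUN = re.compile(r"[ \t\r\n]+")


def collapse(text):
    """Reduce each run of whitespace characters to a single space."""
    return XML_WHITESPACE_RUN.sub(" ", text)


first = collapse("one  two\n   three")
second = collapse("one two three")
```

Both inputs collapse to the same string, so if the markup layer does
this first, a later style rule such as CSS "white-space: pre" has no
original spacing left to preserve.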


#    * With regard to rendition, the User Agent should render the
# content in a manner appropriate to the language in which the content
# is written.

This clashes with CSS' rendering rules and should be out of scope for
a vocabulary spec.


So yes, I actually agree. The conformance part of XHTML and the
compatibility part of XHTML are harmful and wrong. But ignoring
that... ;-)

-- 
Ian Hickson                                            )\     _. - ._.)   fL
Invited Expert, CSS Working Group                     /. `- '  (  `--'
The views expressed in this message are strictly      `- , ) -  > ) \
personal and not those of Netscape or Mozilla. ________ (.' \) (.' -' ______

Received on Monday, 25 June 2001 01:55:42 UTC