XHTML Media Types

Hello all,

while reviewing the updated <http://www.w3.org/TR/xhtml-media-types/>,
I noticed the following errors and omissions:

| A.2. Elements that can never have content
| [...]
| Rationale: HTML user agents ignore the  /> at the end of a tag, [...]

I am afraid that statement, in its absoluteness, is plain wrong.  (It was
also in XHTML 1.0, Appendix C, and is addressed in my criticism of that
section, which tries to define a variant of XHTML 1.0 that would be
"HTML-compatible".  Strictly speaking, however, HTML and XHTML are
*incompatible*.  But I digress.)

It is only correct to say that some HTML UAs, maybe many of them, ignore
the `/'.  It is incorrect to say that "HTML UAs" (which in that context
implies: all of them) do, because there are standards-conforming HTML UAs
that treat `/>' like `>&gt;'; most notably the current version of the W3C
Markup Validator, but there are also practical examples:

<http://dodabo.de/html+css/tests/shorttag.html>

You will notice that these include Links and lynx, which are known to
provide input for screen readers and Braille displays (for blind people and
people with impaired vision), and thus cannot reasonably be discounted.  So
it seems only fair for the guideline to mention, at the very least, that
serving XHTML as text/html as suggested can create accessibility issues.

| A.4. Embedded Style Sheets and Scripts
|
|   DO use external style sheets if your style sheet uses < or & or ]]> or
|   --. DO NOT use an internal stylesheet if the style rules contain any of
|   the above characters.
|
|   DO use external scripts if your script uses < or & or ]]> or --. DO NOT
|   embed a script in a document if it contains any of these characters

In my opinion, mentioning `--' like this is misleading.  AFAICS, it will do
no harm to a stylesheet or a script unless it is preceded by `<!', and the
corresponding section has not been marked up as CDATA.  This should be
clarified.
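
To illustrate the distinction, here is a rough sketch only.  The function
name `findXmlHazards' is made up for this example and is not taken from any
specification.  It assumes that the source is to be embedded as plain,
unescaped character data, and it reports the sequences that are a problem
there; a bare `--' is deliberately not reported:

```javascript
// Hypothetical helper: lists the sequences in an embedded script or
// style sheet that are problematic as unescaped XML character data.
// A bare "--" is intentionally not checked; "<!--" is caught via "<".
function findXmlHazards(source) {
  var hazards = [];
  if (source.indexOf("<") !== -1) hazards.push("<");
  if (source.indexOf("&") !== -1) hazards.push("&");
  if (source.indexOf("]]>") !== -1) hazards.push("]]>");
  return hazards;
}

// A post-decrement alone yields an empty list.
console.log(findXmlHazards("while (i--) { total += i; }"));

// Comparison and logical AND are what actually need attention.
console.log(findXmlHazards("if (a < b && c) { i--; }"));
```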

In addition, it should be pointed out that

  <script ...><!--
    ...
  --></script>

(including cases with white-space before `<!..', or after `-->', or `//'
before `<!--') is considered harmful as it will comment out the `script'
element's content in XHTML parsed by an XML UA (content model: PCDATA), and
at least the last line will lead to script syntax errors (invalid operand
for pre-decrement) in XHTML parsed by an HTML UA (content model: CDATA).
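
The XML half of that can be simulated roughly as follows.  This is only a
sketch under stated assumptions: `stripXmlComments' is a hypothetical helper
that mimics how an XML parser discards comments from element content using
a regular expression; it is not a real parser and ignores well-formedness
constraints.

```javascript
// Rough simulation only: removes XML comments from element content,
// approximating what an XML parser does with them (hypothetical helper).
function stripXmlComments(content) {
  return content.replace(/<!--[\s\S]*?-->/g, "");
}

// Content of a `script' element that uses the "hiding" pattern.
var scriptContent = "<!--\n  alert('Hello');\n-->";

// What would be left for the script engine after XML parsing.
var remaining = stripXmlComments(scriptContent);
console.log(JSON.stringify(remaining));
```

With the `//<!--' ... `//-->' variant, the simulation leaves only the
leading `//', so the script statements are lost there as well.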

HTML 2.0 (RFC 1866), which did not yet have the SCRIPT element, has been
OBSOLETE for about 9 years (RFC 2854, 2000-06 CE) now, so UAs still requiring
this should be considered broken.  Pre-HTML-3.2/4.01 (Browser War I era) UAs are
pretty much obsolete by now, too.  (That said, there was never a real need
to hide script code in the HEAD element, which is where most people appear
to use it.)

So developers should be advised that the pseudo-CDO and -CDC should simply
be omitted from scripts.  (HTML 4.01 [1999-12-24] did not make that clear,
nor did XHTML 1.0, Second Edition [2002-08-01], Appendix C.  It is [thus?] a
frequent issue in [de.]comp.lang.javascript where I am a regular contributor.)

| A.6. Deleted
|
| This guideline was deleted because it is no longer relevant.

I suggest deleting this subsection as it is more confusing than helpful.
If it was relevant before, one should find a note of its deletion in the
document history.  One would have to look up a previous version of the WG
Note in any case.

| A.11. Document Object Model and XHTML
| [...]
| For example, in JavaScript you might do something like:
|
|    ...
|    var name=node.name().toLowerCase;
|    if ( name == 'table' ) {
|       ...
|    }

MUST be (to work, and make sense):

  var name=node.nodeName.toLowerCase();
  // ...

SHOULD be at least:

  var elName = node.nodeName.toLowerCase();
  if (elName === "table") {
    ...
  }

(`name' is an unwise choice of identifier.  For example, Window objects
already have a `name' property that might be shadowed by it, and `name'
could just as well indicate the name of an HTML DOM object representing a
form control.

The quotes are changed only because I think double quotes are more legible,
and apostrophes, aka single quotes, should only be used when necessary.)

However, since HTML user agents are supposedly still in the majority, and
those that I know of yield an uppercase node name, the following would be
more efficient overall:

  var elName = node.nodeName.toUpperCase();
  if (elName === "TABLE") {
    ...
  }

or

  if (/^TABLE$/i.test(node.nodeName)) {
    ...
  }

as Regular Expressions and RegExp initializers have been specified since
ECMAScript Edition 3, and have been supported since JavaScript 1.2 and
JScript 3.1.3510 (supposedly 3.0), both of which predate it, and since at
least Opera 5.02.  (See also: <http://PointedEars.de/es-matrix>)
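
The comparison above can be wrapped up as in the following minimal sketch.
The helper name `isElementNamed' and the plain objects standing in for DOM
nodes are assumptions made for illustration only, so that the example runs
without a document:

```javascript
// Hypothetical helper: case-insensitive element name test that works
// whether nodeName is "TABLE" (HTML DOM) or "table" (XHTML parsed as XML).
function isElementNamed(node, expectedName) {
  return node.nodeName.toUpperCase() === expectedName.toUpperCase();
}

// Plain objects standing in for DOM nodes (assumption for illustration).
var htmlNode = { nodeName: "TABLE" };
var xhtmlNode = { nodeName: "table" };
var otherNode = { nodeName: "TBODY" };

console.log(isElementNamed(htmlNode, "table"));
console.log(isElementNamed(xhtmlNode, "table"));
console.log(isElementNamed(otherNode, "table"));
```

In a real document one would pass an actual element node instead of the
stand-in objects.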

| A.21. document.write
| [...]
| Rationale: Native XML user agents may not support this technique for
| modifying the content of the document.

You are correct; for example, Netscape/Mozilla Gecko does not support
document.write() in XML mode.  I cannot remember any user agent that
does.  However, it would appear that this is either a bug in those user
agents, or W3C DOM Level 2 HTML is incorrect/imprecise.  For the latter
says that

| 1.1. Introduction
|
| This section extends the DOM Level 2 Core API [DOM Level 2 Core] to
| describe objects and methods specific to HTML documents [HTML 4.01],
| and XHTML documents [XHTML 1.0].

but it defines HTMLDocument::write(), which is implemented as document.write().

<http://www.w3.org/TR/2003/REC-DOM-Level-2-HTML-20030109/html.html#ID-75233634>

Maybe you could clarify this (in cooperation with the DOM Working Group)?

As for the rest, thank you very much for the update; you have pointed out
some important issues regarding compatibility that were not in the first
edition, and that will help us in the future.

I would suggest you consider removing Appendix C from XHTML 1.0 in the next
revision, and referring to the WG Note instead.

I am looking forward to your reply.  Thank you in advance.


Kind regards,

PointedEars

Received on Saturday, 20 June 2009 10:38:19 UTC