Re: W3C Validator vs Schneegans

Frank Ellermann wrote:

> I was only a bit surprised by the error message for
> <> - it found an "invalid"
> <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN"> for the 302.

Well, the response body is not a well-formed XML document.

>>> insists on UTF-8 for all documents without XML-encoding
>> Absolutely not, it complies with <>
>> for "text/html"
> Then something with my page (see above) is not as you expect
> it,

There's no "charset" parameter in the "Content-Type" header, no BOM
and no XML declaration with an "encoding" pseudo-attribute. Therefore,
the default encoding of XML is assumed, which is UTF-8. The "meta"
element is ignored. This is fully intentional and complies with
<>. I know that the W3C validator
accepts an encoding declaration in a "meta" element for XHTML
documents served as "text/html", and I consider this a bug.
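The precedence described above can be sketched in Python. This is a minimal illustration of the rules (not the validator's actual implementation): a "charset" parameter in the Content-Type header wins, then a byte order mark, then the "encoding" pseudo-attribute of the XML declaration; otherwise UTF-8 is assumed, and a "meta" element is never consulted.

```python
import codecs
import re

def detect_xml_encoding(content_type: str, body: bytes) -> str:
    """Determine the encoding of an XML document served over HTTP."""
    # 1. "charset" parameter in the Content-Type header
    m = re.search(r'charset\s*=\s*"?([A-Za-z0-9._-]+)"?', content_type)
    if m:
        return m.group(1)
    # 2. Byte order mark
    for bom, name in ((codecs.BOM_UTF8, 'UTF-8'),
                      (codecs.BOM_UTF16_LE, 'UTF-16LE'),
                      (codecs.BOM_UTF16_BE, 'UTF-16BE')):
        if body.startswith(bom):
            return name
    # 3. "encoding" pseudo-attribute in the XML declaration
    m = re.match(rb'<\?xml[^>]*encoding\s*=\s*["\']([A-Za-z0-9._-]+)["\']',
                 body)
    if m:
        return m.group(1).decode('ascii')
    # 4. Default for XML: UTF-8. A "meta" element is deliberately ignored.
    return 'UTF-8'
```

For example, a response with "Content-Type: text/html" and a body that starts with plain "<html>" comes out as UTF-8, no matter what any "meta" element claims.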

> <!ELEMENT fieldset ( legend, (#PCDATA | %block; | form | %inline; | %misc;)*)>

This is not even well-formed XML. See
<>. Only
<> can produce '#PCDATA'.
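You can confirm this with any conforming XML parser, e.g. the expat parser bundled with Python: the "Mixed" production only allows '#PCDATA' as the first token of a top-level choice, so a '#PCDATA' nested inside an inner group makes the whole declaration not well-formed.

```python
import xml.parsers.expat

# Internal DTD subset using the content model quoted above
# (simplified: one name in place of the parameter entities).
doc = ('<!DOCTYPE fieldset [\n'
       '<!ELEMENT fieldset ( legend, (#PCDATA | p)* )>\n'
       ']>\n'
       '<fieldset/>')

parser = xml.parsers.expat.ParserCreate()
try:
    parser.Parse(doc, True)
    print('well-formed')
except xml.parsers.expat.ExpatError as err:
    # expat rejects the misplaced #PCDATA as a well-formedness error
    print('not well-formed:', err)
```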

> As "validator-fan" for years I'm a bit angry if it's attacked
> only because the XHTML DTDs are sloppy (?).

That's not the case. The validator is attacked mainly for two
reasons:

- It tells its users that it can check "HTML and XHTML (documents) for
conformance to W3C Recommendations and other standards". This is at
least misleading. Several suggestions were made in

- It is unable to check XML well-formedness, but doesn't admit this.
Instead, it uses the euphemism "some limitations". Web browsers today
are able to find (almost all) well-formedness errors, the validator
isn't! Furthermore, it refers to
<>. How many users do you think know
what "parameter separators" and "parameter literals" are? You should
also see the "translation" in
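
To illustrate the kind of check meant here, this sketch feeds a snippet with overlapping end tags to Python's bundled expat parser. Any conforming XML parser reports such a well-formedness error with an exact position, which is what browsers do and the validator does not.

```python
import xml.parsers.expat

# "</p>" closes "p" while "b" is still open: a well-formedness error.
snippet = '<p><b>bold text</p></b>'

parser = xml.parsers.expat.ParserCreate()
try:
    parser.Parse(snippet, True)
    print('well-formed')
except xml.parsers.expat.ExpatError as err:
    print('well-formedness error at line %d, column %d: %s'
          % (err.lineno, err.offset, err))
```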

<>

Received on Monday, 5 September 2005 14:33:33 UTC