Re: Error in validator?

On 09.07.01 at 12:23, Eric Meyer <> wrote:

>    You've almost made my point: the quoted text says quite clearly that 
>plain SGML validation is not enough, and that a specialized validation 
>is needed to fully handle HTML.  That could easily be read to say that 
>HTML is not exactly SGML, and so any place where HTML diverges from 
>standard SGML practice should be handled by an HTML validator.  After 
>all, it's called the "HTML Validator," not the "SGML Validator using a 
>sort-of-HTML DTD."

Actually, you are falling victim to the same "popularization" that you
advocate here. :-)

The "W3C HTML Validator" is really the "W3C SGML Validator"; it just
happens to be made mostly for validating SGML documents in the HTML family
(or, possibly, genus). What Björn meant when he said "can't be done" is
that the W3C HTML Validator can't actually check conformance to the HTML
Recommendation _because_ it's an SGML validator and not an HTML validator.
The HTML Rec. contains many constraints that cannot be expressed in SGML,
and this is one of them.

The reason its name contains "HTML" and not "SGML" is that your average
webduhsigner has never heard of "SGML", but "HTML" is familiar. If they
don't know SGML they don't read the DTD; or, to twist it into context, if
they don't read the DTD it's a safe bet that they don't know SGML.

In the future, XML Schema Language is the solution to this problem. It's a
formal vocabulary (and system?) for doing more advanced validation of
documents than you can with DTD-based validation[0].

In the meantime -- that is, until the W3C starts issuing a Schema along
with the DTD for new Recommendations! -- you get SGML/DTD-based validation
and an option to use various "lints" such as Weblint or Tidy.

The Validator used to have an option to use Weblint, but Weblint fell too
far behind the times and so the option was removed. Tidy is picking up the
pace lately, but we're waiting on a specific feature set to be implemented
before evaluating inclusion in the Validator. It's on the TODO, though.

But the main reason the Validator doesn't do this kind of checking is that
any such check would by definition impose arbitrary limits, set by the good
or bad taste of the implementor. The only means available to formally
validate documents in the HTML family is an SGML parser and a DTD. The
prose in the HTML Recommendation is not machine-parseable! We /could/ write
checks for it, but they would be subject to interpretation by whoever
implemented them.
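
To make the "subject to interpretation" point concrete, here is a minimal
sketch of one such lint-style check, in Python. It is not part of the
Validator. The example constraint is an assumption chosen for illustration:
the HTML 4.01 DTD types color attributes as plain CDATA, while the prose
asks for #RRGGBB or one of sixteen color names. The names ColorLint,
lint_colors and COLOR_ATTRS are hypothetical.

```python
import re
from html.parser import HTMLParser

# The sixteen color names listed in the HTML 4.01 prose.
COLOR_NAMES = {
    "aqua", "black", "blue", "fuchsia", "gray", "green", "lime", "maroon",
    "navy", "olive", "purple", "red", "silver", "teal", "white", "yellow",
}
HEX_COLOR = re.compile(r"#[0-9a-fA-F]{6}")

# Illustrative choice of attributes to inspect; a real lint would have to
# decide this per element.
COLOR_ATTRS = {"bgcolor", "color", "text", "link", "vlink", "alink"}


class ColorLint(HTMLParser):
    """Collects color attribute values that the prose rule would reject."""

    def __init__(self):
        super().__init__()
        self.warnings = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser hands over tag and attribute names in lower case.
        for name, value in attrs:
            if name not in COLOR_ATTRS or value is None:
                continue
            # Implementor's interpretation: ignore surrounding whitespace
            # and treat color names as case-insensitive.
            candidate = value.strip()
            if candidate.lower() in COLOR_NAMES:
                continue
            if HEX_COLOR.fullmatch(candidate):
                continue
            self.warnings.append((tag, name, value))


def lint_colors(markup):
    """Return (tag, attribute, value) triples for suspicious colors."""
    checker = ColorLint()
    checker.feed(markup)
    checker.close()
    return checker.warnings


if __name__ == "__main__":
    sample = '<body bgcolor="#FFFFFF"><font color="redish">x</font></body>'
    for tag, name, value in lint_colors(sample):
        print(f"<{tag}> {name}={value!r} is not #RRGGBB or a known name")
```

A DTD-based parse accepts both attribute values in the sample, since both
are valid CDATA. Whether "White", " #ffffff " or a three-digit "#fff"
should pass is settled by the two commented lines in handle_starttag, and
that is exactly the implementor's taste the paragraph above warns about.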

[0] - Or so I'm told; I wouldn't know a Schema from a Klingon. :-)

Received on Saturday, 14 July 2001 23:49:56 UTC