Re: Doc with no space in DTD considered valid

On Tue, 6 Apr 2004, Jamie Norrish wrote:

> Perhaps so, but surely the validator should not say that the document
> is valid, even if it (acceptably) does not report the error?

Validity means lack of reportable markup errors.

> How can a
> document be determined to be valid if it cannot be checked against a
> DTD?

It cannot, by definition. But an error in a document type declaration need
not prevent the validator from finding the DTD.

> With the separator between the two identifiers, the document is
> reported as being valid HTML 4.01 strict. Without the separator, the
> document is simply valid - what does that mean?

The validator issues a message of the form
This Page Is Valid ... Strict!
where ... is either the formal public identifier (FPI) or something that
the validator finds in some internal table. Try this:

<!DOCTYPE HTML PUBLIC "Nonsense"
 "http://www.w3.org/TR/html4/strict.dtd">

and you get a good laugh:

This page is not Valid Nonsense!

You could also try

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 1.2//EN"
 "http://www.w3.org/TR/html4/strict.dtd">

I'm using wrong DOCTYPEs, but so are _many_ popular authoring tools.

Anyway, when the validator is unsuccessful in its table lookup, it just
emits the FPI. Here it apparently fails to find any name at all, since it
performs the lookup not on the result of the (error-correcting) parsing of
the doctype declaration but by some more simplistic string matching.
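
A minimal Python sketch of the kind of mismatch described above. The table,
the function names and the regular expressions are all hypothetical; nothing
here is taken from the validator's actual source, it only shows how a strict
string match can lose the FPI where a tolerant parse would still find it:

```python
import re

# Hypothetical name table; the validator's real table is not shown in this thread.
KNOWN_TYPES = {
    "-//W3C//DTD HTML 4.01//EN": "HTML 4.01 Strict",
}

def fpi_by_strict_match(doctype):
    """Simplistic matching: insists on whitespace between the two identifiers."""
    m = re.match(r'<!DOCTYPE\s+\w+\s+PUBLIC\s+"([^"]*)"\s+"([^"]*)"\s*>', doctype)
    return m.group(1) if m else None

def fpi_by_tolerant_parse(doctype):
    """Error-correcting style: the separator and the system identifier are optional."""
    m = re.match(r'<!DOCTYPE\s+\w+\s+PUBLIC\s*"([^"]*)"\s*(?:"([^"]*)")?\s*>', doctype)
    return m.group(1) if m else None

def label(fpi):
    """Return the table name if the FPI is known, otherwise the FPI itself."""
    return KNOWN_TYPES.get(fpi, fpi)

with_space = ('<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"\n'
              ' "http://www.w3.org/TR/html4/strict.dtd">')
without_space = ('<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"'
                 '"http://www.w3.org/TR/html4/strict.dtd">')

print(label(fpi_by_strict_match(with_space)))       # name found in the table
print(label(fpi_by_strict_match(without_space)))    # no FPI extracted, so no name
print(label(fpi_by_tolerant_parse(without_space)))  # FPI recovered, name found
```

With the strict pattern the declaration without the separator yields no FPI,
so there is nothing to look up and nothing to print after "Valid". The
tolerant pattern recovers the same FPI from both declarations.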

It would probably be best if the validator just did its job of reporting
whether a document is valid or not and listing errors if it isn't.
Heuristics in trying to deduce a name for the document type seem to
produce misleading information at times. The person who uses a validator
should know which DTD he is validating against - how could he understand
and correct the errors otherwise?

-- 
Jukka "Yucca" Korpela, http://www.cs.tut.fi/~jkorpela/

Received on Tuesday, 6 April 2004 04:38:40 UTC