Re: XHTML validation, CVS notifications (was Re: Validator errors ) from Bless Terje on 2000-02-02 (www-validator@w3.org from February 2000)

From: Bless Terje <link@rito.no>
Date: Wed, 2 Feb 2000 17:01:39 +0100
To: W3C Validator <www-validator@w3.org>
Message-ID: <22FD5BD2DBC5D211BE0D0008C7A4E87FD9B434@odin4.rito.no>

Gerald Oskoboiny <gerald@w3.org> wrote:

>  - if the content type is text/html and there's no doctype, then
>  
>    - if there's an xmlns attribute on the first element,
>      do XML well-formedness checking (as it does now for
>      any text/html docs without doctypes)
>    
>    - if there's no xmlns attribute on the first element,
>      assume HTML 4 transitional (and whine about the missing
>      doctype)

How about if we make a DOCTYPE required; not for purposes of validity, but
because we can't reliably validate the document if it doesn't have one?

You could even have an option to "Guess DOCTYPE" which would apply the
current heuristics and attempt to validate that, but would never pronounce a
document to be valid unless it actually contains a DOCTYPE declaration. This
would allow users to check the validity of any document, without providing
defaults that are bound to be wrong some of the time, and without allowing
user error to label an invalid X document as a valid Y document.

In pseudo-code:

  # Reached only when validation against the guessed DOCTYPE found no errors.
  if ($guessed_doctype) {
    print "No errors detected, but the DOCTYPE is missing.\n";
    print "Without a DOCTYPE we cannot pronounce the document as valid.\n";
    print "<Instructions for selecting and adding a DOCTYPE>\n";
  }
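
To spell out the three possible outcomes, here is a rough sketch in Python rather than the Perl-ish pseudo-code above. The names (Result, verdict, the outcome strings) are made up for illustration and are not taken from the validator's code; the only rule it encodes is that a guessed DOCTYPE can never yield "valid".

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Result:
    """Outcome of one validation run (hypothetical structure)."""
    guessed: bool                      # True if the DOCTYPE was guessed, not declared
    errors: List[str] = field(default_factory=list)


def verdict(result: Result) -> str:
    """Map a validation run to one of three user-visible outcomes."""
    if result.errors:
        return "invalid"
    if result.guessed:
        # No errors, but the document never declared a DOCTYPE,
        # so it must not be pronounced valid.
        return "no-errors-but-doctype-missing"
    return "valid"
```

With this shape, the "valid" message is reachable only when the document itself carries the DOCTYPE declaration.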


In general, if you have selected some default behaviour like this, you must
continue to behave that way to remain backwards compatible. The only change
that will not confuse users is to stop using such default behaviour at all
and inform them of why it was necessary. As I've said a few times now, when
the default changed from HTML to XHTML *I* was confused and I have intimate
knowledge of how that code works. From a user perspective it would be very
hard to figure out what happened.

If, OTOH, the new behaviour had reported "no DOCTYPE" and then offered to
"retry the validation as <X>" (where X is a member of the list comprising
all known DOCTYPEs, or common DOCTYPEs, or those DOCTYPEs we wish to
promote), the course of action would be clear for users. If it said "This is
a recent change. Sorry for the inconvenience. Here's how to fix it..." I
think most people would understand, both why it was done and how to fix it.
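
As a sketch of what such a report could look like (Python, for illustration only): the function name, the wording of the messages and the choice of DOCTYPEs in the list are all my assumptions, not anything the validator does today.

```python
# Illustrative list only; which DOCTYPEs to offer is a policy decision.
COMMON_DOCTYPES = [
    "-//W3C//DTD HTML 4.01 Strict//EN",
    "-//W3C//DTD HTML 4.01 Transitional//EN",
    "-//W3C//DTD XHTML 1.0 Strict//EN",
    "-//W3C//DTD XHTML 1.0 Transitional//EN",
]


def missing_doctype_report(doctypes):
    """Build the text shown when a document has no DOCTYPE declaration."""
    lines = [
        "No DOCTYPE found, so this document cannot be validated as it stands.",
        "This is a recent change. Sorry for the inconvenience.",
        "Retry the validation as:",
    ]
    # One numbered retry option per DOCTYPE on offer.
    for number, doctype in enumerate(doctypes, start=1):
        lines.append(f"  {number}. {doctype}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(missing_doctype_report(COMMON_DOCTYPES))
```

The point is that the user gets one short explanation and a list of choices instead of a page of errors.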

Just spewing a gazillion error messages on a document that used to validate
just fine is very very bad.


As regards the new heuristics above: every time you add a new special case
you dig yourself deeper. There are *always* more special cases and
exceptions! IMO the only sane thing to do here is to apply "Slippery Slope"
logic; don't make _any_ special cases, but explain how to get around it. In
this case, the "Guess DOCTYPE" option would be one option in a list of
DOCTYPEs and a part of the DOCTYPE-override feature and *not* a painfully
convoluted set of exceptions.
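
Roughly what I have in mind, sketched in Python with invented names (GUESS, choose_doctype, and the guess callback are hypothetical). Treating any overridden or guessed DOCTYPE as untrusted is my own reading of the proposal above, not existing behaviour.

```python
from typing import Callable, Optional, Tuple

GUESS = "guess"  # hypothetical value of the override option


def choose_doctype(
    declared: Optional[str],
    override: Optional[str],
    guess: Callable[[], str],
) -> Tuple[Optional[str], bool]:
    """Return (doctype to validate against, whether a 'valid' verdict is allowed)."""
    if override is None:
        chosen = declared          # may be None: report "no DOCTYPE" and stop
    elif override == GUESS:
        chosen = guess()           # heuristics run only when explicitly requested
    else:
        chosen = override          # user picked a DOCTYPE from the list
    # Only the DOCTYPE the document itself declares may lead to "valid".
    trusted = chosen is not None and chosen == declared
    return chosen, trusted
```

The heuristics still exist, but they sit behind one explicit option rather than being scattered through the default code path.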


>I meant to send mail to this list after making the most recent changes, but
>I finished them at 3am after an all-nighter the night before, and I didn't
>feel like doing anything besides sleep at that point. :)

Ah! I sympathize; completely! :-|

OTOH, that is one good reason to set up automated notifications.


>I set up the CVS notifications just now; future commits will cause a
>message to [be] sent to this list.

Great! Thanks!


>>It might also be a good idea to commit to CVS, and run a test server,
>>_before_ making the code live on validator.w3.org. That way the peanut
>>gallery can get their two cents in before going live (and occasionally
>>the peanut gallery has a point ;D).
>
>I have a test server [at] http://validator.w3.org:8000/ [...] but in this
>case we wanted the validator to be ready for the XHTML REC press release
>last Wed, and I didn't have it ready earlier so there wasn't much time
>for feedback. :(

That reminds me...

...whatever happened to the File Upload support (TODO #9)?

BTW, TODO #19 is done and TODO #21 should be relegated to weblint IMO.
Received on Wednesday, 2 February 2000 11:01:28 UTC