Re: [ANN] W3C Markup Validator 0.6.5 Beta #1

On Fri, 29 Aug 2003, Terje Bless wrote:

> However, it does not follow from the observed facts that «[The
> W3C Markup Validator] does not even try to be a validator any more.»

It does. You just admitted that it incorrectly claims that a valid
document is not valid, and this was apparently an intentional change.

> Anyone may challenge the details of the implementation — and there are areas
> in which we, in minor ways, diverge from the proper behaviour from a SGML POV
> — but to the best of my knowledge there are no areas where it substantially
> diverges from its goal.

There's no need to discuss the details. Either it is a validator, or it is
not. You're seriously messing things up if you regard deviating from the
"SGML POV" as a "minor" divergence.

People are _already_ very confused about validation, and I'm afraid most of
the time that a novice author spends with validator messages is wasted
time, or worse. (They may mess up their markup just to please a
validator.) As I've explained at
http://www.cs.tut.fi/~jkorpela/html/validation.html
they need to understand what validation is and what it is not, and how it
_could_ be useful. If the W3C validator seriously confuses the issue, this
gets pretty hopeless - and I'm not one of those who don't really care,
saying that well over 99% of pages and authors know nothing about validity.

> As I detailed earlier in the message you replied to, the new «fussy» behaviour
> is an optional add-on and equivalent to letting the author of a page play
> "What if?" without having to actually modify his original document.

But such games are not the job of a validator. It would be honest to call it
a checker, with an option of performing validation.

Besides, anyone can, for example, run a validation against a DTD
that makes all end tags required, or the <body> tag required. If you wish
to make that easier, you could write suitable DTDs and make them
available.

Changing the basic parsing options (SHORTTAGS) is something different, and
should be kept different. And it seems, if I've understood the discussion
correctly, that the software cannot distinguish "errors" resulting from
this change from real errors.

Authors already have the option of using XHTML, and it seems that quite a
few of them are taking this option, usually without knowing what they are
doing. It's pointless to start parsing documents declared as HTML
as if they were declared as XHTML.

A tag soup checker that compares an HTML document against known
implementation features would be very useful. But that's completely
different, and involves far more difficult things than details of syntax.

> The correct conclusion if you were to accept the argument at face value would
> be «There is a bug or sub-optimal behaviour in this beta release.»

No, there's a fundamental mistake in what the beta version is doing.
It's not the _how_ but the _what_.

> - - I'm sure Jukka would have phrased it
> differently — if not necessarily any less critically — if given a chance to
> consider the matter in a suitable context (as opposed to suddenly and
> unexpectedly having the new version label his carefully crafted markup
> «Invalid»).

Rest assured that people who have little or no idea of what validation is
- that is, the great majority of authors, including the majority of people
who use validators - will get _really_ confused. Especially since they
have real markup errors too.

-- 
Jukka "Yucca" Korpela, http://www.cs.tut.fi/~jkorpela/

Received on Friday, 29 August 2003 08:53:25 UTC