Re: Validators, Validation chart

Hi Rick,

On Apr 20, 2006, at 8:10, Rick Stanley wrote:
> in the chart on:
> http://www.validome.org/lang/en/errors/ALL
>
> ther are many discrepencies betwen the W3c validator, Validome, and the
> other validators, WDG, and Site Valet.

The validators by WDG, W3C, Validome and Webthing (Valet) are all fine  
tools, very reliable for all but some complex cases. Those complex  
cases, often caused by unclear (or, sometimes, erroneous)  
specifications, are the main cause for discrepancies between the tools:  
when the developer is left without a clearly documented choice, he or  
she has to make an "educated guess" as to what is the best decision.  
Worse still are cases where two specifications collide, with no clear
rule on which has precedence. These are fortunately rare, because
there's a quality process in the making of the specs, but they still
happen, and they make it difficult to create perfect checking tools in
such a context.

The document you mention was unknown to me. It looks like a great
collection of tests; indeed, the Validome team did a great job
compiling it.

The results table, however, is another thing. Generally speaking one  
should be wary of test results used as promotional material. I wish the  
Validome team used their table of tests as a development helper, rather  
than an advertisement. Claiming to be 100% perfect and bug-free is
rather dubious, as is the systematic choice, whenever the tools
behave differently, to take Validome as the obviously correct
reference.


I have not yet looked at all the test cases. Many look OK, but a few of
them seem wrong...
Picking at random among the ones reporting the W3C's markup validator  
as "faulty":

* System-ID missing (at PUBLIC) in HTML-Document
http://www.validome.org/lang/en/errors/DOCTYPE/4011
The system id is not mandatory, and while it is strongly recommended  
for non-standard document types, there's no rule that I know of that  
forces a parser (or a validator) to report it. The fact that Validome
throws a warning here may or may not be a good idea (some might say
there should be no warning when there is no risk of a problem), but
that does not make the other tools wrong.

* Absolute no charset encoding statement
http://www.validome.org/lang/en/errors/HTML-CHARSET/8
Validome throws a fatal error when no charset declaration is found, and  
declares the document invalid.
First, the document *is* valid: there is no constraint on character  
encoding for validation, it's just that if no charset can be found,  
it's hard to read, and therefore parse, the document.
Also, the W3C markup validator used to throw a fatal error in such
cases, too, but our users told us this was awfully unhelpful, so the
validator now attempts tentative validation using a fallback charset,
as do Valet and the WDG validator. I'm not sure I'd call Validome right
and the other three wrong on this one ;).
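
To illustrate what I mean by tentative validation, here is a rough
sketch in Python. It is not the code of any of the validators: the
function name, the return shape and the choice of UTF-8 as the fallback
are all assumptions made for the example.

```python
from typing import List, Optional, Tuple


def decode_for_validation(
    raw: bytes,
    declared_charset: Optional[str] = None,
    fallback: str = "utf-8",  # assumed fallback, chosen for this sketch only
) -> Tuple[str, str, List[str]]:
    """Decode a document, falling back to a default charset if none is declared.

    Returns (text, charset_used, warnings). A missing charset produces a
    warning rather than a fatal error, so that validation can still proceed.
    """
    warnings: List[str] = []
    charset = declared_charset
    if charset is None:
        warnings.append(
            "No character encoding found; tentatively using %s." % fallback
        )
        charset = fallback
    try:
        text = raw.decode(charset)
    except (LookupError, UnicodeDecodeError) as exc:
        # Only here is the document really unreadable.
        raise ValueError("Cannot decode document as %s" % charset) from exc
    return text, charset, warnings
```

The point of the design is that the missing declaration is reported to
the user, but it only stops the process when the bytes cannot be read
at all.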

* XML- and Meta-charset encoding are different to HTTP-Header charset  
encoding
http://www.validome.org/lang/en/errors/XML-CHARSET/2010
Plain wrong: HTTP headers take precedence over other charset
declaration methods. See e.g.
http://www.w3.org/TR/REC-html40/charset.html#h-5.2.2
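
As a rough sketch of that precedence rule (Python, with hypothetical
function names; the relative order of the XML declaration and the meta
element after the HTTP header is my assumption for the example, the
only part taken from the text above is that HTTP comes first):

```python
import re
from typing import Optional, Tuple


def charset_from_content_type(value: Optional[str]) -> Optional[str]:
    """Extract the charset parameter from a Content-Type value, if any."""
    match = re.search(r'charset\s*=\s*"?([^\s;"]+)', value or "", re.IGNORECASE)
    return match.group(1).lower() if match else None


def pick_charset(
    http_content_type: Optional[str],
    xml_declared: Optional[str],
    meta_content_type: Optional[str],
) -> Optional[Tuple[str, str]]:
    """Return (source, charset) for the first declaration found, HTTP first."""
    candidates = [
        ("HTTP header", charset_from_content_type(http_content_type)),
        # The order of the next two entries is an assumption of this sketch.
        ("XML declaration", xml_declared.lower() if xml_declared else None),
        ("meta element", charset_from_content_type(meta_content_type)),
    ]
    for source, charset in candidates:
        if charset:
            return source, charset
    return None  # nothing declared anywhere
```

With this logic, a document served as "text/html; charset=ISO-8859-1"
whose meta element says UTF-8 is read as ISO-8859-1. The mismatch may
deserve a warning, but it does not make the document invalid.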

* XHTML 1.0 - Document with incorrect MIME-Type "image/svg"
http://www.validome.org/lang/en/errors/MIME_TYPES/5025
That's correct: it is indeed a bug in the W3C validator. There is a
mechanism to avoid such issues, but it's not working here for some
reason. Didn't know about it. Good catch.

* Errors in Attribute-values
This is interesting, because these tests are NOT about validation, but
about conformance. There's no doubt that checking for them is useful,
but when Validome says http://www.validome.org/out/ena6001 is not valid,
that is an erroneous statement. The document is not conformant, but it
is valid.

etc...

All in all, there appear to be inconsistencies in how the Validome team
interpreted the specs. I think it would have been useful if the test
cases had more links to the reference specifications, so that the
results could be shown to be objective. In the absence of such
references, the claim that Validome is right and the other
implementations wrong is not always acceptable.

There are also some real cases highlighting bugs in the other
validators. I think a good part of the ones for the W3C markup
validator are already known and mentioned in our public issues database
[1], but I will go through the list and make sure all the bugs are
properly recorded. So, thank you very much for bringing this to our  
attention.

[1] http://www.w3.org/Bugs/Public/

You said:
> I would have thought that the W3C Validator would be the ultimate  
> source.

The ultimate reference, always, is the specification, and in the case  
of markup, the many specifications, from URI and SGML, to HTTP, to the  
various markup languages. The W3C validator is not a perfect reference
- it certainly has some bugs. Nor is any of the other existing
markup checkers perfect, and they never can be, since there is always a
(minimal) amount of guessing, room for interpretation, and
accommodation of colliding specifications.

The W3C markup validator:
* has open source code [2]
* has a public bug list [3]. Everyone (you, Validome developers, anyone
really) can submit newly found bugs...
When you cannot be perfect, it's better to be honest and accountable
about it.

[2] http://dev.w3.org/cvsweb/validator/
[3] http://www.w3.org/Bugs/Public/buglist.cgi?&bug_status=__open__&product=Validator

cheers,
olivier
-- 
olivier Thereaux - W3C - http://www.w3.org/People/olivier/
W3C Open Source Software: http://www.w3.org/Status

Received on Friday, 21 April 2006 06:59:41 UTC