- From: olivier Thereaux <ot@w3.org>
- Date: Wed, 26 Sep 2007 15:09:12 +0900
- To: www-validator Community <www-validator@w3.org>
- Cc: staff@validome.org
Hello,

While the W3C validator has its own (admittedly limited) test suite, it is often useful to test it against other collections of test documents. One such collection, which I use now and then, is the "test suite" for validome (another excellent (X)HTML validator).

http://www.validome.org/lang/en/errors/ALL

Unfortunately, this "test suite" has a number of serious shortcomings:
- some tests are actually wrong
- most of the tests are missing references to the spec, making some arguable tests hard to justify
- the test results / validators comparison page (linked above) has validome "pass" all the tests, while other validators (including W3C's) fail a large number of them. This tickles my sense of humor to no end...

While using the results of a test suite for marketing purposes may be fair game, it is not acceptable when:
1) the test suite is tweaked to show only the tests one's product passes
2) expected test results are not justified
3) some of the "pass"es are very dubious, and
4) the test results are outdated and/or show other tools failing when their current version passes, sometimes better than the purportedly perfect one.

That is not to say that I consider the test suite useless or dishonest. But it gets close to it unless the tests are justified with authoritative references, the test results are properly dated, the tested versions are mentioned, and obvious bias is avoided.

With this in mind, here are the notes I took while checking the current and dev versions of the markup validator. I strongly suggest that the validators comparison page and associated tests be updated, fixed, and clarified according to these notes. If the validome team cannot keep an up-to-date table of the test results, then this page should be replaced with a plain list of the test cases, without false claims about other tools.

* http://www.validome.org/out/ena1004
  Validome yields a fatal error here, claiming that the document is not valid.
  This is actually incorrect: the document is valid XML as far as I can tell. The W3C Markup Validator passes it as valid XML.

* http://www.validome.org/out/ena1005
  The comparison page is incorrect. The W3C Markup Validator reports the error.

* http://www.validome.org/out/ena1007
  The comparison page is incorrect. The W3C Markup Validator reports the error. Validome also considers the typo in the XML decl as a fatal error, while the W3C Markup Validator shows the offending markup and proceeds to check the document.

* http://www.validome.org/out/ena1011
  <?xml version="1.0" encoding="#"?>
  Syntax of encoding in XML decl is bogus. The comparison page is incorrect. The W3C Markup Validator reports the error. Validome also considers the typo in the XML decl as a fatal error, while the W3C Markup Validator shows the offending markup and proceeds to check the document.

* http://www.validome.org/out/ena1012
  <?xml version="1.0" encoding="9ISO-8859-1"?>
  Syntax of encoding in XML decl is bogus. The comparison page is incorrect. The W3C Markup Validator reports the error. Validome also considers the typo in the XML prolog as a fatal error, while the W3C Markup Validator shows the offending markup and proceeds to check the document.

* http://www.validome.org/out/ena1014
  (extraneous lang attribute in xml decl)
  The comparison page is incorrect. The W3C Markup Validator reports the error. Validome does however provide a better error explanation.

* http://www.validome.org/out/ena1015
  The comparison page is incorrect. The W3C Markup Validator reports the error.

* http://www.validome.org/out/ena1017
  The comparison page is incorrect. The W3C Markup Validator reports the error.
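
For reference, the encoding name in an XML declaration has to match the EncName production of the XML 1.0 spec, which is why both "#" and "9ISO-8859-1" in the tests above are syntax errors. A minimal Python sketch of that check follows; the function name is mine for illustration and is not taken from either validator.

```python
import re

# EncName production from the XML 1.0 specification:
#   EncName ::= [A-Za-z] ([A-Za-z0-9._] | '-')*
# i.e. the name must start with an ASCII letter.
ENC_NAME = re.compile(r"[A-Za-z][A-Za-z0-9._-]*")


def is_valid_enc_name(name: str) -> bool:
    """Return True if `name` is syntactically a legal XML encoding name.

    This only checks the syntax; it says nothing about whether a parser
    actually supports the named encoding.
    """
    return ENC_NAME.fullmatch(name) is not None


if __name__ == "__main__":
    # Hypothetical sample values, mirroring the ena1011 / ena1012 tests.
    for candidate in ("ISO-8859-1", "#", "9ISO-8859-1"):
        print(candidate, is_valid_enc_name(candidate))
```

Whether such a syntax error should be fatal or merely reported is a separate, usability question; the sketch only shows why both declarations are wrong.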
* http://www.validome.org/out/ena4010
  Now, that one is funny, because validome's error message is the very one that was deemed "Inscrutable" when the W3C Markup Validator reports it for http://www.validome.org/out/ena4002
  Pointing out hard-to-comprehend SGML validation messages is not a bad thing, but doing it honestly and consistently, even when validome is at fault, would be better...

* http://www.validome.org/out/ena4011
  HTML 4.01 document with no system Id. Validome sends a warning... Not necessary per the spec. The W3C Markup Validator passes validation. Why is the W3C validator marked as faulty here? References please?

* http://www.validome.org/out/ena4012
  XHTML doctype without system Id, but valid public Id. Validation should report an error (both validators do), but why does validome count this as a fatal error?

* http://www.validome.org/out/ena4019
  The comparison page is incorrect. The W3C Markup Validator has the proper behavior here, as do others.

* http://www.validome.org/out/ena4020
  The comparison page is incorrect. The W3C Markup Validator has the proper behavior here, as do others.

* http://www.validome.org/out/ena4021
  Validome is faulty here (why a fatal error?), and the comparison page doesn't mention it.

* http://www.validome.org/out/ena4023
  Validome says valid. OpenSP and the W3C Markup Validator say not valid. I'd tend to trust OpenSP here. The comparison page's claim that validome is the only validator doing the right thing is very dubious.

* http://www.validome.org/out/ena4024
  Ditto above. The comparison page's claim that validome is the only validator doing the right thing is very dubious.

* http://www.validome.org/out/ena2
  Document served with no HTTP charset, has a BOM and a meta charset claiming to be iso-8859-1.
  Validome detects charset to be utf-8, sends warning about BOM.
  W3C validator detects charset to be utf-8, sends warning about BOM.
  The comparison page claims that validome passes and the W3C validator fails. On which grounds, please?
* http://www.validome.org/out/ena8
  The W3C Markup Validator uses its algorithm for charset detection, finds none, and uses a fallback.
  Validome uses... exactly the same algorithm (to the point of having almost the same error message...), finds no charset, and yields a fatal error.
  I'm very curious to know why validome passes and the W3C Markup Validator fails here. I think the opposite: validome's taste for fatal errors is a grave failure in usability.

* http://www.validome.org/out/ena13
  The comparison page is incorrect. The W3C Markup Validator has the proper behavior, and reports the mismatch, as far as I can tell.

* http://www.validome.org/out/ena14
  The comparison page is incorrect. The W3C Markup Validator has the proper behavior, and reports the mismatch, as far as I can tell.

* http://www.validome.org/out/ena2002
  text/xml document with no charset at HTTP level. The W3C Markup Validator properly follows the RFC and validates as us-ascii. Validome incorrectly sends a fatal error.
  Note to validome developers: "This Document is not valid." and "fatal error" are plain wrong here. If you have a separate validator that does XML, don't mislead people into thinking that their document is invalid. Instead, why not redirect them directly to that specific validator?

* http://www.validome.org/out/ena2008
  - this test is bogus, or the claimed rule "If HTTP-Header charset encoding is missing, but there is one in XML-Declaration, a Meta charset encoding statement must exist." needs a serious reference.
  - the comparison page claims that validome alone behaves properly, when it actually behaves just like the others, that is, not respecting the bogus rule claimed by the test.

* http://www.validome.org/out/ena2009
  - this test is bogus, or the claimed rule "If HTTP-Header charset encoding is missing, but Meta-Tag charset encoding statement exists, then there must be also a XML-Declaration charset encoding statement" needs a serious reference.
  - validome reports that no encoding was found, and used a fallback.
    This is not correct: there is meta charset info.
  - The comparison page is incorrect: the W3C Markup Validator has perfectly proper behavior here.

* http://www.validome.org/out/ena2010
  - this test is bogus. "If there is a charset encoding statement in XML-Declaration as well as in a Meta-Tag, the XML-Declaration charset encoding will be used. HTTP-Header charset encoding is irrelevant in this case." is just untrue. HTTP charset info always gets precedence, and is never "irrelevant".
  - validome's behavior is incorrect, yet reported as correct.
  - The comparison page is incorrect. The W3C Markup Validator has the proper behavior here.

* http://www.validome.org/out/ena2041
  The comparison page is incorrect. The W3C Markup Validator has the proper behavior here.

* http://www.validome.org/out/ena5006 (ditto 5007 5008 5009 5010 5011 2025 5026 5027 5028)
  I strongly disagree that the W3C Markup Validator's behavior is incorrect here. It follows the rules of HTTP precedence closely, and reports the discrepancy between doctype and media type.

* http://www.validome.org/out/ena5020
  I strongly disagree that the W3C Markup Validator's behavior is incorrect here. text/html is allowed for XHTML 1.0.

* http://www.validome.org/out/ena5021
  The comparison page is incorrect. Validome and the W3C Markup Validator both mention that XHTML 1.1 should not be served as text/html.

* http://www.validome.org/out/ena5030
  The comparison page is incorrect: it claims that the W3C validator does not explain why it parses as SGML (it does). The claim that validome is doing the right thing is also dubious, as validome does not actually mention any problem with the parsing mode.

* http://www.validome.org/out/ena6030
  The comparison page is incorrect. The W3C Markup Validator not only checks for the presence of xmlns in XHTML, it also gives an example of what it should look like, and a reference to the spec. Validome doesn't.

* http://www.validome.org/out/ena7003
  I'd like to see a reference for this.
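
To make the precedence argument from the ena2002 and ena2008 to ena2010 notes concrete, here is a rough Python sketch of the order in which charset sources could be consulted. It is an illustration under stated assumptions, not the actual code of either validator: the function name, its parameters, and the "utf-8" fallback are all hypothetical, and BOM sniffing is left out entirely.

```python
from typing import Optional, Tuple


def pick_charset(media_type: str,
                 http_charset: Optional[str],
                 xml_decl_encoding: Optional[str],
                 meta_charset: Optional[str],
                 fallback: str = "utf-8") -> Tuple[str, str]:
    """Return (charset, source) for a document; hypothetical helper.

    Assumed order: HTTP Content-Type charset first, then the text/xml
    default, then the XML declaration, then <meta>, then a fallback.
    """
    # 1. A charset in the HTTP Content-Type header always takes precedence.
    if http_charset:
        return http_charset.lower(), "http"
    # 2. text/xml served without a charset parameter is us-ascii
    #    (the default defined by RFC 3023 for that media type).
    if media_type.lower() == "text/xml":
        return "us-ascii", "text/xml default"
    # 3. Otherwise look at the encoding in the XML declaration, if any.
    if xml_decl_encoding:
        return xml_decl_encoding.lower(), "xml declaration"
    # 4. Then at the <meta http-equiv="Content-Type"> charset, if any.
    if meta_charset:
        return meta_charset.lower(), "meta"
    # 5. Nothing found: use a fallback (assumed value) and warn the user,
    #    rather than refusing to validate.
    return fallback, "fallback"


if __name__ == "__main__":
    # HTTP charset present: the in-document declarations do not override it.
    print(pick_charset("text/html", "utf-8", "iso-8859-1", "iso-8859-1"))
    # Only a meta charset present: it is used, not ignored.
    print(pick_charset("text/html", None, None, "ISO-8859-1"))
```

Note that nothing in this order requires a meta charset to exist just because the XML declaration has one, or the other way round.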
* http://www.validome.org/out/ena7005 (and 7006)
  This has nothing to do with validation. If validome emulates some of the features of a link checker, compare it to link checkers, not validators. This test is moot.

* http://www.validome.org/out/ena3002
  This test is bogus. Sorry. An XML declaration also happens to be a proper SGML PI. Giving a warning asking the HTML4 author "are you sure you want this here" may be a good idea. Making this a fatal error is wrong, wrong, wrong.

* http://www.validome.org/out/ena3006
  The comparison page is incorrect. Output of a warning for a shorttag construct is a good thing (the dev version of the W3C validator actually does it) but not required. The current W3C validator's behavior is not wrong.

* http://www.validome.org/out/ena3007
  Ditto. Learn about shorttags. Validome is actually wrong here: this should not be reported as an error, at most as a warning.

HTH,
olivier
--
olivier Thereaux - W3C - http://www.w3.org/People/olivier/
W3C Open Source Software: http://www.w3.org/Status
Received on Wednesday, 26 September 2007 06:09:23 UTC