Re: Test suite validity from Jeremy Carroll on 2004-03-12 (www-qa@w3.org from March 2004)

From: Jeremy Carroll <jjc@hplb.hpl.hp.com>
Date: Fri, 12 Mar 2004 16:14:02 +0000
To: Ian Hickson <ian@hixie.ch>
Cc: www-qa@w3.org
Message-ID: <4051E1CA.7050807@hplb.hpl.hp.com>

I note that one of the points on which I am relatively isolated is the
advantages and disadvantages of putting tests explicitly as HTML in
Technical Reports.


One advantage is that such TRs are routinely put through pubrules, the
link checker, etc., which do some of the quality control for you.

For example, the rule to use either URLs that work or example.{org,net,com}
ones gets enforced without any extra effort.
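
A minimal sketch of that kind of check, for illustration only: this is not
how pubrules or the link checker are actually implemented, and the helper
name `non_example_hosts` and the URL regex are assumptions. It lists the
URLs in a document that are not on a reserved example domain (RFC 2606), and
which would therefore have to resolve:

```python
import re
from urllib.parse import urlparse

# Reserved example domains (RFC 2606). Hypothetical helper, not pubrules itself.
EXAMPLE_DOMAINS = ("example.org", "example.net", "example.com")

# Rough URL matcher (an assumption): stops at whitespace, quotes and brackets.
URL_PATTERN = re.compile(r"""https?://[^\s"'<>)\]]+""")


def non_example_hosts(text):
    """Return the URLs in text whose host is not an example.{org,net,com} domain.

    These are the URLs that would need to actually work.
    """
    flagged = []
    for url in URL_PATTERN.findall(text):
        host = (urlparse(url).hostname or "").lower()
        is_example = any(
            host == domain or host.endswith("." + domain)
            for domain in EXAMPLE_DOMAINS
        )
        if not is_example:
            flagged.append(url)
    return flagged
```

Each URL this returns would then be handed to a link checker; anything on
example.{org,net,com} is skipped.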

I think it is important that test is seen as part of quality as a whole. As
an organization, W3C knows more about document quality and the quality of a
publication process than about test, so piggybacking on that makes sense
to me. For example, the change control process for documents might not be
ideal for tests, but at least there is one.

Having said that, the only way to really ensure that tests are good is to
have enough good implementations actually run them and to fix the bugs in
the tests that are found. Getting to this point means that the test
developers have to make the overall test suite useful enough that
implementors find it a win-win to be engaged enough to repeatedly run the
test suite.

Jeremy

Ian Hickson wrote:

> 
> A week ago I gave a quick talk on the panel on Test Suites at the W3C
> Plenary Day [1], and one of the points I made was the principle that test
> suites should be valid [2]. To illustrate this, I noted that about two
> years ago Tantek found 18 test suites (mostly linked to from the W3C
> pages) that were invalid, and that 12 of those test suites were in fact
> still invalid today.
> 
> I randomly clicked on one of those 12, and it happened to be the NIST DOM
> test suite [3], and yes, it was still invalid.
> 
> The last panelist (Mary Brady, NIST) claimed that what was shown to be
> invalid was simply an old version of the test suite, and that the newest
> version of the test suite *is* valid [4].
> 
> This turns out to be a false assertion, and thus I felt obligated to
> clarify that here.
> 
> It is true that the page from which the new tests can be downloaded is now
> valid, however, that is simply an overview page, and is neither a test
> suite nor a test itself.
> 
> In fact, the tests themselves are not even available at their own URIs
> (despite these being tests hosted by the World Wide _Web_ Consortium).
> This means it is impossible to link to a particular test, and therefore to
> validate it using the W3C validator [5], and provide a URI that
> demonstrates its validity or lack thereof in this case.
> 
> However, curiosity got the best of me so I eventually downloaded the tests,
> and found a number of problems:
> 
> 1. The tests _are_ invalid. (No DOCTYPE, for one, in the HTML
>    versions.)
> 
> 2. Every test is in the region of 4k, about 3k more than should be
>    necessary for even a complex test.
> 
> 3. The harness is so complicated it doesn't even support Opera. (Test
>    harnesses shouldn't even remotely be complicated enough that that
>    is a problem, as I mentioned in my talk.)
> 
> 4. I couldn't determine the pass condition of any of the tests I
>    randomly picked. As far as I can tell they _require_ the harness.
>    This makes the tests basically useless for QA purposes.
> 
> 5. Even on a supported browser I couldn't determine how to use the
>    harness without detailed examination of the instructions.
> 
> 6. Even once I'd got that working, all I got was a list of failed
>    tests, with no easy way of getting from there to the page to
>    examine the exact problem (e.g. with a debugger).
> 
> 7. I couldn't understand the tests even after looking at them, despite
>    very good familiarity with the DOM specifications.
> 
> To some extent this is not new information. It has been pointed out to me
> that over a year and a half ago Brad Pettit (Microsoft) proposed a set of
> principles for the DOM test suites based on the document that I used as
> the basis for my presentation:
> 
>    http://lists.w3.org/Archives/Public/www-dom/2002JulSep/0080.html
> 
> It covered a number of these problems (tests are invalid, harness is too
> complex, harness is required, tests are not atomic, test suite is not easy
> to use), and yet none of these problems were subsequently addressed.
> 
> Sadly, it appears this is not the only test suite with these problems. I
> looked at some other test suites, such as the SMIL test suite, and was
> shocked to see that many of the tests aren't even valid _SMIL_, let alone
> HTML. For instance, this test uses invalid, non-W3C, IE-specific hooks to
> test a SMIL feature, instead of using the SMIL namespace (note the
> fact that the MIME type is text/html, not an XML MIME type):
> 
>    http://www.w3.org/2001/SMIL20/testsuite/interop2/animation/set_attName_left_target_begin_dur_repeatDur.htm
> 
> I could find nothing in the SMIL specification to suggest that this test
> should even remotely work as described. (If I am wrong in this regard I
> would love to be corrected.)
> 
> (I should admit that the Selectors test suite [6], for which I am
> responsible, is also known to contain a number of validation errors [7].
> These were noticed recently and are being corrected [8].)
> 
> In conclusion, I am concerned that even amongst those of us who subscribe
> to the notions I described in my talk last week, there is a large gap
> between the theory and the practice.
> 
> -- References --
> [1] http://www.w3.org/2004/03/TechPlenAgenda.html
> [2] http://www.w3.org/Style/CSS/Test/testsuitedocumentation.html#validtests
> [3] http://xw2k.sdct.itl.nist.gov/xml/dom-test-suite.html
> [4] available from http://www.w3.org/DOM/Test/
> [5] http://validator.w3.org/
> [6] http://www.w3.org/Style/CSS/Test/CSS3/Selectors/current/
> [7] http://lists.w3.org/Archives/Member/w3c-css-wg/2004JanMar/0274.html
> [8] http://lists.w3.org/Archives/Member/w3c-css-wg/2004JanMar/0275.html
> 
>
Received on Friday, 12 March 2004 11:15:29 UTC