
Test suite validity

From: Ian Hickson <ian@hixie.ch>
Date: Fri, 12 Mar 2004 12:35:32 +0000 (UTC)
To: www-qa@w3.org
Cc: Tantek Çelik <tantek@cs.stanford.edu>, dom@w3.org, wchang@nist.gov, tmichel@w3.org, mary.brady@nist.gov, bradp@microsoft.com
Message-ID: <Pine.LNX.4.58.0403121204370.23385@dhalsim.dreamhost.com>


A week ago I gave a quick talk on the panel on Test Suites at the W3C
Plenary Day [1], and one of the points I made was the principle that test
suites should be valid [2]. To illustrate this, I noted that about two
years ago Tantek found 18 test suites (mostly linked to from the W3C
pages) that were invalid, and that 12 of those test suites were in fact
still invalid today.

I randomly clicked on one of those 12, and it happened to be the NIST DOM
test suite [3], and yes, it was still invalid.

The last panelist (Mary Brady, NIST) claimed that what was shown to be
invalid was simply an old version of the test suite, and that the newest
version of the test suite *is* valid [4].

This turns out to be a false assertion, and thus I felt obliged to
clarify that here.

It is true that the page from which the new tests can be downloaded is now
valid. However, that is simply an overview page, and is neither a test
suite nor a test itself.

In fact, the tests themselves are not even available at their own URIs
(despite these being tests hosted by the World Wide _Web_ Consortium).
This means it is impossible to link to a particular test, and therefore
impossible to validate it using the W3C validator [5] and provide a URI
that demonstrates its validity (or, in this case, lack thereof).

However, curiosity got the best of me so I eventually downloaded the tests,
and found a number of problems:

1. The tests _are_ invalid. (No DOCTYPE, for one, in the HTML
   versions.)

2. Every test is in the region of 4k, about 3k more than should be
   necessary for even a complex test.

3. The harness is so complicated it doesn't even support Opera. (Test
   harnesses shouldn't even remotely be complicated enough that that
   is a problem, as I mentioned in my talk.)

4. I couldn't determine the pass condition of any of the tests I
   randomly picked. As far as I can tell they _require_ the harness.
   This makes the tests basically useless for QA purposes.

5. Even on a supported browser I couldn't determine how to use the
   harness without detailed examination of the instructions.

6. Even once I'd got that working, all I got was a list of failed
   tests, with no easy way of getting from there to the page to
   examine the exact problem (e.g. with a debugger).

7. I couldn't understand the tests even after looking at them, despite
   very good familiarity with the DOM specifications.
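
Points 1 and 2 are the kind of thing that can be checked mechanically
before a suite is published. What follows is only a rough sketch, not
anything the DOM test suite actually ships: the function names are made
up for illustration, and the 1k limit is simply my reading of "4k, about
3k more than necessary" above. Note also that finding a DOCTYPE is a
necessary condition for validity, not a sufficient one; real validation
still needs a validator such as [5].

```python
import re
from pathlib import Path

# Assumed limit, taken from the "about 3k more than necessary" estimate.
MAX_TEST_BYTES = 1024

# A DOCTYPE at the start of the file, allowing a byte-order mark,
# leading whitespace and an optional XML declaration before it.
DOCTYPE_RE = re.compile(
    r"\ufeff?\s*(?:<\?xml[^>]*\?>\s*)?<!DOCTYPE\s",
    re.IGNORECASE,
)


def check_test_source(source: str, max_bytes: int = MAX_TEST_BYTES) -> list[str]:
    """Return the problems found in the source text of one test."""
    problems = []
    if not DOCTYPE_RE.match(source):
        problems.append("no DOCTYPE")
    size = len(source.encode("utf-8"))
    if size > max_bytes:
        problems.append(f"{size} bytes (over {max_bytes})")
    return problems


def check_directory(root: str) -> dict[str, list[str]]:
    """Check every .html/.htm file under root; map file path to problems."""
    report = {}
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix.lower() in {".html", ".htm"}:
            text = path.read_text(encoding="utf-8", errors="replace")
            problems = check_test_source(text)
            if problems:
                report[str(path)] = problems
    return report
```

Running check_directory() over an unpacked copy of a test suite gives a
list of files worth a closer look. It says nothing about points 3 to 7,
which are about whether a human can read and run a test unaided.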

To some extent this is not new information. It has been pointed out to me
that over a year and a half ago Brad Pettit (Microsoft) proposed a set of
principles for the DOM test suites based on the document that I used as
the basis for my presentation:

   http://lists.w3.org/Archives/Public/www-dom/2002JulSep/0080.html

It covered a number of these problems (tests are invalid, harness is too
complex, harness is required, tests are not atomic, test suite is not easy
to use), and yet none of these problems were subsequently addressed.

Sadly it appears this is not the only test suite with these problems. I
looked at some other test suites, such as the SMIL test suite, and was
shocked to see that many of the tests aren't even valid _SMIL_, let alone
HTML. For instance, this test uses invalid, non-W3C, IE-specific hooks to
test a SMIL feature, instead of using the SMIL namespace (note the
fact that the MIME type is text/html, not an XML MIME type):

   http://www.w3.org/2001/SMIL20/testsuite/interop2/animation/set_attName_left_target_begin_dur_repeatDur.htm

I could find nothing in the SMIL specification to suggest that this test
should even remotely work as described. (If I am wrong in this regard I
would love to be corrected.)

(I should admit that the Selectors test suite [6], for which I am
responsible, is also known to contain a number of validation errors [7].
These were noticed recently and are being corrected [8].)

In conclusion, I am concerned that even amongst those of us who subscribe
to the notions I described in my talk last week, there is a large gap
between the theory and the practice.

-- References --
[1] http://www.w3.org/2004/03/TechPlenAgenda.html
[2] http://www.w3.org/Style/CSS/Test/testsuitedocumentation.html#validtests
[3] http://xw2k.sdct.itl.nist.gov/xml/dom-test-suite.html
[4] available from http://www.w3.org/DOM/Test/
[5] http://validator.w3.org/
[6] http://www.w3.org/Style/CSS/Test/CSS3/Selectors/current/
[7] http://lists.w3.org/Archives/Member/w3c-css-wg/2004JanMar/0274.html
[8] http://lists.w3.org/Archives/Member/w3c-css-wg/2004JanMar/0275.html

-- 
Ian Hickson                                      )\._.,--....,'``.    fL
U+1047E                                         /,   _.. \   _\  ;`._ ,.
http://index.hixie.ch/                         `._.-(,_..'--(,_..'`-.;.'
Received on Friday, 12 March 2004 07:35:34 GMT
