Validator(s)' Test suite - requirements (thoughts) from olivier Thereaux on 2004-02-10 (public-qa-dev@w3.org from February 2004)

From: olivier Thereaux <ot@w3.org>
Date: Tue, 10 Feb 2004 10:44:54 +0900
To: public-qa-dev@w3.org
Message-Id: <B99F8771-5B6A-11D8-998D-000393A63FC8@w3.org>
Here are my thoughts on the topic of a test suite for our various 
validators (including checklink by stretching the definition a bit).

I'm mixing the ideas of a test suite and test framework because I 
believe the former must be designed precisely enough to allow us to 
build the latter. If we don't fix this requirement for the test suite, 
I guess a simple list of document (or something a bit more complex, as 
Björn just did).

We want a system that allows, in priority (use cases):
- a large (enough) collection of test
- (very easy) contribution of a test case with, ideally, each bug report
- testing of several instances of the validator + comparison of 
different instances + comparison with expected result (so a fixed list 
of validator-uri is a no-no)
- Classification of test. possibility to run only a subset of the suite

The main issue... Is to build a test suite for a test tool (duh). This 
brings one (or even several) levels of complexity that I don't think 
are too common in software testing (but I don't claim to be an espert 
on the matter).

Let us just pretend for a moment that a test case is a document (I have 
ideas to refine that, which I will explain later). Considering a test 
case linked to the validation engine (as opposed to UI, formatting, 
etc, which the system should also address). What's the expected result? 
The result that the production validator gives, and which, for 
consistency, the tested version is supposed to give, or is it the known 
(or guessed) validity of the test case wrt a given DTD? This is not 
such a complicated question, I guess, and once we make a clear 
distinction between the expected (validity) and usual results, it is 
possible to envision having the expected result "hardcoded" with the 
test case, and keep former results (including the reference results 
from the current prod tool) in EARL or whatever other format suits the 
purpose (EARL looks quite good).

What would the system do (taken and completed from Björn's list of 
200402 meeting):
  - read list of cases
  - filter (keep only subset we want to test)
  - run the tool
	* get outcome (EARL format?)
	* scrape results (in whatever output we want to test, usually XHTML I 
suppose)
  - Check UI [if included in the test description?] => escaping / URIs / 
dialogs / proper output according to options (source tree etc.)
  - Compare test result with expected outcome
  - compare test result with known results from other instances

I gave an incomplete listing of UI tests, and the process assumes that 
we also test the validation engine. I remember being told that testing 
the core (openSP / libs) was not ours to do, and I guess to some extent 
that is true, but I suppose a few tests could be used to check that at 
least all these components are present (e.g that the lib is up to date 
and that all the right DTDs are known)

 From all the thoughts above I can start outlining how a test would be 
described:
- URI of the tested document
- tools' options (to be appended/sent in a GET/POST)
- DTD used
- validity wrt DTD used
+ well formed (if XML)*
- long description (not fond of having the document describe itself)

(*)e.g v.w.o/dev/tests/xhtml-mathml2.html is well formed but not valid 
(AFAIK)



All for now. Please give your opinion or provide additions so that I 
can write a complete req. document, for the record.

thanks.
-- 
olivier
Received on Monday, 9 February 2004 20:45:02 UTC