Re: Markup Validator Test Suite from olivier Thereaux on 2004-10-15 (public-qa-dev@w3.org from October 2004)

From: olivier Thereaux <ot@w3.org>
Date: Fri, 15 Oct 2004 10:40:44 +0900
To: Bjoern Hoehrmann <derhoermi@gmx.net>
Cc: public-qa-dev@w3.org
Message-Id: <3AEA4907-1E4B-11D9-A806-000393A80896@w3.org>
Bjoern, thanks for developing your thoughts, and sorry for not 
answering earlier, I was rather busy preparing an important meeting 
on... testing.

I generally think what you propose is sound and applicable, with a few 
doubts. In order to explain the doubts, I'll give a small timeline and 
background info on our discussion.

We currently only have a "bag" of test cases, mostly self-describing, 
and mostly documented only as valid, invalid, or "the validator chokes 
on it". We all agreed long ago that this was not sufficient, and 
proceeded to study ways to build something better.


A short list of criteria for our test suite would be:
1) Must be applicable to automated tests (yes/no, valid/invalid, 
wellformed/not...) and human-driven tests (UI glitches) and a gray area 
in between (pattern matching to check that the verbose option actually 
triggers verbose output)
2) Must cover critical features, known document types
3) Must cover all bugs known at time t
4) Ideally, would be tied one way or another to our bugzilla
5) Should have sub-collections of tests for different "modules"
6) Must evolve with the development of the validator, and be able to 
test old versions as well as the "newest, latest, greatest"

I worked on a system based on cataloguing and documenting test cases, 
which quite sadly never received much more attention than "uh, 
whatever", it seems.


Then, following our effort on modularization, Bjoern suggested going 
with Test::Builder. This is not a bad idea, given that we're pretty 
committed to having a platform in Perl for a while anyway, and that 
Test::Builder is a good mechanism to build and run test collections at 
the same time.

T::B would pass criteria 2 and 3, provided we seriously manage the 
tests. Basically, these two criteria depend on us, not so much on what 
system we use.

T::B is also a rather elegant solution for 5) [[sub-collections of 
tests for different "modules"]], since each operational module of the 
validator could have its own collection of tests.
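
As a rough illustration of 5), here is a minimal sketch of one test 
collection per module, using Test::More (which is built on 
Test::Builder) and its subtest feature. The two functions 
detect_charset and detect_doctype are hypothetical stand-ins, not 
actual validator code, and in practice each collection could just as 
well live in its own test file.

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical stand-ins for two validator modules (illustration only).
sub detect_charset {
    my ($content_type) = @_;
    return $content_type =~ /charset=([\w-]+)/i ? lc $1 : undef;
}

sub detect_doctype {
    my ($doc) = @_;
    return $doc =~ /<!DOCTYPE\s+html\s+PUBLIC\s+"([^"]+)"/i ? $1 : undef;
}

# One sub-collection of tests per module.
subtest 'charset module' => sub {
    is(detect_charset('text/html; charset=UTF-8'), 'utf-8',
        'charset found in Content-Type');
    is(detect_charset('text/html'), undef,
        'no charset parameter');
};

subtest 'doctype module' => sub {
    is(detect_doctype('<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN">'),
        '-//W3C//DTD HTML 4.01//EN',
        'public identifier extracted');
};

done_testing();
```

Each subtest reports as a single result in the enclosing run, so a 
failing module is visible at a glance.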

6) [[Must evolve with the development of the validator, and be able to 
test old versions as well as the "newest, latest, greatest"]] is more of 
a problem for a T::B-based test suite, because it would not be able to 
test versions prior to the switch to the appropriate architecture. This 
in itself is not a showstopper, and after a while we would still be 
able to test different versions of the validator and compare the test 
results...

*However*, as far as I understand, the test suite bundled in the 
distribution of version n of the validator will be able to test version 
n and n-1. But the tests included with version n-1 will only be a 
subset of the tests with n, and in particular they won't include tests 
derived from bug reports received on version n-1. Granted, we know n-1 
will likely fail these tests, since they're based on n-1's bugs, but 
how about n-2? To me this is the greatest drawback of a test suite 
that is not independent of the product tested (and the reason why I 
wanted an independent catalogue of test cases). Unless there is a way 
to backport the tests? I doubt that.
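
To make the idea of an independent catalogue more concrete, here is a 
minimal sketch in which test cases are plain data kept outside any 
release, and the same catalogue is run against whichever version is at 
hand. Everything in it is an assumption made for illustration: the 
entry fields, the example.org URIs, the placeholder bug reference, and 
the two stand-in "versions" (simple code references rather than real 
validator installations).

```perl
use strict;
use warnings;

# Hypothetical catalogue: plain data, not bundled with any one version.
my @catalogue = (
    { id     => 'valid-html401',
      uri    => 'http://example.org/tests/valid-html401.html',
      expect => 'valid' },
    { id     => 'missing-doctype',
      uri    => 'http://example.org/tests/missing-doctype.html',
      expect => 'invalid',
      bug    => 'placeholder-bug-id' },   # not a real bugzilla entry
);

# Run every case against one validator version.
# $check is a code reference taking a URI and returning 'valid' or 'invalid'.
sub run_catalogue {
    my ($check, @cases) = @_;
    my %report;
    for my $case (@cases) {
        my $got = $check->($case->{uri});
        $report{ $case->{id} } = $got eq $case->{expect} ? 'pass' : 'fail';
    }
    return \%report;
}

# Stand-ins for two versions; real ones would call an actual installation.
my %versions = (
    'n-1' => sub { 'valid' },    # pretends every document is valid
    'n'   => sub { $_[0] =~ /missing-doctype/ ? 'invalid' : 'valid' },
);

for my $version (sort keys %versions) {
    my $report = run_catalogue($versions{$version}, @catalogue);
    print "$version: ",
        join(', ', map { "$_=$report->{$_}" } sort keys %$report), "\n";
}
```

Because the catalogue is only data, a case derived from a bug found in 
n-1 can be run unchanged against n-2 or any other version that offers 
the same kind of check.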


Noodling on 4) [[ tying to bugzilla ]] a little. Should we use bugzilla 
to store test cases? It does not seem to let us query for test cases, 
though, which is a pity. We could decide to use the "uri" field for 
the test document's URI, but that is probably too limiting given how 
this field is also used to point to a discussion related to the issue. 
So, no automatic tying of bugzilla with the test cases. But we could 
still do something a la SVG WG, setting a rule that no bug is accepted 
if a test case is not attached to it (thus allowing us to add it to our 
repository of test cases).


Finally 1) [[  Must be applicable to automated tests .. and 
human-driven tests .. and a gray area in between ]] is, I think, the 
weak point of a test suite solely based on T::B. Bjoern was the first 
one to point out that a fully automated test framework was not the way 
to go with the validator.

There is, of course, this "gray area" I was referring to, but testing 
UI without a "U" forces us into tricks that are lame, or at best fragile:

>   sub has_errors
>   {
>       my $self = shift;
>       my $mess = shift || 'Results contain errors';
>       $Test->like($self->{res}->content, qr/<ol id="errors">/, $mess);
>   }

Well, the example above is not so bad if the semantics of the test are 
"did we output the ordered list delimiter", but it is less convincing 
if it's supposed to test whether there were errors.
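
A small sketch of why such a check is fragile: the pattern below is the 
one from the quoted has_errors method, applied to two made-up result 
pages that report the same error with different markup. Both pages and 
the helper name looks_like_errors are invented for illustration and are 
not actual validator output.

```perl
use strict;
use warnings;

# The same pattern as in the quoted has_errors method.
my $pattern = qr/<ol id="errors">/;

# Two hypothetical result pages reporting the same error;
# only the markup around the message differs.
my $page_a = '<ol id="errors"><li>Line 3: end tag for "p" omitted</li></ol>';
my $page_b = '<ul class="errors"><li>Line 3: end tag for "p" omitted</li></ul>';

# Returns 1 if the page matches the markup pattern, 0 otherwise.
sub looks_like_errors {
    my ($html) = @_;
    return $html =~ $pattern ? 1 : 0;
}

printf "page A: %d\n", looks_like_errors($page_a);   # prints "page A: 1"
printf "page B: %d\n", looks_like_errors($page_b);   # prints "page B: 0"
```

The second page still reports an error, yet the check says otherwise: 
what is being tested is the markup, not the presence of errors.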

I'm also wondering about the fact that, in theory at least, 
Test::Builder *could* be used to create human-centered testing, ranging 
from a crude command-line interface telling the user "go to that page, 
do you see a red square? [Y/n]" to perhaps something more 
sophisticated?
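
A minimal sketch of the crude end of that range, assuming nothing more 
than Test::Builder's ok, diag and done_testing methods. The helpers 
is_yes and ask_human are hypothetical names, and the answers are read 
from an in-memory filehandle here only so that the sketch runs 
unattended; an interactive run would read from STDIN instead.

```perl
use strict;
use warnings;
use Test::Builder;

my $Test = Test::Builder->new;

# An empty (or missing) answer counts as "yes", following the [Y/n] convention.
sub is_yes {
    my ($answer) = @_;
    $answer = '' unless defined $answer;
    return $answer =~ /^\s*n/i ? 0 : 1;
}

# Hypothetical helper: show a question, read one line, record it as a test.
sub ask_human {
    my ($question, $in) = @_;
    $in ||= \*STDIN;
    $Test->diag("$question [Y/n]");
    my $answer = <$in>;
    return $Test->ok(is_yes($answer), $question);
}

# Canned answers: an explicit "y", then an empty line (the default, yes).
my $answers = "y\n\n";
open my $canned, '<', \$answers or die "cannot open in-memory handle: $!";

ask_human('Go to the results page: do you see a red square?', $canned);
ask_human('Is the error list shown above the source listing?', $canned);

$Test->done_testing;
```

The human's answers end up in the same pass/fail report as the 
automated tests, which is the main appeal of doing it this way.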

-- 
olivier
Received on Friday, 15 October 2004 01:40:49 UTC