Re: Automated Test Runner from James Graham on 2011-02-24 (public-html-testsuite@w3.org from February 2011)

From: James Graham <jgraham@opera.com>
Date: Thu, 24 Feb 2011 23:57:46 +0100 (CET)
To: "L. David Baron" <dbaron@dbaron.org>
cc: James Graham <jgraham@opera.com>, Kris Krueger <krisk@microsoft.com>, Anne van Kesteren <annevk@opera.com>, "public-html-testsuite@w3.org" <public-html-testsuite@w3.org>, "Jonas Sicking (jonas@sicking.cc)" <jonas@sicking.cc>
Message-ID: <alpine.DEB.2.00.1102242329300.2900@sirius>

On Fri, 18 Feb 2011, L. David Baron wrote:

>>> The number of tests isn't important (and is not a good measure of
>>> testing coverage); what matters is whether any of them failed.
>>
>> The number of tests is important. If you expect a test file to
>> return 100 results and you only get 50 then something went wrong,
>> even if all 50 results were reported as pass.
>>
>> I agree that forcing people to add this metadata manually is not the
>> nicest approach. But I can't think of a better one either.
>
> Two things solve the problem of a test unexpectedly terminating
> without actually finishing:
>
> (1) the harness goes on to the next test when the current test
> tells the harness that it is finished, so if the test never says
> it's finished, the run stops.  (And this is needed anyway to run
> anywhere close to efficiently; allotting tests a fixed amount of
> time is a huge waste of time.)

The current design has a per-file timeout, so if the results are never
reported the test runner should be able to keep going. Of course, if you
trigger a serious bug such as deadlocking the script scheduler, you might
not be able to recover (unless you use an Opera-like approach of driving
tests from outside the browser; that is obviously not cross-browser,
though, and so won't work for the public testsuite).
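
As a rough illustration of the per-file timeout idea (a sketch only, not
the actual runner code; runFileWithTimeout, its arguments and the
"OK"/"TIMEOUT" labels are made-up names), the runner moves on either when
the file reports or when the timer fires, whichever comes first:

```javascript
// Hypothetical helper: run one test file under a per-file timeout.
// start(done) begins the file; the file calls done(results) to report.
// timers is an optional, injectable pair of set/clear functions
// (an assumption of this sketch, so it can be tried without real delays).
function runFileWithTimeout(start, timeoutMs, onComplete, timers) {
  var t = timers || {
    set: function (fn, ms) { return setTimeout(fn, ms); },
    clear: function (handle) { clearTimeout(handle); }
  };
  var settled = false;

  var handle = t.set(function () {
    if (settled) return;
    settled = true;
    // The file never reported: record a timeout and let the runner continue.
    onComplete({ status: "TIMEOUT", results: [] });
  }, timeoutMs);

  start(function (results) {
    if (settled) return; // a report arriving after the timeout is ignored
    settled = true;
    t.clear(handle);
    onComplete({ status: "OK", results: results });
  });
}
```

Note that this only works while the page can still run timers; it does
not help with the deadlock case mentioned above.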

Such a timeout doesn't fix the underlying problem, however. If I have

<script src=sometests.js></script>
<script src=somemoretests.js></script>

and for whatever reason somemoretests.js fails to load, the test page 
could run to completion and utterly fail to notice that many of the tests 
were never run. I don't doubt that there are other ways of achieving this 
without a whole script failing to execute.
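
To make the point concrete, here is a minimal sketch of the kind of check
an expected-count annotation allows. summarizeFile, the result objects
and the "PASS" status string are hypothetical names for illustration, not
the harness's real API:

```javascript
// Hypothetical check: compare the results a file reported against the
// number of results its author declared it should produce.
function summarizeFile(results, expectedCount) {
  var failed = results.filter(function (r) {
    return r.status !== "PASS";
  }).length;

  // A count mismatch is an error even when every reported result passed,
  // e.g. because a second script file never loaded.
  if (results.length !== expectedCount) {
    return {
      ok: false,
      reason: "expected " + expectedCount + " results, got " + results.length
    };
  }
  if (failed > 0) {
    return { ok: false, reason: failed + " failed" };
  }
  return { ok: true, reason: "all " + expectedCount + " results passed" };
}
```

With 50 passing results and an expected count of 100, this reports a
problem, whereas looking only at failures would report a clean run.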

> (2) an onerror handler catches uncaught exceptions or script parse
> errors, counts them as a failure, and goes on.

This seems like a dangerous approach. Some tests may require syntactically 
invalid scripts because that is what they test (tests for window.onerror 
itself seem like a good example of this). Baking in the assumption that 
anything caught by window.onerror is a fault seems unworkable, so one 
would have to come up with some complex scheme where some things caught by 
window.onerror were expected and some were not.
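
For illustration, the simplest form of such a scheme might look like the
sketch below, where each test page declares up front which errors it
expects. classifyError and the pattern list are hypothetical, and since
error message text differs between browsers, matching on it is fragile,
which is part of the problem:

```javascript
// Hypothetical classifier: decide whether an error that reached
// window.onerror was one the test page declared it expected.
// expectedPatterns is an array of RegExp objects supplied by the test
// (avoid the "g" flag, which makes RegExp.prototype.test stateful).
function classifyError(message, expectedPatterns) {
  for (var i = 0; i < expectedPatterns.length; i++) {
    if (expectedPatterns[i].test(message)) {
      return "expected";
    }
  }
  return "unexpected";
}

// In a page this might be wired up roughly as:
//   window.onerror = function (message) {
//     if (classifyError(message, [/SyntaxError/]) === "unexpected") {
//       /* record a failure */
//     }
//   };
```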

I am also reluctant to depend on window.onerror because it is a feature of
the specification that we are trying to test. As far as possible (and it
is clearly not possible to be perfect in this respect) the underlying test
harness tries to depend on few HTML/DOM features. window.onerror seems
particularly bad as it is not uniformly implemented and has a patchy
security history, which suggests it may be further locked down in the
future (e.g. by not reporting errors for cross-domain scripts).

Received on Thursday, 24 February 2011 22:58:36 UTC