Re: Request for Feedback On Test Harness

On 11/30/2010 11:35 AM, David Carlisle wrote:
> On 30/11/2010 09:45, James Graham wrote:
>> I am looking for some feedback on the test harness script
>> testharness.js (note that this would better have been called a
>
> I've only written some very simplistic tests using this harness, but I
> have to say it worked as expected and the syntactic overhead required to
> wrap each test/assertion was no effort at all compared to the effort in
> thinking of what to test.
>
> The main things that I was confused about are perhaps due to lack of
> documentation (or not, I can't tell, hence the confusion) so some
> questions:
>
> When should this harness be used? (Many of the currently approved tests
> don't appear to use it.)

Yes, there are tests that predate the harness. The long-term goal is to 
use the same harness for everything, which might require some 
refactoring of existing tests.

> How are results collected? The test produces a nice pass/fail display
> in the browser, but is there a mechanism for collecting the results
> from different test files and across different test runs? I assume
> so, but didn't understand how that was supposed to work.

Yes, although this is still somewhat a work in progress. If tests using 
the harness are run in a nested browsing context, they will attempt to 
report the results back to the top-level browsing context. Therefore the 
results from multiple test files can be collected by loading each test 
file in an iframe, waiting for its results, loading the next file, and 
so on.
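
To make the collection step concrete, here is a rough sketch of how results from several files might be aggregated. The shape of the result objects and the function names are my own illustration, not the actual testharness.js API; in a browser, each per-file result list would come from loading the file in an iframe and waiting for it to report back before loading the next one.

```javascript
// Statuses mirror the three states the harness distinguishes.
const PASS = "PASS", FAIL = "FAIL", TIMEOUT = "TIMEOUT";

// Combine the result lists from several test files into one summary.
// Each entry of fileResults is the list of results one file reported
// back to the top-level browsing context.
function summarize(fileResults) {
  const summary = { PASS: 0, FAIL: 0, TIMEOUT: 0 };
  for (const results of fileResults) {
    for (const r of results) {
      summary[r.status] += 1;
    }
  }
  return summary;
}

// Hypothetical results from two test files:
const report = summarize([
  [{ name: "a", status: PASS }, { name: "b", status: FAIL }],
  [{ name: "c", status: PASS }, { name: "d", status: TIMEOUT }],
]);
console.log(report); // { PASS: 2, FAIL: 1, TIMEOUT: 1 }
```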

> The harness (and the test suite generally) seems based on a binary
> pass/fail state for tests. The MathML and XQuery test suites had a
> standard reporting form (XML, but JSON would have done :-) and
> possible states for each test of pass/fail/partial-pass/broken/untested;
> this allows a combined results display such as
>
> http://www.w3.org/Math/testsuite/results/tests.html
>
> for MathML.

Interesting. We decided that binary states were the simplest thing that 
could possibly work, so the initial design only allows for those two 
states (well, actually three: PASS, FAIL, and TIMEOUT, but the 
difference between TIMEOUT and FAIL is primarily of interest to people 
writing tests, and I would not expect a report to distinguish them). 
What did you use the additional states for? The difference between 
"fail" and "broken" seems non-obvious to me; one would not expect test 
files that are inherently broken to be long-lived. And "partial-pass" 
seems equivalent to "fail", assuming sufficiently atomic tests.
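
To spell out that argument, here is a sketch (the function and state names are mine, drawn from the discussion above rather than from either harness) of how a richer result vocabulary might collapse onto the binary model, assuming each test is atomic, i.e. checks exactly one thing:

```javascript
// Map a five-state result onto the harness's binary model.
// Assumes atomic tests, so any partial pass implies something failed.
function toBinary(status) {
  switch (status) {
    case "pass":
      return "PASS";
    case "partial-pass": // with atomic tests, a partial pass is a fail
    case "broken":       // an inherently broken file also reads as a fail
    case "fail":
    default:             // "untested" could equally be excluded entirely
      return "FAIL";
  }
}

console.log(toBinary("pass"));         // PASS
console.log(toBinary("partial-pass")); // FAIL
console.log(toBinary("broken"));       // FAIL
```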

Received on Tuesday, 30 November 2010 10:53:58 UTC