- From: James Graham <jgraham@opera.com>
- Date: Tue, 30 Nov 2010 10:45:00 +0100
- To: "'public-html-testsuite@w3.org'" <public-html-testsuite@w3.org>
I am looking for some feedback on the test harness script testharness.js (note that this would better have been called a "framework", but I will continue to use the term harness throughout). In particular, if there are requirements people have for writing tests that have not yet been considered, or rough edges in the harness that should be fixed, it would be good to know about them now so that the problems can be addressed. Primarily I am interested in feedback about the design and API, since those are harder to fix later. However, comments on the implementation are also welcome; I know of a few problems already that I intend to address. To put the discussion in context, I think it will be useful to elaborate on the design goals of the current harness and provide some details of how it tries to meet them.

== One or Multiple Tests per File ==

Although it is often possible to have just a single test per file, in some cases this is not efficient, e.g. when generating many tests from a relatively small amount of data. Nevertheless, it should be possible to regard the tests as independent from the point of view of collecting results, i.e. it should not be necessary to collapse many tests down into a single result just to keep the test harness happy. Obviously, people using this ability have to be careful not to make one test depend on state created by another test in the same file, regardless of what happens in that test.

For this reason the harness separates the concept of a "test" from the concept of an "assertion". One may have multiple tests per file and, for readability (see below), each may have multiple assertions. This also strengthens the requirement (below) to catch all errors in each test so they do not affect other tests.

== Suitable for writing both synchronous and asynchronous tests ==

Many DOM APIs are asynchronous, and testing these APIs must be well supported by the test harness. It is also a useful optimization to be able to write simple tests in a synchronous fashion, because e.g. checking that some DOM attribute has a given value for some input markup is a common sort of problem, and such tests should be correspondingly easy to write. The harness has explicit support for both sync and async tests through the sync and async methods; a short sketch of both styles is given after the sections below.

== Minimal Dependence on Correct HTML Implementation ==

If the test harness itself depends on HTML features being correctly implemented, it is rather hard to use it to test those features. As far as possible it has been designed to use only ECMAScript and DOM Core features.

== Robust Against Unexpected Failure ==

Tests may fail not just because of the particular assertions being checked, but because of implementation bugs affecting the test, or because of some unexpected brokenness caused by the test environment. In general it is not a safe assumption that the test author has verified that the test fails only in the expected way in all implementations of interest, or that they have written the test to be defensive against unexpected errors. As far as possible, such errors should affect the minimum number of tests, i.e. on a page containing multiple tests a single unexpected failure should not stop all the other tests from executing.

To deal with this problem, each test is run in a try/catch block. Any error, whether raised by an assertion or by an unexpected bug in the implementation, is caught and causes that test to fail. Other tests on the same page remain unaffected by the error.
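To make the sync/async split concrete, here is a rough sketch of a file containing one synchronous and one asynchronous test. The names used (test, async_test, step, done, assert_equals) are the ones I believe the current harness exposes; treat the sketch as illustrative rather than as documentation:

    <script src="testharness.js"></script>
    <script>
    // Synchronous test: the function runs immediately inside try/catch,
    // so an unexpected exception fails only this test.
    test(function() {
        var div = document.createElement("div");
        assert_equals(div.tagName, "DIV", "tagName of a created div");
    }, "createElement gives an uppercase tagName");

    // Asynchronous test: assertions are wrapped in a step, and the
    // result is only reported once done() is called.
    var t = async_test("document is complete after the load event");
    window.onload = function() {
        t.step(function() {
            assert_equals(document.readyState, "complete", "readyState");
        });
        t.done();
    };
    </script>

Since each call to test() or async_test() creates an independent test, a single file can contain many of them, each reported as a separate result, and an unexpected exception in one does not prevent the others from running.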
== Consistent, easy to read assertions ==

In order to make it clear what a test is aiming to check, a rich, descriptive assertion API is helpful. In particular, it is desirable to avoid a style where test authors are tempted to write "passed = condition1 && condition2 && condition3; assert(passed)", since this can make tests complex to follow. Such a rich API also allows common, complex operations to be factored out into the harness rather than reimplemented in different ways by each individual author. A good example of this is testing that a property is "readonly". This can be done more or less comprehensively and, depending on Web IDL, its exact meaning may change (this happened recently, for example). By factoring the readonly check out into a specific assertion, all tests for readonly attributes can be performed in the same way and get updated together if necessary. This also helps to make tests written by a diverse range of authors easier to compare, since it follows the Pythonic principle that "there should be one (and preferably only one) obvious way to do it". To this end, the harness has a rich set of assertions that can be invoked using assert_* functions, such as assert_readonly (currently fewer than I would like, but that is a quality-of-implementation issue that can be fixed). A short example contrasting the two styles is included at the end of this message.

== Good Error Reporting ==

As far as possible, the harness should make it clear what failed and why. In general it is not possible to get the stack out of an exception in a generic way, but since the assertion functions are high-level, the harness can report exactly what was expected and what occurred instead. Individual assertions can also be labelled to further improve error reporting. In the case of unexpected errors, the error message from the error object is displayed.

== Easy to Write Tests ==

Tests should be as easy as possible to write, so that people mostly write tests that use the harness well and are easy to follow, and so that writing tests is not too burdensome. This is aided by the rich assertion API, since one does not have to repeat the code to correctly check for various things again and again. There is some overhead in the harness due to the need to structure async tests into steps and the use of callback functions to wrap individual steps. However, given the other requirements, it is difficult to see how to avoid this; a large fraction of the overhead is purely JavaScript syntax ("function() {}"), and, I think, the need to structure tests in a consistent way is a boon to readability.
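Finally, to illustrate the difference between the condition-flag style and the assertion API in practice, here is a small comparison. Again, the assertion names (assert_true, assert_equals, assert_readonly) and in particular the (object, property_name, description) signature I show for assert_readonly reflect my understanding of the current harness rather than anything definitive:

    var elem = document.createElement("div");

    // The discouraged style: on failure all you learn is that
    // "passed" ended up false, not which condition went wrong.
    var passed = elem.localName === "div" && elem.nodeType === 1;
    assert_true(passed);

    // With descriptive assertions each check reports the expected and
    // actual values, plus an optional label for the failure message.
    assert_equals(elem.localName, "div", "localName");
    assert_equals(elem.nodeType, 1, "nodeType of an element");
    // assert_readonly factors the fiddly readonly check out into one place.
    assert_readonly(elem, "nodeType", "nodeType should be readonly");

If the Web IDL definition of readonly changes again, only the implementation of assert_readonly inside the harness needs updating; every test that uses it picks up the new behaviour without being edited.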