Editorial remarks, DD 20040605

This is a WG draft intended primarily for WG-internal circulation and comment. It does not yet have a finished front matter section, nor acknowledgments, references, or appendices sections.

Please provide comments on content. Before publication, the document will be turned into a QAF-style document and will include the sections mentioned above.

Introduction

This document defines principles and practices to support the creation of useful and usable conformance test suites. While much of its content is applicable to other forms of testing, the scope of this revision of the guidelines is limited to conformance testing. In particular, it follows directions set out in other QAF documents [@@ QAH and SpecGL]. The document also mentions other types of testing from which good ideas and practices can be derived; however, conformance testing remains the primary scope.

Principle #1: Users must understand the scope of the test suite

Potential users of the test suite need to know whether this test suite applies to them, the extent to which they can rely on it, and where they might need to focus additional testing efforts. To achieve this, the test suite must indicate what specification it tests, what the prerequisites for running it are, how much of the specification is being tested, and what the results should be.

A clear understanding of test suite scope is as important to the WG as it is to end users, especially since this document targets conformance testing. Understanding test suite scope means stating what is to be tested and making that clear to those who use the tests at a later stage. Documenting the test suite is therefore particularly important. It is equally important that test suite producers work within a uniform conceptual framework in order to produce test suites that are clearly targeted and focused.

Requirement 1.1: Document the scope and goals of the test suite

Specify which specifications are covered and which testing strategy, if any, was adopted.

The specifications covered are primarily the one(s) the test suite is written for, but it is also important to mention which other specifications a successful run of the test suite presupposes support for. For example, parts of the DOM Test Suites require that the implementation being tested supports HTML 4.0, since the DOM specification builds on it.

Testing strategies can be divided into the following main areas: conformance testing (the primary target of this document), functional testing, interoperability testing, performance testing, stress testing, and usability testing. [@@ to be worked on].

Requirement 1.2: Provide coverage information

Users need to know what is covered and what is not. It must be possible to map individual tests back to the specification; if a test fails, the user of the test suite must be able to understand which portion of the implementation is at fault. Mapping entails both resolving which part of a specification a particular test covers (backward pointing) and, where test assertions are present, using them to find the tests derived from them (forward pointing). Also, try to make available a map indicating which parts of the specification have tests written for them, since this clearly distinguishes complete from incomplete test suites, which in turn is very important for conducting conformance testing.

Good practice: Assertion lists are an effective way of documenting tests and mapping them back to the specification. When the specification authors work together or in parallel with the test authors, it is simpler to insert test assertions directly in the specification. Assertions are therefore a very good tool for establishing strong links between specifications and subsequent tests.

Principle #2: Test execution results must be repeatable and reproducible

If test results are not repeatable and reproducible they cannot be relied upon, and they cannot be compared with other execution results. Test suite results should be the same for two users with the same setup.

Requirement 2.1: Define the contents of the test suite

The components of a particular revision of the test suite must be unambiguously identified. It is not sufficient to point users to a web site that is randomly updated and that contains an amorphous collection of test materials. Test materials must be packaged together into a "test suite" and published with a version number. The test suite must contain documentation that describes its contents and explains how to use it.
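
One possible way to identify the components of a release unambiguously is a manifest that records the version number and a checksum for every file. The sketch below is an assumption for illustration: the manifest layout and the function names build_manifest and verify are hypothetical, not part of any QAF document.

```python
import hashlib


def build_manifest(suite_name, version, files):
    """Return a manifest identifying every component of one release.

    `files` maps a relative path to the file's bytes; a real packaging
    script would read these from disk.
    """
    entries = [
        {"path": path, "sha256": hashlib.sha256(data).hexdigest()}
        for path, data in sorted(files.items())
    ]
    return {"suite": suite_name, "version": version, "files": entries}


def verify(manifest, files):
    """Return the paths that are missing, extra, or changed."""
    expected = {e["path"]: e["sha256"] for e in manifest["files"]}
    actual = {p: hashlib.sha256(d).hexdigest() for p, d in files.items()}
    return sorted(
        p for p in expected.keys() | actual.keys()
        if expected.get(p) != actual.get(p)
    )
```

A user who receives the package can call verify against the published manifest to confirm that the materials match the stated version.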

Potential components of test suite

Requirement 2.2: Specify what tests are to be run

It must be possible to determine what tests must be executed for a particular implementation, allowing non-applicable tests to be filtered out. Non-applicable tests can be tests that presuppose support for optional functionality in the specification, for example.

Best practice: defining relevant metadata enables filtering.
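
A minimal sketch of metadata-driven filtering, assuming each test declares the optional features it presupposes. The class CaseMetadata, its field names, and the feature names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class CaseMetadata:
    test_id: str
    # Optional features of the specification that the test presupposes.
    requires: frozenset = field(default_factory=frozenset)


def applicable_tests(tests, supported_features):
    """Keep only tests whose required optional features are all supported."""
    supported = frozenset(supported_features)
    return [t for t in tests if t.requires <= supported]
```

For example, an implementation that supports no optional features would run only the tests whose "requires" set is empty.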

Requirement 2.3: Document how to execute tests

The test suite documentation must clearly explain how to execute the tests in a repeatable and reproducible fashion.

[Requirements 2.2 and 2.3 imply that two different users will execute the same tests in the same manner on a particular implementation.]

Best Practice: Either provide a test harness with supporting tools, libraries, and frameworks, or provide sufficient metadata and documentation to allow a test harness to be constructed so that the test suite can be run successfully, producing equivalent results.

Requirement 2.4: Tests should report their results in a meaningful and consistent manner

Tests should report status (passed, failed, not run, etc.) in an unambiguous and consistent manner.

Best Practice: If properly marked up, tests can provide information on what went wrong (for example, a test can contain specific clauses on expected result that get triggered if the execution fails), helping the implementor to debug their implementation.
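
The following sketch shows one way a harness might report status consistently and attach diagnostic detail to failures. The status names follow the list above; the record layout and the functions run_test and format_result are assumptions made for illustration.

```python
from enum import Enum


class Status(Enum):
    PASSED = "passed"
    FAILED = "failed"
    NOT_RUN = "not run"


def run_test(test_id, expected, actual_fn=None):
    """Run one test and return a result record with a fixed shape.

    A test with no `actual_fn` is reported as not run.
    """
    if actual_fn is None:
        return {"test": test_id, "status": Status.NOT_RUN, "detail": ""}
    actual = actual_fn()
    if actual == expected:
        return {"test": test_id, "status": Status.PASSED, "detail": ""}
    # On failure, record what was expected to help the implementor debug.
    detail = f"expected {expected!r}, got {actual!r}"
    return {"test": test_id, "status": Status.FAILED, "detail": detail}


def format_result(result):
    """Render a result as one line: '<test id>: <status>[ (<detail>)]'."""
    line = f"{result['test']}: {result['status'].value}"
    return f"{line} ({result['detail']})" if result["detail"] else line
```

Because every record has the same fields and every status comes from one fixed set, results from different runs can be compared mechanically.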

Requirement 2.5: Ensure repeatability and reproducibility

Test suite execution must be the same for two users with identical setups (all other things being equal).

Best Practice: Before release, conduct extensive test suite execution to make sure test suites are indeed repeatable and reproducible.
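
A pre-release repeatability check could be sketched as below: run the suite several times on the same setup and flag any test whose status changes between runs. The function name check_repeatable and the shape of the results (a mapping from test id to status) are assumptions.

```python
def check_repeatable(run_suite, runs=3):
    """Run the suite several times and return the ids of unstable tests.

    `run_suite` is any callable returning a dict of test id -> status.
    A test is unstable if its status differs from the first run, or if
    it is present in one run and absent from another.
    """
    baseline = run_suite()
    unstable = set()
    for _ in range(runs - 1):
        current = run_suite()
        for test_id in baseline.keys() | current.keys():
            if baseline.get(test_id) != current.get(test_id):
                unstable.add(test_id)
    return sorted(unstable)
```

This covers repeatability on a single setup only; checking reproducibility across users would additionally require comparing results gathered on separate but identical setups.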

Principle #3: Test suites must evolve over time

Test suites must evolve to reflect changes in the specification, as problems are identified and fixed, as coverage is increased, or to address revisions and errata of applicable specifications.

Requirement 3.1: Plan for multiple releases

Plan for multiple releases of the test suite; the primary purpose is to keep versions of the test suite itself separate. Ideally, a new version of the test suite should be released for each revision of, or set of errata to, the specification. Version numbers should be supplied. Users should be able to tell which version of the test suite is appropriate for a particular implementation.

Requirement 3.2: Accept and respond to bug-reports

Any serious test suite effort should be supported by proper feedback and comment mechanisms, allowing the WG to respond quickly to issues that arise during test development (concerning both actual test authoring and specification interpretation). This should be an easily accessible and, if possible, public forum in which users can give their feedback.

Best practice: implement a formal bug-tracking or issue-tracking process to manage bug-reports. Users must be provided with a formal channel for reporting problems in the test materials (tests, test harness, documentation). Note that problems reported against the test suite may reveal problems in the specification, for example incompatible interpretations of the specification or just ambiguities. Problems may be addressed by:

Requirement 3.3: Maintain test suite beyond WG life

Implementations are written for specifications that no longer have chartered WGs. Since test suites will be used to make conformance claims, a life cycle different from that of the WG must be foreseen.

All of these are test suite-specific process areas that need to be addressed, without mandating a specific way of implementing them.

Best practice: Patching an existing test-suite is difficult; re-releasing the entire test-suite, even if the changes are minor, might be the simplest and least confusing way to release updates.

Operational guidelines (incorporate into material above where possible)

Treat test development like product development - it is (or should be) a formal engineering process.

Editorial note: I believe that QAH points from too many sections to process aspects of the TS, in particular in section 6 of that document. This will hopefully be addressed when the documents are brought into sync.

For the highest-quality test suite: