Suggested Changes to Testing Guidelines

Hi, Kirill and everyone else,

I've made a lot of suggested changes to the Testing Guidelines.  As one of 
the co-editors, I've taken the liberty of re-writing much of the Spec 
Analysis section. I included a few new checkpoints there.  I also 
scrutinized all the checkpoints and re-wrote the ones that were not 
verifiable throughout the document (as per my action item).  I've included 
the suggested changes in-line in this e-mail with the following 
codes.  Anything in blue is a suggested addition/change to the 
document.  Anything in fuchsia is a suggested deletion from the document.

Hope this is helpful.

Mark

2. Guidelines
Guideline 1. Analyze the specification(s).
As with any product testing, the first step should be to analyze the 
subject. The better the initial analysis, the easier it will be to design 
the test suite.
Checkpoint 1.1. Create a list of all the specifications used or referenced. 
[Priority2]
Most, if not all, specifications use notions and behaviors defined in 
other technical documents. For example, even a base specification like XML 
uses definitions from specifications like URN and URI Syntax, Media Types, 
Unicode, etc. Some specifications are more self-contained and make only 
limited use of the syntax defined in other specifications. Other 
specifications, like XSLT [@@LINK], rely heavily on the syntax and 
semantics defined in the XPath [@@LINK] specification. Building the tree of 
the referenced specifications helps
·       to understand the relationships between them
·       to assess the testing work already done for the referenced 
specifications and reuse test materials where needed
·       to assess the risks taken by the use of specifications that have 
flaws like contradictory statements or ambiguities.
[dd] I'd actually propose that this be priority 1, as interdependencies 
between specifications are vital to test, especially for API specifications 
and base specifications, such as XML, which is referenced by many others.
[wg] Discussed what specifications this checkpoint was meant to include. 
Agreed that it meant those being tested AND those that were referenced by 
the testing, and that when you tested you assumed the dependencies worked 
correctly. Also discussed that Issue13 (testing multiple specifications) 
needed to be resolved, but that either way it should not affect this 
checkpoint. Some clarification to be added, including why the list was 
important.

New Checkpoint: Define the target set of specifications that are being tested.
The target set may include more than one specification, depending on how 
strongly the primary specification under test relies on the referenced 
specifications. For example, an XML test suite may not include tests that 
specifically test the URN format, but XSLT and XQuery test suites will 
include many tests for XPath functions.

Checkpoint 1.2. Extract test assertions from the target set of 
specifications. [Priority1]
Once you have defined the target set of specifications that you're 
testing, a more formal analysis should be done for each of them. The 
target set may include more than one specification, depending on how 
strongly the primary specification under test relies on the referenced 
specifications. For example, an XML test suite may not include tests that 
specifically test the URN format, but XSLT and XQuery test suites will 
include many tests for XPath functions. (This is deleted because this text 
is included under the new checkpoint above)



The QA Specification Guidelines recommend producing a set of test 
assertions for the specification, so you may have them already. A list of 
test assertions is necessary to focus testing. [KG] Need a definition of 
test assertion to be referenced.
[WG] Needs to be clarified to indicate it relates only to the target 
specification (or specs based on issue 13)

New Checkpoint: Group test assertions by levels, profiles, modules, etc.

The conformance criteria, with respect to subsetting the specification, 
may include various degrees of variability (e.g., levels, profiles, 
modules).  The test assertions should be grouped according to these 
subsets to allow the subsets to be tested as a whole.

Checkpoint 1.3. Define those test assertions that are part of conformance 
criteria [Priority1]
Depending on the Conformance Criteria defined in the specification, not 
all of the test assertions need to be satisfied in order to conform to the 
specification. For example, if the conformance criteria require the 
implementer to comply only with those assertions that have "MUST" or 
"SHALL", all other test assertions (with "SHOULD", "MAY", etc.) do not 
belong to the conformance criteria.
Moreover, conformance criteria may define levels of conformance, in which 
case test assertions should be grouped by those levels. (This is being 
deleted because it is included under the new checkpoint.)
[WG] Discussed if, based on the wording, we were creating assertions that 
might not be used to test conformance. Agreed that conformance levels were 
possible but that this section needed to be clarified to indicate how to 
correctly group the test assertions.
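As an illustration of the kind of grouping Checkpoint 1.3 describes, here 
is a minimal sketch that partitions assertions by RFC 2119 keyword. The 
Assertion record and the sample assertions are hypothetical, and the rule 
shown (only "MUST"/"SHALL" assertions belong to the conformance criteria) 
is just one possible criterion a specification might define:

```python
# Sketch: partition test assertions by RFC 2119 keyword to decide which
# belong to the conformance criteria. The Assertion type and the sample
# data are hypothetical illustrations, not part of the guidelines.
from dataclasses import dataclass

MANDATORY_KEYWORDS = {"MUST", "SHALL"}  # assumed conformance rule

@dataclass
class Assertion:
    ident: str
    text: str
    keyword: str  # RFC 2119 keyword extracted from the assertion text

def split_by_conformance(assertions):
    """Return (mandatory, non_mandatory) lists of assertions."""
    mandatory = [a for a in assertions if a.keyword in MANDATORY_KEYWORDS]
    other = [a for a in assertions if a.keyword not in MANDATORY_KEYWORDS]
    return mandatory, other

assertions = [
    Assertion("A1", "A document MUST be well-formed.", "MUST"),
    Assertion("A2", "A processor SHOULD report warnings.", "SHOULD"),
    Assertion("A3", "An attribute value SHALL be normalized.", "SHALL"),
    Assertion("A4", "A processor MAY cache results.", "MAY"),
]

mandatory, other = split_by_conformance(assertions)
print([a.ident for a in mandatory])  # → ['A1', 'A3']
```

The same partitioning generalizes to grouping by level, profile, or 
module: only the grouping key changes.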
Checkpoint 1.4. Extract all the discretionary behaviors defined in the 
specification [Priority1]
The test suite should accommodate discretionary behaviors so that products 
can be tested according to the vendor's choice among the allowed 
behaviors. Therefore, if the discretionary behaviors are not already 
identified in the specification, the tester should identify them.

(I would eliminate this checkpoint since the discretionary behaviors should 
have all been identified in the spec)
Checkpoint 1.5. Identify optional behaviors in the specification [Priority2]
For example, protocol bindings.
(Again, this should have been identified in the spec)

Checkpoint 1.6. Identify behaviors that are unintentionally undefined or 
defined ambiguously in the specification [Priority2]
(The tester can’t know the intent of the spec developer)
[DM]Ideally, the specs have no vagueness. Test developers can identify the 
"known" areas of vagueness, but must synchronize with errata. Scanning the 
documents to detect vagueness and cataloging feedback sent to the spec 
editors are two different things; which would this checkpoint espouse?
Although the spec ideally should not have such flaws, one can never 
guarantee it. Maintaining a list of such issues helps both in fixing the 
specification in later revisions and in cataloging incoming tests that 
fall into undefined or ambiguously defined areas.

New Checkpoint: Contact specification developers to ensure that vague or 
ambiguous requirements are rewritten.

After these requirements are re-written, tests can be written to check for 
conformance.

Checkpoint 1.7. Identify explicitly undefined behaviors in the 
specification [Priority2]
Although it is not a recommended practice, a specification's authors may 
explicitly abstain from defining product behavior in certain 
circumstances. A list of such areas helps to analyze incoming tests 
appropriately.
Checkpoint 1.8. Identify contradictory behaviors in the specification 
[Priority2]
Such behaviors should not be there, but if they exist, a list of them is 
needed for test analysis.

New Checkpoint: Contact specification developers to ensure that 
contradictory behaviors are rewritten
After the contradictory behaviors are rewritten, tests can be developed to 
check for conformance.

[WG] Okay, but also discussed that there are a number of terms that need to 
go into the Glossary for this to all be clear.
Checkpoint 1.9. List user scenarios for the specification [Priority2]
User scenarios help keep the tests focused.
[dd] Again, I'd go for moving this up to priority one. It is especially 
important for non-technical oriented specifications, such as WAI.
Guideline 2. Define testing areas
A testing area is a set of rules described in the specification that the 
tester groups together based on some commonality.
Checkpoint 2.1. Define target areas for testing [Priority1]
Needed mainly for categorization of the tests in the test suite. Usually 
the testing areas match the specification areas/content, but sometimes it 
is easier to define them based on some other criteria, like the applicable 
testing methodology, user scenarios, etc. If there is no 1:1 mapping 
between test areas and specification areas/content, the relationship 
between tests and the specification should still be traceable via the 
test-to-test-assertion mapping described in the checkpoint below.
[WG] Needs examples and to be discussed as it relates to levels, modules 
and profiles.
Checkpoint 2.2. Prioritize testing areas [Priority2]
Helps to prioritize testing development.
[WG] Needs to be moved to someplace under the guideline on test 
development, and should provide examples of different criteria you could 
use to prioritize the tests.
Checkpoint 2.3. For each testing area, produce a set of sample testing 
scenarios [Priority3]
Before creating test cases for a certain area of the specification, it may 
be useful to design a set of sample testing scenarios, based on the user 
scenarios. Those are not actual tests, but rather examples. This helps to 
properly select the testing framework, create templates for test cases, 
and define future sub-areas.
[WG] Needs examples and some clarification.
Checkpoint 2.4. Map sample testing scenarios to test assertions, 
discretionary behaviors and vague areas [Priority3]
Helps to formalize testing scenarios and provides a basis for future 
analysis of specification coverage.
Guideline 3. Choose the testing methodology
[dd] This guideline covers the area of test framework, something I 
anticipate will be covered elsewhere. Here, only substantial issues 
relevant to incorporating existing frameworks, and not altering them, 
should be raised.
[dd] Available and applicable methodologies need to be given here.
[WG] This guideline is meant to be a high level approach on how to build 
tests, and needs to be updated to make this clear.
Checkpoint 3.1. For each test area identify applicable testing approach 
[Priority1]
[dd] As above, list available ones.
By testing approach we understand a set of high-level 
methods/ideas/strategies. It is convenient to define test areas so that 
testing in a single area can be done using a single methodology.
[WG] Discussed if this meant choose from a list or define how. Agreed it 
should be define how. To clarify.
Checkpoint 3.2. Reuse publicly available testing techniques if applicable 
[Priority1]
Rewritten Checkpoint 3.2: Identify publicly available testing 
techniques.  Reuse publicly available testing techniques if applicable. 
List the publicly available testing techniques that have been reused. 
[Priority1]
Note: This checkpoint has been re-written because it could not be verified
It is critical to avoid "reinventing the wheel", both for resource 
considerations and from a future-integration perspective.
[WG] Both 2.1 and 2.2 need to be reworded, as neither is verifiable or 
actionable.
Guideline 4. Provide the test automation and framework
Once the test strategy is defined, the right choice of framework is 
critical for smooth future test development and use
[dd] A general point: in the checkpoints connected to this guideline there 
seems to be a choice between frameworks. Ideally, QA WG will produce a 
small number of test frameworks that will implement most, if not all, of 
the options mentioned. For clarification, it is not to be understood that 
the QA WG will produce test frameworks for all WG's; there are however a 
series of things that should be in synchronization, most notably reporting, 
result publication and test extraction (if it is done using the 
specification granularity we speak of in Specification Guidelines). To 
allow for this, I would propose that this guideline be reorganized as 
follows:
·       Use existing framework, if applicable
o       List options (reporting, automated tests and so forth), basically 
following the existing checkpoints
·       If there is no existing framework that fits, apply (same) 
checkpoints to produce a suitable test framework
Checkpoint 4.1. Review available test frameworks, automation and adopt 
existing if applicable [Priority1]
Rewritten Checkpoint 4.1. Review available test frameworks and automation 
and adopt existing ones if applicable. Identify the available test 
frameworks used.  If none, justify why new frameworks are needed and why 
existing ones could not be used. [Priority1]
Note: This checkpoint has been re-written because it could not be verified

[dd] This checkpoint is partly inconsistent with the immediately previous 
one; reordering them might help.
The argumentation is the same as for reusing the testing methodology
Checkpoint 4.2. Ensure the framework and automation are platform 
independent. [Priority1]
Rewritten Checkpoint 4.2. Ensure the framework and automation are platform 
independent.  Demonstrate on 3 platforms. Ensure that the  framework and 
automation are built using open standards.   [Priority1]
Note: This checkpoint has been re-written because it could not be verified
An alternative is to provide an implementation of the framework for every 
platform.
Checkpoint 4.3. Ensure the framework and automation are applicable to any 
product/content that implements the specification [Priority2]
Rewritten Checkpoint 4.3. Ensure the framework and automation are 
applicable to any product/content that implements the 
specification.  Demonstrate with three products/contents. Ensure that 
the  framework and automation are built using open standards.  [Priority2]
Note: This checkpoint has been re-written because it could not be verified
Similar to the previous checkpoint, the test suite should be able to cover 
all products that the specification allows.
[dd] The two previous checkpoints could be given as one, stating that the 
testing framework chosen should, if possible, be platform independent. 
Also, we need to keep in mind that providing platform-specific test 
frameworks adds work that needs to be done on the testing framework 
itself; if we were to provide platform-specific test frameworks, the QA WG 
or any party producing them would need to allocate time to produce them 
and ascertain their quality.
Checkpoint 4.4. Ensure the framework makes it easy to add tests for any of 
the specification areas [Priority2]
Rewritten Checkpoint 4.4. Ensure the framework makes it easy to add tests 
for any of the specification areas.  Demonstrate, through an example, how 
tests are easily added to a specification area. [Priority2]
Note: This checkpoint has been re-written because it could not be verified
The test suite will expand over time, and eventually cover all areas of 
the specification.
Checkpoint 4.5. Ensure the ease of use for the test automation [Priority1]
Rewritten Checkpoint 4.5. Ensure the ease of use for the test 
automation.  Document how the test automation is easily used.  [Priority1]
Note: This checkpoint has been re-written because it could not be verified
Usability is a critical requirement of the test suite. But it is just as 
critical to make contributing tests easy.
Checkpoint 4.6. Ensure the framework allows for specification versioning 
and errata levels [Priority2]
Rewritten Checkpoint 4.6. Ensure the framework allows for specification 
versioning and errata levels. Explain how specification versioning and 
errata levels are accommodated by the framework. [Priority2]
Note: This checkpoint has been re-written because it could not be verified
Requirement from the Process guidelines.
Checkpoint 4.7. Ensure the framework accounts for choices allowed for 
discretionary behaviors in the specification [Priority3]
Rewritten Checkpoint 4.7. Ensure the framework accounts for choices allowed 
for discretionary behaviors in the specification.  Explain how 
discretionary behaviors are accommodated by the framework.  [Priority3]
Note: This checkpoint has been re-written because it could not be verified
This is an integral part of making the test suite applicable to any 
product allowed by the specification.
Checkpoint 4.8. Ensure the framework allows for tests for optional 
behaviors defined in the specification [Priority3]
Rewritten Checkpoint 4.8. Ensure the framework allows for tests for 
optional behaviors defined in the specification.  Explain how optional 
behaviors are accommodated by the framework. [Priority3]
Note: This checkpoint has been re-written because it could not be verified
While optional behaviors are not necessary to implement, some of them 
might be self-contained additions (like protocol bindings) that need a 
test suite of their own. These tests will of course be applicable only to 
those products that claim to implement the optional behaviors/profiles.
[dd] Experience from the DOM TS shows that allowing for optional/multiple 
behaviors is a high priority on the wish list for the TS. Implementers want 
to be able to test particular behaviors as defined in the specification, 
especially as they may have chosen to support only parts of the 
specifications (eg. DOM builds on XML, which allows for entity 
expanding/entity preserving applications).
Checkpoint 4.9. Ensure the framework accommodates levels of conformance 
defined in the specification [Priority1]
Rewritten Checkpoint 4.9. Ensure the framework accommodates levels of 
conformance defined in the specification.  Demonstrate how the 
framework  allows tests to be filtered by levels. [Priority1]
Note: This checkpoint has been re-written because it could not be verified

If the conformance criteria introduce levels, the test framework should 
allow tests to be filtered by level.
Checkpoint 4.10. Ensure the results verification is product independent 
[Priority1]
Rewritten Checkpoint 4.10. Ensure the results verification is product 
independent.  Demonstrate results verification on 3 different 
products.  [Priority1]
Note: This checkpoint has been re-written because it could not be verified
Results verification is a critical part of the test framework. Since the 
tests should run on any platform against any product implementing the 
spec, results verification (for example, comparison against expected 
output) should be product independent.
Checkpoint 4.11. Ensure the framework allows to document the tests 
[Priority2]
Rewritten Checkpoint 4.11. Ensure the framework allows the tests to be 
documented. Explain how to document the tests, within the framework. 
[Priority2]
Note: This checkpoint has been re-written because it could not be verified

For better maintenance. This includes annotating tests with pointers to the 
original specification(s).
Checkpoint 4.12. Ensure the framework has proper test case management 
[Priority3]
Rewritten Checkpoint 4.12. Ensure the framework has proper test case 
management.  Demonstrate how at least one of the following test case 
management functions are accomplished, within the framework: managing 
additions; managing removals; filtering by various criteria. [Priority3]
Note: This checkpoint has been re-written because it could not be verified
Test case management includes an accounting system for tests: managing 
additions and removals, and filtering by various criteria.
Checkpoint 4.13. Ensure the framework allows to measure specification 
coverage [Priority2]
Rewritten Checkpoint 4.13. Ensure the framework allows specification 
coverage to be measured. Demonstrate the above by mapping a list of tests 
to the list of test assertions, grouped by areas.  [Priority2]
Note: This checkpoint has been re-written because it could not be verified
One effective way to measure specification coverage is to map the list of 
tests to the list of test assertions, grouped by area.
[dd] Absolutely; this way of grouping works fine with the way we have 
discussed modules of specifications.
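The coverage measurement described above can be sketched in a few lines. 
All identifiers (assertion ids, area names, test ids) are hypothetical 
illustrations of the mapping, not part of the guidelines:

```python
# Sketch: measure specification coverage by mapping tests to the test
# assertions they exercise, grouped by area.
from collections import defaultdict

# assertion id -> specification area (hypothetical)
assertions = {"A1": "syntax", "A2": "syntax",
              "A3": "semantics", "A4": "semantics"}

# test id -> assertion ids the test covers (hypothetical)
test_map = {"t1": ["A1"], "t2": ["A2", "A3"]}

# every assertion touched by at least one test
covered = {aid for ids in test_map.values() for aid in ids}

by_area = defaultdict(lambda: [0, 0])  # area -> [covered, total]
for aid, area in assertions.items():
    by_area[area][1] += 1
    if aid in covered:
        by_area[area][0] += 1

for area, (done, total) in sorted(by_area.items()):
    print(f"{area}: {done}/{total} assertions covered")
```

Uncovered assertions (here A4) point directly at the areas where the test 
suite still needs work.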
Guideline 5. Provide the results reporting framework
The WG should encourage vendors to report testing results for their 
products. In order to do that, a WG needs to provide vendors with the 
results format, necessary stylesheets, etc.
Checkpoint 5.1. Review available results reporting frameworks and adopt 
existing if applicable. [Priority1]
Rewritten Checkpoint 5.1. Review available results reporting frameworks and 
adopt existing if applicable. If existing frameworks are not adopted, 
explain why.  [Priority1]
Note: This checkpoint has been re-written because it could not be verified

Checkpoint 5.2. Ensure the results reporting is platform independent 
[Priority1]
Rewritten Checkpoint 5.2. Ensure the results reporting is platform 
independent.  Demonstrate on 3 platforms.  [Priority1]
Note: This checkpoint has been re-written because it could not be verified



Similar to the tests, results reporting should be usable by any vendor.
[dd] As above, given that the testing framework is uniform in functionality 
and independence, this will have been dealt with elsewhere.
Checkpoint 5.3. Ensure the results reporting is compatible with the test 
framework [Priority1]
Checkpoint 5.4. Ensure the ease of use for results reporting [Priority1]
Rewritten Checkpoint 5.4. Ensure the ease of use for results 
reporting.  Demonstrate that  the results reporting has sorting and 
filtering capabilities.   [Priority1]
Note: This checkpoint has been re-written because it could not be verified

Necessary to facilitate the results reporting by vendors. Ensure the 
results reporting has sorting and filtering capabilities, etc.
Checkpoint 5.5. Ensure the results reporting allows for specification 
versioning and errata levels [Priority2]
Same as in tests.
Checkpoint 5.6. Ensure the results reporting allows results to be exported 
in a self-contained format suitable for publication on the web [Priority2]
Rewritten Checkpoint 5.6. Explain how the results reporting allows results 
to be exported in a self-contained format suitable for publication on the 
web [Priority2]
Note: This checkpoint has been re-written because it could not be verified
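One way such a self-contained export could look: a single HTML file with 
inline styles, so it can be published on the web with no external 
dependencies. The result records and markup are a hypothetical sketch, not 
a prescribed format:

```python
# Sketch: export test results as one self-contained HTML document.
# The result records are hypothetical.
import html

results = [
    {"test": "core-001", "outcome": "pass"},
    {"test": "ext-001", "outcome": "fail"},
]

def export_html(results):
    rows = "\n".join(
        f"<tr><td>{html.escape(r['test'])}</td>"
        f"<td class='{r['outcome']}'>{html.escape(r['outcome'])}</td></tr>"
        for r in results
    )
    # Styles are inlined so the report needs no external files.
    return (
        "<!DOCTYPE html><html><head><title>Test results</title>"
        "<style>.pass{color:green}.fail{color:red}</style></head>"
        f"<body><table>{rows}</table></body></html>"
    )

print(export_html(results))
```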
Checkpoint 5.7. Ensure the results reporting provides details on failures 
sufficient to investigate [Priority3]
Rewritten Checkpoint 5.7. Demonstrate that the results reporting provides 
details on failures sufficient to investigate [Priority3]
Note: This checkpoint has been re-written because it could not be verified

Logging, for example.
Checkpoint 5.8. Ensure the results reporting allows for history/storing 
analysis comments [Priority3]
Rewritten Checkpoint 5.8. Demonstrate how the results reporting allows for 
history/storing analysis comments [Priority3]
Note: This checkpoint has been re-written because it could not be verified
To investigate/compare the different versions of the product.
Guideline 6. Organize tests development
Checkpoint 6.1. Start with the test suite prototype and publish it. 
[Priority2]
Checkpoint 6.2. Start with atomic tests first, according to priorities 
defined in Ck2.5 [Priority1]
Rewritten Checkpoint 6.2. Start with atomic tests first, according to 
priorities defined in Ck2.5.  Provide documentation describing the atomic 
tests. [Priority1]
Note: This checkpoint has been re-written because it could not be verified

Checkpoint 6.3. Conduct regular public reviews of the test suite as 
specification and test suite development continues [Priority2]
Rewritten Checkpoint 6.3. Conduct regular public reviews of the test suite 
as specification and test suite development continues.  Provide the 
schedule for the public reviews. [Priority2]
Note: This checkpoint has been re-written because it could not be verified



[dd] Ideally yes, but this will not necessarily be the case if the TS is 
produced within the WG
Checkpoint 6.4. Conduct regular specification coverage analysis. [Priority2]
Rewritten Checkpoint 6.4. Conduct regular specification coverage analysis. 
Provide the schedule for specification coverage analysis. [Priority2]
Note: This checkpoint has been re-written because it could not be verified
Guideline 7. Conduct testing
Checkpoint 7.1. A Working Group must publicly encourage conformance testing 
among vendors. [Priority1]
Rewritten Checkpoint 7.1. A Working Group must publicly encourage 
conformance testing among vendors. Organize at least one face-to-face 
meeting with vendors to review the test suite and encourage 
testing.  [Priority1]
Note: This checkpoint has been re-written because it could not be verified
A common practice is to support a public discussion group dedicated to the 
test suite and to organize f2f meetings for vendors.
[dd] And other interested parties.
Checkpoint 7.2. Vendors to publish test results for their products. 
[Priority3]
Rewritten Checkpoint 7.2. Encourage vendors to publish test results for 
their products by reserving a special space where information pertaining 
to test results can be maintained. [Priority3]
Note: This checkpoint has been re-written because it could not be verified

[dd] It may be that the W3C can have a special space where information 
pertaining to test results can be given, if not explicitly, then using 
links to those pages where the information can be found (in order not to 
have to provide disclaimers).


****************************************************************
Mark Skall
Chief, Software Diagnostics and Conformance Testing Division
Information Technology Laboratory
National Institute of Standards and Technology (NIST)
100 Bureau Drive, Stop 8970
Gaithersburg, MD 20899-8970

Voice: 301-975-3262
Fax:   301-590-9174
Email: skall@nist.gov
**************************************************************** 

Received on Monday, 8 July 2002 17:55:25 UTC