RE: [www-qa] Re: Conformance and Implementations from Arnold, Curt on 2001-10-19 (www-qa@w3.org from October 2001)

From: Arnold, Curt <Curt.Arnold@hyprotech.com>
Date: Fri, 19 Oct 2001 14:27:38 -0600
To: "'www-qa@w3.org'" <www-qa@w3.org>
Message-ID: <70E215722F6AD511820A000103D141D40AA60B@thor.aeathtl.com>
Here is a slightly different path to the same situation (and probably what happened internally in the development of the permissive test).

I set out to write a test for systemId and write it using an exact comparison, since nothing in my reading of the spec or in my experience leads me to expect that an implementation would absolutize the URI.

I run the test against the stable of implementations and I realize that one of the implementors thought that absolutization was a reasonable thing to do.  I look at the spec again and see that all the
other processors aren't wrong (there is the possibility that the odd implementation was actually the only one that got it right) and I can see that there is no explicit prohibition or allowance for
the observed behavior.

At this point, the test writer can make the following judgements:

A) Decide that my initial expectations were flawed and that the odd behavior is clearly within the intent of the specification and the expectations of capable practitioners and rewrite the test to be
permissive.

B) Decide that the odd implementation is clearly wrong, keep the strict test and flag the odd processor as non-conformant.  In the review of the test suite, the WG would be aware that this one
processor failed the test.

C) Decide that the resolution is debatable, write both a permissive and a strict test, flag the odd processor as non-conformant and raise this as an issue to the WG.
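
To make the difference between the strict and the permissive test concrete, here is a minimal sketch in Python. The function names, the file name and the URI are all hypothetical and do not come from the DOM TS; the permissive check shown is only one possible reading of "permissive" (accept the declared value, or an absolute URI whose path ends in it).

```python
from urllib.parse import urlsplit


def strict_match(declared: str, reported: str) -> bool:
    """Strict test: the reported systemId must equal the declared one exactly."""
    return reported == declared


def permissive_match(declared: str, reported: str) -> bool:
    """Permissive test (an assumed definition): also accept an absolutized URI.

    The reported value passes if it is identical to the declared value, or if
    it has a URI scheme and its path ends with "/" followed by the declared value.
    """
    if reported == declared:
        return True
    parts = urlsplit(reported)
    is_absolute = bool(parts.scheme)
    return is_absolute and parts.path.endswith("/" + declared)


# Hypothetical values: what the document declares, and what two processors report.
declared = "example.dtd"
typical_processor = "example.dtd"
odd_processor = "file:///tmp/tests/example.dtd"

for name, reported in [("typical", typical_processor), ("odd", odd_processor)]:
    print(name, "strict:", strict_match(declared, reported),
          "permissive:", permissive_match(declared, reported))
```

Under option C both checks would be kept in the suite, so the odd processor fails the strict one and passes the permissive one, and the disagreement stays visible to the WG.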

What is happening on this issue in the DOM TS is that the submitter of the permissive test thought the case fell into category A and I see it as a category C.  Unless there is consensus within the
test development group that an issue is either an A or B, it should go to the WG.  Otherwise the test development group usurps the ultimate interpretation role of the WG.  If the test falls into
either B or C, the WG and one implementor are at least aware that the test suite and one implementation are at odds.  If the test group decides that it is an A, then the issue isn't prominent to either
the WG or any implementor.

Like any appellate court, the WG would need to balance:

The literal wording of the recommendation
The expectations of a reasonable practitioner on reading the recommendation
The intent of the authors (based on WG emails, etc)
The cost of any remedy
The benefit of any remedy

In this case (in my judgement), the odd behavior is probably within the bounds of the literal wording of the recommendation, but it would have been unexpected by most reasonable
practitioners (though not all, since the implementation author apparently thought it was reasonable).  I don't know the intent of the WG.  The cost of the remedy is small, since only code that explicitly
depended on this one implementation's unusual behavior would be affected.  The benefit is moderately positive, since code written for all the other processors would work correctly once the odd processor is brought in line.

In addition to the two alternatives (strict errata, permissive errata) that I mentioned in my previous letter, there could always be a permissive erratum for the existing spec together with an intention to tighten
the spec in the next revision (in the case where there is a lot of code that depends on a particular reading of an issue).
Received on Friday, 19 October 2001 16:29:01 UTC