Re: CSS Test Suite Management System Now Live from Linss, Peter on 2011-09-22 (public-css-testsuite@w3.org from September 2011)

From: Linss, Peter <peter.linss@hp.com>
Date: Thu, 22 Sep 2011 16:40:39 -0700
To: fantasai <fantasai.lists@inkedblade.net>
Cc: "public-css-testsuite@w3.org" <public-css-testsuite@w3.org>
Message-Id: <EE582DF0-BF9C-4120-973D-3DAAF7E59A3D@hp.com>

On Sep 22, 2011, at 1:51 PM, fantasai wrote:

> On 09/07/2011 07:22 PM, Linss, Peter wrote:
>> Having survived the initial beta test, I'm pleased to announce the new CSS Test Suite Management System (code named 'Shepherd') is now online and ready for use.
>> 
>> As always, if you find any bugs, please email me immediately.
> 
> I think it would be useful to split "Needs Work" into more severe (the test is invalid)
> and less severe cases (the test is valid, but could be better). In the latter case, we'd
> want to unhook the test from the test results and reporting harness. From Gérard's wiki
> list of issues, I see these major groups of Needs Work:
> 
>   Needs Work - Incorrect  /* The test is wrong and should not be passed or doesn't test what's claimed. */
>   Needs Work - Metadata   /* The test metadata needs correction or improvement. */
>   Needs Work - Usability  /* The test is confusing or hard to judge. */
>   Needs Work - Precision  /* The test is imprecise and may give false positives. */
>   Needs Work - Format     /* Syntax errors, format violations, etc. */

I initially had two levels of 'Needs Work' but decided to keep it to one so that it's easier to search for tests that need work.

My thoughts were that the reason why it needs work should simply be stated in the comment. 

There is a status level for tests that shouldn't be part of the build (and therefore removed from the harness) and that's 'Rejected', meaning that the test should be removed rather than fixed. 

While the harness and Shepherd don't talk to each other (yet), the harness does have a notion of tests reported as invalid: they're still presented as part of the suite and listed in results, but they get de-prioritized in testing order and counted separately in the reports. I would think a test that needs work for any of the reasons listed above should fall into that category, as the results shouldn't be trusted (except for really minor issues like typos in the metadata).

I do see the usefulness of distinguishing "this test is ok, but could use a little tweak" from "this test is broken and needs work before relying on the result".

I could either add more status levels or add a separate field for the severity levels of 'Needs Work', but both approaches add overhead. More status levels make searching for all tests that need work (for whatever reason) more of a pain; a separate field is one more thing to fill in when entering issues. To me the only benefit to having the extra levels is for searching and organizing, but do we really need that?
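
To make the trade-off concrete, here is a minimal sketch of the separate-field option, assuming a hypothetical `Issue` record. It is not Shepherd's data model: `Severity`, its two values, and the `ACCEPTED` status are invented for illustration, and only 'Needs Work' and 'Rejected' come from this thread.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Status(Enum):
    ACCEPTED = "Accepted"        # hypothetical; not named in this thread
    NEEDS_WORK = "Needs Work"
    REJECTED = "Rejected"


class Severity(Enum):            # hypothetical names for the two severities
    MINOR = "minor"              # test is ok, but could use a tweak
    BROKEN = "broken"            # result should not be relied on


@dataclass
class Issue:
    test: str
    status: Status
    comment: str                         # the reason is stated here
    severity: Optional[Severity] = None  # the extra field to fill in


def needs_work(issues, severity=None):
    """All 'Needs Work' issues, optionally narrowed to one severity."""
    return [
        i for i in issues
        if i.status is Status.NEEDS_WORK
        and (severity is None or i.severity is severity)
    ]


issues = [
    Issue("t-001", Status.NEEDS_WORK, "typo in metadata", Severity.MINOR),
    Issue("t-002", Status.NEEDS_WORK, "may give false positives", Severity.BROKEN),
    Issue("t-003", Status.REJECTED, "duplicate"),
]
all_needing_work = [i.test for i in needs_work(issues)]
broken_only = [i.test for i in needs_work(issues, Severity.BROKEN)]
```

With a single status plus an optional field, the "everything that needs work" search stays a one-value match, at the cost of one more input per issue. Splitting the status itself would instead require every such search to match a set of values.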

To me this falls under the category of "let's wait and see where the pain is as people use the system"...

Peter

Received on Thursday, 22 September 2011 23:57:12 UTC