
Re: [Test harness] Background-color code of row for test result and conflicting test results

From: Linss, Peter <peter.linss@hp.com>
Date: Wed, 24 Jun 2015 20:46:20 +0000
To: Gérard Talbot <css21testsuite@gtalbot.org>
CC: Public CSS Test suite mailing list <public-css-testsuite@w3.org>
Message-ID: <CEFFC89A-B896-4EC7-98F9-69F881A3A0CC@hp.com>

On Jun 5, 2015, at 11:03 AM, Gérard Talbot <css21testsuite@gtalbot.org> wrote:

> Peter,
> 
> http://test.csswg.org/harness/results/css-writing-modes-3_dev/grouped/table-column-order-002/
> 
> The background color of the table-column-order-002 test row is light green, which indicates "two or more passes".
> 
> But when we look at individual test results:
> 
> http://test.csswg.org/harness/details/css-writing-modes-3_dev/table-column-order-002/
> 
> we see that a wide majority of the results entered by people who took this test are "FAIL".
> 
> First of all, I was the tester with Firefox Linux x86_64 with source "184.160....".
> 
> When you run a lot of tests with the test harness, fatigue and tedious, repetitive checking can easily lead to 2 or 3 wrong result entries out of 100. I think this is also what happened to Koji on 2015-03-21 00:22:00 EDT
> 
> http://test.csswg.org/harness/details/css-writing-modes-3_dev/table-column-order-002/
> 
> because we can see 2 contradictory test results for the same browser (Chrome 41.0.2272.89) on the same day by the same person. A wrong entry is more likely with Chrome because we first have to click a bookmarklet that prepends the "-webkit-" prefix to the writing-mode properties. If we forget to do so, the layout of the test and the test results are wrong.
> 
> What I am trying to say here is: when a clear majority of test results say "FAIL", I think the row color code should be updated accordingly.


I'm reluctant to change the algorithm there: the primary purpose of the report page is to determine when CR exit criteria have been met. If there are 100 fail results from older browsers but one pass from the latest version, that's a pass and needs to be interpreted as a pass.
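
To make the difference concrete, here is a minimal sketch contrasting a majority vote with a "any pass counts" reading of the results. It is not the harness's actual code: the function names and the plain "pass"/"fail" strings are assumptions for illustration, and it ignores browser versions and the "two or more passes" threshold entirely.

```python
from collections import Counter


def majority_status(outcomes):
    """Hypothetical majority vote: pass only if passes outnumber fails."""
    counts = Counter(outcomes)
    return "pass" if counts["pass"] > counts["fail"] else "fail"


def exit_criteria_status(outcomes):
    """Hypothetical exit-criteria view: a single pass is enough,
    no matter how many fail results were recorded."""
    return "pass" if "pass" in outcomes else "fail"


# 100 fails from older browsers, one pass from the latest version.
outcomes = ["fail"] * 100 + ["pass"]
majority = majority_status(outcomes)
exit_view = exit_criteria_status(outcomes)
```

With that input the majority vote reports a fail while the exit-criteria view reports a pass, which is why a majority-based row color would hide exactly the case the report page exists to find.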

I do accept that people make mistakes and that sometimes bad information can get entered. So I made two changes to the harness:

First, if the same person enters a new result for the same test, using the same browser, then the harness will automatically remove older results that were entered within the last 12 hours. So if someone makes a mistake, they can simply re-run the test within 12 hours and replace the result.
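
Roughly, the replacement rule behaves like the sketch below. This is an illustration only, not the harness source: the `Result` record, its field names, and `add_result` are all assumed names, and results are kept in a plain list rather than a database.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Window described above: older results entered within the last 12 hours.
REPLACE_WINDOW = timedelta(hours=12)


@dataclass(frozen=True)
class Result:
    """Hypothetical record of one submitted test result."""
    tester: str
    test: str
    browser: str
    outcome: str
    entered_at: datetime


def add_result(results, new):
    """Return a new list with `new` appended, dropping earlier results from
    the same tester, test, and browser entered at most 12 hours before it."""
    kept = [
        r for r in results
        if not (
            r.tester == new.tester
            and r.test == new.test
            and r.browser == new.browser
            and timedelta(0) <= new.entered_at - r.entered_at <= REPLACE_WINDOW
        )
    ]
    kept.append(new)
    return kept
```

A result that is more than 12 hours old, or that came from a different tester or browser, is left in place, so only a prompt re-run by the same person replaces the earlier entry.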

Second, some users (with higher privileges) can now delete individual results. On the details page (click the test name or results on the results page) there is now a button (an X) that, when clicked, deletes that result.

Hopefully this helps.

Peter


Received on Wednesday, 24 June 2015 20:48:25 UTC
