
Re: Test case review

From: James Graham <jgraham@opera.com>
Date: Wed, 11 May 2011 09:29:17 +0200 (CEST)
To: "Linss, Peter" <peter.linss@hp.com>
cc: "public-test-infra@w3.org" <public-test-infra@w3.org>
Message-ID: <alpine.DEB.2.00.1105110906550.21605@sirius>


On Tue, 10 May 2011, Linss, Peter wrote:

>
> On May 10, 2011, at 2:15 PM, James Graham wrote:
>
>> No, I understood. I still don't understand why we care. As far as I know
>> the only use for saving old test results is regression tracking. Why do we
>> need to delete the results of specific tests rather than discard the whole
>> results set?
>
> Ok, the use case is for storing historical snapshots (as well as a 
> general philosophy of not throwing away data).
>
> Let me give a concrete example. The CSS wg developed the CSS 2.1 test 
> suite to the point that we felt it was good enough to transition to PR 
> (this is our RC6 version of the test suite). Since then, we've found 
> issues with tests and we've found areas where testing coverage of the 
> spec could be improved. There are also known issues with CSS 2.1 that 
> were deferred to errata and there may be future testing changes to help 
> clarify issues.

[...]

So if you want to keep all data, then you don't need to automatically 
discard results for tests that have changed. So I don't really follow your 
argument here.

An argument that would make sense is "we want to make it easy to do diff 
runs of the testsuite so that when it is updated we can only run the 
changed tests rather than all of them". I can see why this would seem good 
for the CSS 2.1 testsuite because it is a huge undertaking to run the 
whole thing. However, going forward we should regard testsuites that 
require significant manual work to run as unacceptable, because we should 
focus on making tests that vendors can run on a day-to-day basis. Once 
people are running all the tests multiple times per day as a matter of 
course, doing one more run because the tests changed is not hard.
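A diff run of the kind described above can be sketched with plain git; this is a minimal, self-contained illustration (the branch and file names are invented for the example, not taken from the CSS repository):

```shell
# Sketch: two snapshots of a testsuite repo; diffing them lists only
# the tests that changed, so only those need a manual re-run.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email tester@example.org
git config user.name Tester
printf 'test A, revision 1\n' > test-a.html
printf 'test B, revision 1\n' > test-b.html
git add . && git commit -qm 'RC snapshot'
git branch approved              # label the reviewed snapshot
git checkout -qb incoming        # work continues on an unstable branch
printf 'test B, revision 2\n' > test-b.html
git commit -qam 'fix test B'
# Only the changed test appears; test-a.html needs no re-run:
git diff --name-only approved incoming
```

The same listing can feed a harness directly, so "run only what changed" is a one-liner rather than a bespoke tracking system.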

>> There is a big difference if you think in terms of commits rather than in
>> terms of files. If I merge a series of commits into another branch I can
>> be sure that I got all the relevant changes and no more. Since, in my
system, a single review request would be an atomic change, e.g. the series
>> of commits needed to add one testsuite, taking all the commits for a
>> specific review and merging them onto the approved branch would give you a
>> good assurance that you got the bits that had been reviewed but no more or
>> less.
>
> The problem still lies in which series of commits? There may still be 
> other commits to test assets that aren't obviously related to other 
> commits. We either have a single monolithic collection of tests and 
> assets (which we already know doesn't work), or we need to manage the 
> relationship between the components, or we have a system that sometimes 
> breaks tests.

So, fundamentally all I am proposing is that we use the version control 
system in a mode it is explicitly designed to support, with an unstable 
branch and a stable branch. This is a rather common setup and in my 
experience it doesn't cause huge problems of the type you describe.
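The two-branch mode described above can be shown concretely; in this hedged sketch (branch names `approved` and `submission` are illustrative, not a proposal for the actual repository layout), one review request is one series of commits, and approval is a single `--no-ff` merge of exactly that series:

```shell
# Sketch: stable/unstable branches; a reviewed series of commits is
# merged onto the stable branch as one atomic unit.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email tester@example.org
git config user.name Tester
printf 'suite manifest\n' > MANIFEST
git add . && git commit -qm 'approved baseline'
git branch approved
# Unstable side: the series of commits making up one review request.
git checkout -qb submission
printf 'test 1\n' > t1.html && git add t1.html && git commit -qm 'add t1'
printf 'test 2\n' > t2.html && git add t2.html && git commit -qm 'add t2'
# Approval: merge exactly that series onto the stable branch.
git checkout -q approved
git merge -q --no-ff -m 'approve: example testsuite' submission
git log --oneline   # baseline, two test commits, one merge commit
```

The `--no-ff` merge keeps the reviewed series grouped under a single merge commit, so the approved branch records which commits were accepted together, with no extra tooling.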

In general I much prefer getting full leverage out of our existing tools 
rather than spending time designing and implementing complex bespoke 
solutions that may or may not work better. At the very least it seems 
prudent to try the cheap approach first before abandoning it for the 
expensive one.
Received on Wednesday, 11 May 2011 07:29:49 UTC
