Re: svn vs hg from Linss, Peter on 2011-05-09 (public-test-infra@w3.org from April to June 2011)

From: Linss, Peter <peter.linss@hp.com>
Date: Mon, 9 May 2011 15:55:42 -0700
To: James Graham <jgraham@opera.com>
Cc: "public-test-infra@w3.org" <public-test-infra@w3.org>
Message-Id: <60F5924F-652B-4A00-A122-1BC8FBF16802@hp.com>
On May 9, 2011, at 2:58 PM, James Graham wrote:

> On Mon, 9 May 2011, Linss, Peter wrote:
> 
>> Shepherd is designed to be a web interface tightly integrated with our 
>> test suite repository. It'll facilitate reviewing, approving, and bug 
>> tracking of the test files as well as adding a query and editing system 
>> for the test case metadata. There are plans to also allow some degree of 
>> direct creation and editing of the tests in the web ui. It will also 
>> manage the layout of the test source files within the repository and 
>> integrate with our build system.
> 
> That sounds interesting. Are there details anywhere? How tied is it to 
> CSS-specific assumptions (one test per file, metadata embedded in the 
> test, etc.)?

There are some notes on our wiki at:
http://wiki.csswg.org/test/review-system

Note that these are old and out of date in some areas (like where it talks about using the W3C CVS repo).

The current state of the design is mostly in my head at the moment (and the scope of the system has grown a bit).

We talked a bit today about the needs of other groups to have metadata outside the test file and how some test files can contain multiple tests. I believe those needs can be folded into our systems fairly straightforwardly.

For example, our test harness doesn't look at the contents of the test file at all. It relies on a manifest file, produced by the build system, that contains all the metadata. It would be a trivial change for the build system to look for a parallel file to extract the metadata from.
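To make the parallel-file idea concrete, here is a minimal sketch in Python. It is not the actual build system: the `.meta.json` suffix, the JSON format, and the names `sidecar_path` and `build_manifest` are all assumptions made up for illustration, and `extract_embedded` just stands in for whatever extraction the build already does on the test file itself.

```python
import json
from pathlib import PurePosixPath

SIDECAR_SUFFIX = ".meta.json"  # hypothetical naming convention, not an existing one


def sidecar_path(test_path):
    """Return the hypothetical parallel metadata path for a test file."""
    p = PurePosixPath(test_path)
    return str(p.with_name(p.stem + SIDECAR_SUFFIX))


def build_manifest(files, extract_embedded):
    """Build {test_path: metadata} from a {path: text} mapping.

    A parallel metadata file wins when present; otherwise fall back to
    extract_embedded(text), which stands in for the existing extraction step.
    """
    manifest = {}
    for path, text in sorted(files.items()):
        if path.endswith(SIDECAR_SUFFIX):
            continue  # sidecars describe tests, they are not tests themselves
        sidecar = sidecar_path(path)
        if sidecar in files:
            manifest[path] = json.loads(files[sidecar])
        else:
            manifest[path] = extract_embedded(text)
    return manifest
```

The harness side would not need to change in this sketch, since it only ever reads the manifest.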

Regarding multiple tests in a file, our harness already has the concept of "combo" tests, where a parent test case contains the sum of individual feature tests that live in child test cases. If a combo test is passed, all the children are considered passed, and vice versa. It should also be straightforward to extend the metadata to indicate that a test file is a "virtual combo" that produces a set of child results. The harness also has separate tables for test cases and the URLs to the test cases (because a single test case can exist in multiple formats), so there's no reason multiple test cases can't have the same URL. This probably only makes sense for script tests though...
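The combo rule can be sketched roughly as follows. This is an illustration only, not the harness code: the function name `effective_passes` and the data shapes are hypothetical, "vice versa" is read here as "if every child passed, the combo counts as passed", and support for nested combos is an added assumption.

```python
def effective_passes(combos, recorded_passes):
    """Return the set of test names considered passed.

    combos maps a combo test name to the names of its child tests.
    recorded_passes holds the names of tests with a recorded pass.
    """
    passed = set(recorded_passes)
    changed = True
    while changed:  # repeat until stable, in case combos nest (an assumption)
        changed = False
        for parent, children in combos.items():
            kids = set(children)
            if parent in passed and not kids <= passed:
                passed.update(kids)  # combo passed, so its children pass
                changed = True
            elif parent not in passed and kids and kids <= passed:
                passed.add(parent)  # all children passed, so the combo passes
                changed = True
    return passed
```

A "virtual combo" would fit the same shape: the file is the parent, and the child names come from the results it reports rather than from separate child files.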

I tried to build our harness to be as flexible as possible; it has a lot of capabilities under the hood that aren't readily apparent in the UI. The docs for it on our wiki are really out of date, and I'm going to be updating them soon.

> 
>> My current dilemma is the choice of source repository used for the test 
>> suite. We've been using svn and I understand that this group has decided 
>> to use hg.
>> 
>> I'm not trying to restart a whole svn vs hg debate here, but it would be 
>> very helpful for me to understand the reasons that this group has 
>> decided to use a distributed vcs vs a centrally managed one.
> 
> Fundamentally a testsuite that is being developed and used by a number of 
> participants is a distributed system. Having a centralised version control 
> system doesn't make any sense. For example it is likely that every browser 
> vendor that runs the tests will want to add local patches to their local 
> copy of the repository e.g. in order to make result reporting work 
> with their specific test harnesses. Having to maintain these patches by 
> hand is a relatively large effort; putting them on a branch in the 
> vendor-local clone and merging upstream changes onto that branch is a 
> relatively straightforward way to solve the problem.

Ok, basically what I presumed.

I get that vendors are all using their own home-grown systems, and integrating those with the official W3C test suites is a problem we've been wrestling with for years already in CSS. Personally, I'd like to see the W3C test systems be robust enough that they cover the internal needs of the vendors and they can simply leverage them directly (with perhaps reasonable extension hooks for truly one-off uses). Ideally the differences between the official test systems and the internal testing systems should approach zero over time. But I'm well aware that's a long way off and we have to have a way to get there from here.

> 
> Vendors and individuals may also want to develop outgoing tests with all 
> the benefits of using a VCS internally before pushing to the wider world.
> This might not be critical when developing a small number of individual 
> tests but becomes critical when developing larger testsuites for complex 
> features.

While I _really_ don't want to start a philosophical debate about distributed vs central, there are counterarguments that using a dvcs places a higher burden on the end user, and there really should be a tangible benefit to outweigh that burden. (People who haven't already been using them often have difficulty wrapping their heads around a dvcs; I've seen many experienced geeks get lost with them, and many of our test authors are new to vcs entirely...)

I get the advantages to dvcs, but if most users don't need that, should everyone really have to deal with it?

I suppose the right answer there is in the tool set wrapped around the dvcs: if we do our jobs right, it'll still be easy for the novice end user, and the flexibility will be there for those who can use it.

> 
>> Now, I certainly see the advantages to aligning the test suite tools on 
>> a common technology infrastructure. And Shepherd is early enough in the 
>> development stages that it'll be much easier for me to switch to hg now 
>> than down the road some time. So far, I'm not seeing any reasons why hg 
>> wouldn't work for our system, but it'd still have to have the notion of 
>> a central authoritative repository with a given directory structure and 
>> several policies enforced throughout the repository (like naming 
>> conventions, avoiding naming collisions, meta data handling, etc). These 
>> policies will be enforced by commit (or push) hooks on the authoritative 
>> server.
> 
> There is nothing about using a DVCS that implies there can't be a 
> particular clone that is treated as canonical for a given purpose. It is 
> reasonable to assume that the clone on dvcs.w3.org will be treated as 
> the official repository and can be used as such.

Sure, my point is that by using these tools (Shepherd, build system, harness, etc), we're also creating a set of restrictions on what's in the canonical repo and how it's stored and organized. I just wanted to be sure this still makes sense in a dvcs world as it was intended to be used by this group.
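As a rough idea of the kind of restriction meant here, the sketch below checks a set of incoming file paths against a naming rule and looks for name collisions. Everything in it is hypothetical: the naming pattern, the rule that two files may not share a file name anywhere in the repository, and the function name `check_paths` are assumptions for illustration, not the actual CSS WG policies. Wiring it into a server-side hook is not shown; the hook would simply reject a push when the returned list is non-empty.

```python
import re
from collections import defaultdict
from pathlib import PurePosixPath

# Hypothetical convention: lowercase letters, digits and hyphens, then an extension.
NAME_RE = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*\.[a-z0-9]+$")


def check_paths(paths):
    """Return a sorted list of human-readable policy violations for the given paths."""
    problems = []
    by_name = defaultdict(list)
    for path in paths:
        name = PurePosixPath(path).name
        if not NAME_RE.match(name):
            problems.append(f"bad name: {path}")
        # Compare case-insensitively so names cannot differ by case alone.
        by_name[name.lower()].append(path)
    for name, group in by_name.items():
        if len(group) > 1:
            problems.append(f"collision on {name}: {', '.join(sorted(group))}")
    return sorted(problems)
```

Because the check only looks at paths, the same function could run in a contributor's local clone before pushing as well as on the canonical server.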

> 
>> What I'm trying to get at here is why hg was chosen. Was it just because 
>> it's the vcs flavor of the month, seen as more modern, etc or was there 
>> some kind of practical need for a distributed vcs and if so, does that 
>> imply a usage pattern that is fundamentally incompatible with our 
>> Shepherd system and suite management philosophy?
>> 
>> If Shepherd and hg can play nice, then I'm ok with convincing the CSS wg 
>> that we need to switch to hg and building our tools around that (but if 
>> I'm going to do that, I need to make that call now).
> 
> I strongly believe that converging on a common dvcs (in particular 
> mercurial since that is already in use) is a good thing.

Agreed.
Received on Monday, 9 May 2011 22:56:06 UTC