Re: Repository layout

On Jun 1, 2012, at 5:13 AM, Robin Berjon wrote:

> On May 31, 2012, at 17:26, Linss, Peter wrote:
>> On May 31, 2012, at 6:53 AM, Robin Berjon wrote:
>>> Additionally, it is causing regular problems with the Test Framework tool. Currently, submitted test suites tend to be imported into it so that people can run them, but when tests are moved around, the DB entries are not updated, either for the approved or for the submitted suites. Changing this will not make this problem entirely go away since it's possible that tests will still be removed, but it ought to make this breakage much rarer.
>> 
>> My guess here is that the manifests are probably not being used properly (though I haven't really looked at the particular problem you're describing). The test case import should remove from the suite any tests not found in a newer version of the manifest, although the tests themselves and their results will be preserved in the DB (by design). This way, if the test shows up in a different suite (or the same one later), the old results are still there.
> 
> I know, but the problem is that since people are treating the file system layout as a meaningful-enough classification for tests, they're not bothering to update the manifest and resubmit it. Ideally the metadata would be intrinsic to the test (externalised, authoritative metadata is almost always a bad idea) and the framework would automatically know of updates. Otherwise things are bound to get out of synch.

Well, that's the way the system was designed. If people don't update the inputs to the system, it will always have bad data.

The test framework was meant to be more general-purpose and not dependent on the layout of the repository or the contents of the test files. All its inputs come through manifest files.
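
To illustrate the import behavior I described above, the intended logic is roughly this (an illustrative sketch with made-up data structures, not the actual framework code):

    # Illustrative sketch of the manifest import step; not the framework's code.
    # suite_tests: set of test ids currently linked to the suite.
    # manifest:    dict of test id -> revision from the newly submitted manifest.
    # Results live in a separate store and are deliberately never deleted.
    def import_manifest(suite_tests, manifest):
        removed = suite_tests - set(manifest)
        suite_tests -= removed        # drop suite membership only
        suite_tests |= set(manifest)  # link everything the new manifest lists
        # Old results are untouched: if a removed test shows up again (in this
        # suite or another), its results are still there.
        return suite_tests

So import_manifest({'a', 'b'}, {'b': 5, 'c': 6}) leaves the suite with {'b', 'c'}, while any results recorded for 'a' remain in the store.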

Shepherd, on the other hand, sits directly on top of the repository, understands the file layout (that bit is configurable), and reads the metadata directly from the test files (using the build system library to interact with the tests and the repository file layout). Shepherd will also soon be able to edit metadata, and even test source, through the web UI.
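
For those who haven't looked at the test source conventions, that metadata lives in the head of each test, along these lines (an illustrative example; the names and URLs are invented, and the exact set of links/metas varies by test):

    <!-- Illustrative test header; author, reviewer, and URLs are examples only -->
    <head>
      <title>CSS Test: background-color with named green</title>
      <link rel="author" title="John Doe" href="mailto:john.doe@example.org">
      <link rel="reviewer" title="Jane Roe" href="mailto:jane.roe@example.org">
      <link rel="help" href="http://www.w3.org/TR/CSS21/colors.html#propdef-background-color">
      <link rel="match" href="background-color-001-ref.htm">
      <meta name="flags" content="ahem">
      <meta name="assert" content="background-color: green paints the element's background green">
    </head>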

The build system is intended to produce the manifest files for the framework, as well as perform format conversions of test source (XHTML<->HTML, including SVG and MathML embedded in HTML5, as well as XHTML/HTML->XHTML-Print), and produce human-readable suite index files. The build also performs some degree of validation (as does Shepherd) and will soon inject vendor prefixes as needed (keeping prefixes out of test source). The build will also shortly be able to aggregate sources and sort them into the proper test suites; for example, when a test is bound to multiple specifications it will be copied into each of those suites.
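
As a sketch of what the prefix injection amounts to (illustrative only: the property set, function name, and regex approach here are made up, and the real build works through its parsing layer rather than a regex):

    import re

    # Illustrative sketch of vendor-prefix injection; not the actual build code.
    PREFIXED = {'animation', 'transform', 'transition'}  # example property set

    def inject_prefixes(css, prefix='-webkit-'):
        # Rewrite "transform: X" as "-webkit-transform: X; transform: X" so
        # unprefixed test source still exercises prefixed engines.
        def repl(match):
            prop, value = match.group(1), match.group(2)
            return '%s%s: %s; %s: %s' % (prefix, prop, value, prop, value)
        pattern = r'\b(%s)\s*:\s*([^;}]+)' % '|'.join(sorted(PREFIXED))
        return re.sub(pattern, repl, css)

The point being that authors write plain, unprefixed CSS and the build takes care of the rest.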

On csswg.org, we run the build nightly and automatically re-import the manifests into the framework. All a test author has to do is push a test into the repository, and it shows up in all the right places automatically.
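
Concretely, that automation is nothing more than a couple of cron jobs, something like the following (paths and script names are made up for illustration; the actual csswg.org setup differs in detail):

    # Illustrative crontab entries; not the actual csswg.org configuration.
    # Build the suites at 02:00, then re-import the fresh manifests at 04:00.
    0 2 * * *  cd /var/repo/csswg-test && python tools/build.py --all
    0 4 * * *  python /var/www/harness/tools/import-manifests.py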

> 
>> My understanding is that most of the suites aren't using the revision information for the tests properly either. This is another field that the build system populates properly when it generates the manifests.
> 
> No, they're definitely not, because most test suites do not require a build step.

Right, but eventually they should at least be using the build system to produce the manifests. For small suites that don't change often, manifests could be produced by hand, but that gets unmaintainable fast.
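
For a sense of what a hand-written one involves, a manifest is just a tab-separated file with one line per test, roughly like this (columns trimmed and values invented for illustration):

    id                      references                  title                               flags   revision
    background-color-001    background-color-001-ref    background-color with named green   ahem    1234

Keeping the revision column accurate by hand is exactly the part that stops scaling.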

> 
>> Even for those test suites where a build step isn't necessary, the build tools can still be useful for generating the proper manifest files...
> 
> But you still need to generate them and upload them. We're not getting enough tests and enough reviews as it is; every step that is added, no matter how small, decreases our success rate.

Which is why the process should be automated, as we've done in CSS.

> 
>>> The test framework supports flags. We could certainly ask people to add a "reviewed" flag when they actually review a test (alternatively we can detect this with the <link rel=reviewer> convention). That should provide us with enough information to do whatever we need to do, e.g. generate a suite run containing only reviewed tests if someone needs that for some reason.
>> 
>> Please take a look at Shepherd[1]; it already uses the <link rel=reviewer> metadata to mark tests as approved.
> 
> I'm aware of Shepherd, but I've only seen it in use for CSS. It's hard to know from that view alone whether it's well suited to other suites. With that in mind, it would be a good idea to integrate it on w3c-test.org to see how well it applies.

That's the plan. It was designed to be generic and configurable. It's still under enough active development that the DB schema keeps getting adjusted, so while we're using it in production, I'm not sure it's stable enough to be set up on w3c-test.org, as I don't have enough access to that box to keep it in sync with development. We should talk about that more in a few weeks though…

Another point: both Shepherd and the framework were designed to have installs mapped one-to-one with test repositories (especially Shepherd; the framework is more flexible, but there's still the test name uniqueness issue). I don't think every group should be dumping all their test suites into a single install; it won't scale to cover the entire W3C with a single instance. The deployment plan needs to be mapped out better before it gets out of hand. I consider the current install of the framework on w3c-test to be more experimental/evaluation than final.

> 
>>> Overall I agree that the approval step is a bureaucratic speed bump that is not being helpful. I think that we should move to a commit-then-review model in which people who have an interest in passing a test suite can file bugs against broken tests. Ideally, we would make flagging broken tests easy — I'm thinking about ways of doing that in the framework (suggestions welcome — I wonder if I could just add a flag to "reported as broken" test cases).
>> 
>> The framework already has support for that, though no UI (yet). A possible result is 'invalid', which marks the test as bad until it is modified.
> 
> Ah, good. I've seen invalid in the source but couldn't figure out what it did.

Peter

Received on Friday, 1 June 2012 13:50:04 UTC