Re: Consolidating css-wg and web-platform-tests repositories (Was: test suite meta data)

On Aug 5, 2013, at 10:02 AM, Tobie Langel wrote:

> On Thursday, August 1, 2013 at 11:02 PM, Peter Linss wrote:
>> On Aug 1, 2013, at 1:10 PM, Tobie Langel wrote:
>>> On Thursday, August 1, 2013 at 6:37 PM, Linss, Peter wrote:
>>>> On Jul 31, 2013, at 3:43 PM, Tobie Langel wrote:
>>>>> What is preventing us from consolidating the CSS and web-platform-test repositories at this stage?
>>>> 
>>>> 
>>>> The fact that we have ~2000 review comments over the course of two years in a pre-existing system that we're not willing to throw away. The plan of record is that I'm going to be adapting our existing system to integrate with GitHub.
>>> I don't understand why switching to the main repo would cause you to lose these two years of review comments. Surely, these could be kept accessible despite the new system being hosted elsewhere.
>> 
>> The comments themselves wouldn't be lost per se; we'd just lose all correlation between the assets in the old repo and the new one, making the old data effectively useless. Furthermore, because we have allowed tests to land prior to review, everything that isn't currently approved will need an issue filed against it to match the new model. The current count is over 4300 tests and references in that state. Hopefully you can see that this is untenable without some automation.
> 
> I'm happy to help with the latter. Import scripts are a breeze to write with GitHub's API.
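Agreed, the mechanics are simple enough. Something along these lines would cover the ~4300 un-approved tests (a hedged sketch, not a finished tool: the repo name, label, and issue wording are placeholders; only the endpoint shape follows GitHub's documented REST API, POST /repos/{owner}/{repo}/issues):

```python
import json

def issue_payload(test_path):
    """Build the JSON body for one 'needs review' issue.

    The title/body wording and the 'needs-review' label are assumptions,
    not an agreed-upon convention.
    """
    return {
        "title": f"Review needed: {test_path}",
        "body": f"`{test_path}` landed before review and still needs approval.",
        "labels": ["needs-review"],
    }

def issue_requests(owner, repo, unapproved_tests):
    """Yield (url, json_body) pairs ready to POST with any HTTP client."""
    url = f"https://api.github.com/repos/{owner}/{repo}/issues"
    for path in unapproved_tests:
        yield url, json.dumps(issue_payload(path))
```

Posting those pairs with any HTTP client (plus an OAuth token) is the easy part; keeping the resulting issues correlated with our existing review data is the harder one.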
>> Beyond that, we also have a significant infrastructure built around our current repository, i.e. the build system, test harness, spec annotations, etc.
> 
> Sure. I'm not convinced the build system and test harness are actually very useful given that:
> 1) the harness is manual (and we should now be able to run reftests automatically using WebDriver), and

Unfortunately WebDriver doesn't help with the thousands of manual tests we have (let alone the new ones we're getting). There are large parts of CSS that no one has figured out how to create reftests for, let alone scripted tests. While they're a last resort, we can't ignore the need for manual testing.
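For the tests that do have references, automation along these lines is certainly plausible (a rough sketch, not a working harness: the URLs are placeholders, and a real harness would need fuzzy matching rather than exact comparison):

```python
def screenshots_match(png_a, png_b):
    """Byte-for-byte comparison; a real harness would allow fuzzy matching."""
    return png_a == png_b

def run_reftest(driver, test_url, ref_url):
    """Render the test and its reference, then compare screenshots."""
    driver.get(test_url)
    test_png = driver.get_screenshot_as_png()
    driver.get(ref_url)
    ref_png = driver.get_screenshot_as_png()
    return screenshots_match(test_png, ref_png)

if __name__ == "__main__":
    # Requires the third-party selenium package and a local Firefox.
    from selenium import webdriver
    driver = webdriver.Firefox()
    try:
        print(run_reftest(driver,
                          "http://test.example/box-model-001.html",
                          "http://test.example/box-model-001-ref.html"))
    finally:
        driver.quit()
```

But for the genuinely manual tests (printing, interaction, and so on) there's no screenshot pair to compare, which is why the manual harness can't simply be dropped.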

> 2) afaik vendors pull the repo directly when they wish to run the tests (rather than use the output of a build system).

The output of the build system is still used for CSS tests; historically (and currently, to the best of my knowledge) people mostly consume the build output of our test suites, not the source. And FWIW, there are other people who run the tests aside from vendors.

The build system isn't something the CSSWG can throw away at this point. We use it for test format conversions: many of our test suites are contributed in a mix of XHTML and HTML, we need the full suites in both HTML and XHTML formats, and we've also relied on XHTML-Print. In addition, the build system produces manifest files and human-readable test suite indexes, which are also in use.
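To illustrate what one of those conversions involves (a toy sketch, not our actual build code: it skips doctypes, comments, and attribute-value escaping), re-serializing HTML source as XHTML means, among other things, closing void elements and expanding boolean attributes:

```python
from html.parser import HTMLParser

VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "param", "source", "track", "wbr"}

class XHTMLWriter(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=False)
        self.out = []

    def handle_starttag(self, tag, attrs):
        # Boolean attributes like `disabled` become disabled="disabled".
        rendered = "".join(f' {name}="{value if value is not None else name}"'
                           for name, value in attrs)
        if tag in VOID:
            self.out.append(f"<{tag}{rendered}/>")
        else:
            self.out.append(f"<{tag}{rendered}>")

    def handle_endtag(self, tag):
        if tag not in VOID:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

def html_to_xhtml(source):
    writer = XHTMLWriter()
    writer.feed(source)
    writer.close()
    return "".join(writer.out)
```

For example, `html_to_xhtml('<p>a<br>b</p>')` yields `<p>a<br/>b</p>`.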

> 
> On the other hand, the spec annotation system seems something that would be very useful if merged with the coverage tools Robin started and I pursued.

The spec annotations are a product of the test harness, not Shepherd.

Shepherd also has a test coverage API, and I believe the data it gathers is pretty much a superset of your coverage tools at this point. Shepherd's spec parser has also had a number of recent enhancements (to identify the type of <dfn> anchors and parse IDL definitions) and is getting more.
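To illustrate the kind of data that parsing involves (a toy sketch, not Shepherd's actual parser), the first step is simply walking a spec's HTML and collecting its <dfn> anchors:

```python
from html.parser import HTMLParser

class DfnCollector(HTMLParser):
    """Collect (id, text) pairs for every <dfn> in a spec's HTML."""
    def __init__(self):
        super().__init__()
        self.in_dfn = False
        self.current = None
        self.dfns = []

    def handle_starttag(self, tag, attrs):
        if tag == "dfn":
            self.in_dfn = True
            self.current = [dict(attrs).get("id"), ""]

    def handle_data(self, data):
        if self.in_dfn:
            self.current[1] += data

    def handle_endtag(self, tag):
        if tag == "dfn" and self.in_dfn:
            self.in_dfn = False
            self.dfns.append(tuple(self.current))

def collect_dfns(spec_html):
    collector = DfnCollector()
    collector.feed(spec_html)
    return collector.dfns
```

The real parser goes well beyond this, classifying each anchor's type and parsing IDL blocks, but the raw anchor list is the data everything else hangs off of.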

> Some of the CableLabs folks also have interesting ideas around this (we recently discussed hashing requirements to identify changes in the spec at a finer-grained level than sections).

I'd like to hear more about that (and be involved in those discussions).

>> I'm also working on making the build system more generic so it can run against the new repo layout. This has to be completed before we can do any significant rearranging of our files.
> 
> Could you share the requirements of the build tools somewhere?

The CSSWG's build tools are about 3-4 years old, and I don't have a formal requirements doc for them at this point. I'll be writing some documentation for them after I finish re-working them.

> The rest of the main repo doesn't need a build step; the files just need to be (properly) served. We need to discuss a plan to be able to do something similar with the CSS WG content, as we can't require a build step for just parts of the repo (we could, however, serve the files in such a way that they appear to have been built).

As has been said before, serving the source for the CSS test suites isn't sufficient. I've been clear about this for some time. Having a single consolidated system isn't going to work if it ignores some of the basic requirements of its users.


>>> Furthermore, the main repo's review model and test lifecycle models are very different from the CSS WG's: tests are approved when pulled in. If a test has a problem, an issue is filed against it. That's all there is to it. No test status/lifecycle, etc., just regular versioning. The CSS WG will need to adopt this workflow when switching to the main repo or convince everyone else to adopt the CSS WG's way (which is highly unlikely).
>> 
>> We will be adopting the new workflow when we switch; in fact, we've already adopted it to a large extent by mirroring our repo on GitHub and accepting PRs in the same fashion as the main repo does. Part of the changes to Shepherd is to make this workflow work with the existing systems we have.
> 
> OK, so I wasn't sure about that and that's great news.
>> Also note that historically we have allowed un-approved tests into our repository so that they can be included in our test suite builds and evaluated by actually using them in our running test suites, without calling them 'approved'. In other words, we haven't blocked all tests waiting for someone to review them. If we had, we'd still be trying to get CSS2.1 out the door.
>> 
>> I'm not convinced that the current model of not merging tests until they're reviewed is going to work for every situation, but that's another discussion.
> Yes, this is why I'm busy making sure test review is no longer the bottleneck, through process modifications and tooling.
>>> I'm not sure what benefits there are in delaying the switch further.
>> 
>> As I said above, the benefit to us is not breaking a whole bunch of infrastructure that we've been using for years and are relying on. We have to be able to shift our infrastructure before we can move the tests. In the meantime, we're doing what we can to minimize the differences.
> 
> So of course I understand wanting to preserve the infrastructure the WG is relying on for its day to day operations. That makes complete sense.
> 
> However, I do also feel that parts of the infrastructure the CSS WG is relying upon are being obsoleted by changes in the process (e.g. the lifecycle model and review system superseded by the GitHub workflow), new technology (e.g. WebDriver and SaaS like Sauce Labs obsolete non-automated reftest harnesses) and external requirements (e.g. the build system needs to be revised to fit the way tests from other WGs are handled).

Yes, some parts of our infrastructure are being obsoleted, but we have to replace them with new parts that still meet our requirements while fitting into the new model.

>>>> Once that's working, and we can merge our existing review data, we'll move our tests into the main repo.
>>> 
>>> What's the timeline for that?
>> 
>> Sometime this year. I'm actively working on it. 
> 
> It would be good to sync up to see if/how you're planning to address some of the issues I brought up above and when you think they can be built in.

Agreed, it would also be good to hear how some of the CSSWG's needs are being addressed by the new system. 

> 
> I'd also like to see how I could help make this faster.
>>> Keeping a common set of docs and process across the two repos is already proving to be difficult. As we start building more infrastructure around the main repo, I'm concerned we'll quickly get further apart than we are now.
>> 
>> Well, a good way to minimize the drift is to keep me in the loop with the decisions and plans for the infrastructure that's being built around the main repo. That way I can keep our systems in line to the best extent possible.
> 
> All conversations around this happen on GH, irc (#testing or sometimes, accidentally, other irc channels) and public-test-infra@. 
>> I'm not sure where the disconnect is; maybe I missed subscribing to a mailing list, or people keep forgetting to invite me to ad-hoc meetings, or something, but I often have the impression that decisions for the new infrastructure are being made in isolation, without any kind of public discussion, let alone due consideration for other working groups' existing needs.
> 
> I find that criticism unfounded and, tbh rather harsh.

That wasn't criticism, that was my impression of the situation, which simply is what it is. I also wasn't assigning blame, but stating a problem that IMO needs to be addressed. I've already cited examples of the CSSWG's needs that I don't see being addressed, and I can list more; but this isn't me bitching, this is me asking for, and offering, help to fix the situation.

> 
> Other than the coverage tool Robin and I worked on earlier this year and which we urgently needed for the funding effort, the only infra work that's been done so far has been announced on public-test-infra@[1] with a detailed plan posted to the Wiki[2] which was later refined through input from the community[3].
> 
> This work actually addressed an issue brought up by a number of members of different working groups regarding the exaggerated number of emails they were receiving.
>> I'm sure there are also plans for the new infrastructure that can be achieved more easily by leveraging systems that we've already built rather than starting all over from scratch.
> 
> I assume you're referring to Shepherd here, as there's broad agreement that the current test framework is not a good starting point for this effort.

Yes, Shepherd, as well as the build tools we have. I've also agreed that the current test framework's code base isn't worth re-using, but the data for the roughly 300,000 test results that we have in it _is_ worth keeping (which is also the data used to generate the spec annotations).

> 
> Shepherd's design goals differ significantly from those pursued by the testing effort since moving to Git/GitHub. Where Shepherd handles test management, we prefer deferring to Git/GitHub for that.

Shepherd's goals are actually the same: to get the job done. Its initial design predates the move to GH by several years, but it still does a number of things that GH doesn't do, and never will. The goal going forward is for Shepherd to integrate with GH, serving as a backup store of all the review data in GH, as well as augmenting the GH process with the bits it doesn't do.

> Shepherd integrates with a manual test harness for running ref tests; we'd like to rely on WebDriver[4] and a SaaS for that.

Actually, Shepherd doesn't integrate with the test harness at all. Parts of its DB are designed to be leverageable for a future test harness (like the spec data), but that's all the integration there is.

> 
> Shepherd also seems to favor an integrated approach, whereas we're favoring the flexibility of a more modular solution.

No, Shepherd isn't designed to be a fully integrated system, but just a module in the complete solution space. The way its code dependencies and data stores are structured is already somewhat modular. It does integrate a number of different functions, but that's more a matter of leveraging existing code and systems than a design goal.

> 
> All in all, I feel the cost of adapting Shepherd to our requirements would prove time-consuming and ineffective.

We disagree here, obviously, but since the time spent on that effort is mine...

> That said, I still feel there are parts of Shepherd that could be used by this system, such as those related to spec coverage, that you've kindly extracted from the main project.

There are other parts of Shepherd that also make sense to add to the GH-based system, such as the automatic validation of test data (which, when the integration is complete, will run against pull requests and file issues automatically). Also, its abilities to search across the test repo by various aspects of the test metadata and to generate statistics are quite useful. Most of these capabilities are already accessible via a JSON API (what isn't now, will be soon).

>> I'm more than willing to contribute, but I have to know what those plans are...
> 
> The plans are those described in the test plan[5] that has been shared previously, or the more detailed ones that have been written up on the wiki or in this mailing list. I'm preparing a more detailed plan of the first phase of the infra work which I should be able to share shortly.

Looking forward to it. I think seeing those details, and having an opportunity to contribute to those details will help alleviate my concerns raised above.

> 
> Re contributing, the number one area where help is needed right now is the documentation. That's also been announced on the mailing list and has a GH project[6] with plenty of open tickets you can pick from. Another area where your contribution would be most useful would be to focus on the parts of Shepherd that are critical to enabling the migration of the CSS WG to the main repo.

Which I've been actively working on.

> I'm happy to help define these.

I think I have a good handle on what is needed there.

> Sharing a roadmap for Shepherd would also be useful,

Agreed, the current design documents for it on the CSSWG wiki are several years out of date. Updating that is something I can work on.

> so would moving its development to GitHub.

It's always been developed on a public repo, but I can set up a GH mirror easily enough. I was planning on moving the build tools to GH as well once the re-work is complete.

Peter

> 
> Hope this helps.
> 
> Best,
> 
> --tobie
> 
> --
> [1]: http://lists.w3.org/Archives/Public/public-test-infra/2013AprJun/0078.html
> [2]: http://www.w3.org/wiki/Testing/Infra/Notification_Hell
> [3]: http://lists.w3.org/Archives/Public/public-test-infra/2013JulSep/0080.html
> [4]: http://lists.w3.org/Archives/Public/public-test-infra/2013JanMar/0056.html
> 
> [5]: http://www.w3.org/2013/04/test_plan2.html
> [6]: https://github.com/w3c/testtwf-website
> 

Received on Tuesday, 6 August 2013 18:01:56 UTC