Re: Code review improvements

On 11/13/13 11:17 PM, ext James Graham wrote:
> One of the biggest problems that we currently face with the test 
> repository is the large number of tests that are currently stuck in 
> code review. This causes a number of significant problems; most 
> obviously such tests aren't being run and so aren't helping to improve 
> the web. Having a long review queue is also a significant turn-off to 
> potential or existing contributors who may feel like there is no point 
> in contributing if their tests are going to sit in review for months 
> at a time. By the time that tests are reviewed such contributors have 
> often moved on to work on other things, so they have limited or no 
> time to deal with followup issues.
>
> The fundamental reasons that code review seems to be slow are:
> * It is hard work
> * It is not super-rewarding
> * No one is being paid to do it
>
> It's not clear that any of these are solvable problems, so to fix the 
> problem we are going to have to change something about the way that we 
> do reviews rather than merely hoping that they will happen sooner in 
> the future. In looking for a solution, we should keep in mind several 
> goals of reviews:
> * Check that tests are correct per spec
> * Check that tests are written in a robust way
> * Check that tests are using best practices (e.g. using testharness.js 
> in the right way, conforming to style conventions, etc.)
> * Ensure that tests are not using server-side script in a malicious way
>
> The problem then is how to avoid front-loading review burden whilst 
> sacrificing as few of the above goals as possible. One important 
> observation is that once people are actually running tests they have 
> motivation to investigate tests that fail in their implementation and 
> fix them if they are incorrect. So over long enough time periods 
> incorrect fails should sort themselves out (although there might be 
> incorrect passes that get missed, and so on).
>
> One approach that was identified at TPAC was to allow test review to 
> be forwarded from other public review systems. So, for example, if 
> someone at Mozilla got r+ on a patch containing web-platform-tests 
> changes, this would be allowed to automatically carry over to the 
> web-platform-tests review by merging the PR and adding a comment to 
> the original review, so such patches would land immediately. This, 
> then, would effectively split the review burden into two parts; 
> patches coming from implementers which would typically need no further 
> review, and patches coming from individual contributors which would 
> typically still require upfront review. Compared to the above list of 
> goals for review, allowing implementer review to carry forward 
> directly would slightly reduce the effectiveness at ensuring tests are 
> correct per spec; such tests would presumably be more likely to match 
> the specific behaviour of one browser even if it is based on a 
> misunderstanding of the spec requirements. Assuming the implementer 
> does competent review it should not significantly affect robustness 
> since implementers running the tests have a significant motivation to 
> make them robust. It would significantly reduce the chance that the 
> tests had been reviewed correctly for best practices since the 
> original reviewer might have expertise in the browser code rather than 
> in the web-platform-tests conventions. Again, assuming that the 
> implementer is running the tests, it should be possible to avoid 
> malicious code entering the system this way since the vendor would be 
> exposing themselves to any security problems.
>
> Given that there is a history of vendors submitting tests that do in 
> fact match their implementations but do not in fact match the spec, 
> and tests that have significant style issues, it does seem that some 
> mechanism is needed to allow external review of these tests. Two 
> possibilities have been suggested so far; that the tests only get 
> approved after some timeout (i.e. review can be carried forward, but 
> we wait for N days before actually merging to give a chance for people 
> to review), or that we keep an out-of-band record of which PRs were 
> only reviewed within an organization and allow people to do explicit 
> post-hoc review. I don't know which of these options is better; or if 
> there are further options. I think the second has some advantages for 
> integrating with UA import systems since it keeps the code more 
> in-sync between the UA repo and the external repo. Both also require 
> some additional tooling.
>
> Comments?

Thanks very much for writing this James!

Do you see these two options as mutually exclusive? I ask because it 
seems like both could be "acceptable", although neither is really ideal.

I agree with you and Dom that the second option would be preferable, but 
in the event we can't get that level of review, perhaps the timeout 
option could be used (as a last resort), giving a review preference 
hierarchy, from highest to lowest, of:

* External Review - review is done by external people

* Internal Review - review is done within an org but not by the test creator

* Review by Timeout

If Review by Timeout is considered acceptable, I suppose there could be 
an inclination for groups to do no review at all but OTOH, hopefully 
folks consider test review sufficiently useful that the Timeout option 
would be rarely used.
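For what it's worth, the tooling the timeout option would need seems 
fairly small. Here is a minimal sketch of the core check: find open PRs 
that carry an "internally reviewed" marker and whose waiting period has 
elapsed without further objection. The label name (`vendor-reviewed`) 
and the 14-day period are purely illustrative assumptions, not proposed 
policy, and the PR records are shaped like the JSON the GitHub pull 
request listing API returns:

```python
# Sketch of "review by timeout" tooling (assumptions: label name and
# waiting period are hypothetical, not agreed policy).
from datetime import datetime, timedelta, timezone

REVIEW_LABEL = "vendor-reviewed"   # hypothetical label marking internal-only review
TIMEOUT_DAYS = 14                  # hypothetical waiting period before auto-merge

def ready_to_merge(prs, now=None):
    """Return numbers of PRs labelled as internally reviewed whose
    waiting period has elapsed, i.e. candidates for merging."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=TIMEOUT_DAYS)
    ready = []
    for pr in prs:
        labels = {label["name"] for label in pr.get("labels", [])}
        opened = datetime.fromisoformat(pr["created_at"].replace("Z", "+00:00"))
        # Eligible only if internally reviewed AND open for >= TIMEOUT_DAYS.
        if REVIEW_LABEL in labels and opened <= cutoff:
            ready.append(pr["number"])
    return ready

if __name__ == "__main__":
    sample = [
        {"number": 101, "created_at": "2013-11-01T00:00:00Z",
         "labels": [{"name": "vendor-reviewed"}]},
        {"number": 102, "created_at": "2013-11-20T00:00:00Z",
         "labels": [{"name": "vendor-reviewed"}]},
    ]
    now = datetime(2013, 11, 21, tzinfo=timezone.utc)
    print(ready_to_merge(sample, now=now))  # → [101]
```

A periodic job running a check like this (against the real PR list) 
could merge eligible PRs and leave a comment linking back to the 
original internal review, which would also produce the out-of-band 
record the second option asks for.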


Received on Thursday, 21 November 2013 14:34:27 UTC