Re: Code review improvements

On 11/13/13 11:17 PM, ext James Graham wrote:
> One of the biggest problems that we currently face with the test 
> repository is the large number of tests that are currently stuck in 
> code review. This causes a number of significant problems; most 
> obviously such tests aren't being run and so aren't helping to improve 
> the web. Having a long review queue is also a significant turn-off to 
> potential or existing contributors who may feel like there is no point 
> in contributing if their tests are going to sit in review for months 
> at a time. By the time that tests are reviewed such contributors have 
> often moved on to work on other things, so they have limited or no 
> time to deal with followup issues.
>
> The fundamental reasons that code review seems to be slow are:
> * It is hard work
> * It is not super-rewarding
> * No one is being paid to do it
>
> It's not clear that any of these are solvable problems, so to fix the 
> problem we are going to have to change something about the way that we 
> do reviews rather than merely hoping that they will happen sooner in 
> the future. In looking for a solution, we should keep in mind several 
> goals of reviews:
> * Check that tests are correct per spec
> * Check that tests are written in a robust way
> * Check that tests are using best practices (e.g. using testharness.js 
> in the right way, conforming to style conventions, etc.)
> * Ensure that tests are not using server-side script in a malicious way
>
> The problem then is how to avoid front-loading review burden whilst 
> sacrificing as few of the above goals as possible. One important 
> observation is that once people are actually running tests they have 
> motivation to investigate tests that fail in their implementation and 
> fix them if they are incorrect. So over long enough time periods 
> incorrect fails should sort themselves out (although there might be 
> incorrect passes that get missed, and so on).
>
> One approach that was identified at TPAC was to allow test review to 
> be forwarded from other public review systems. So, for example, if 
> someone at Mozilla got r+ on a patch containing web-platform-tests 
> changes, this would be allowed to automatically carry over to the 
> web-platform-tests review by merging the PR and adding a comment to 
> the original review, so such patches would land immediately. This, 
> then, would effectively split the review burden into two parts; 
> patches coming from implementers which would typically need no further 
> review, and patches coming from individual contributors which would 
> typically still require upfront review. Compared to the above list of 
> goals for review, allowing implementer review to carry forward 
> directly would slightly reduce the effectiveness at ensuring tests are 
> correct per spec; such tests would presumably be more likely to match 
> the specific behaviour of one browser even if it is based on a 
> misunderstanding of the spec requirements. Assuming the implementer 
> does competent review it should not significantly affect robustness 
> since implementers running the tests have a significant motivation to 
> make them robust. It would significantly reduce the chance that the 
> tests had been reviewed correctly for best practices since the 
> original reviewer might have expertise in the browser code rather than 
> in the web-platform-tests conventions. Again, assuming that the 
> implementer is running the tests, it should be possible to avoid 
> malicious code entering the system this way since the vendor would be 
> exposing themselves to any security problems.
>
> Given that there is a history of vendors submitting tests that do in 
> fact match their implementations but do not in fact match the spec, 
> and tests that have significant style issues, it does seem that some 
> mechanism is needed to allow external review of these tests. Two 
> possibilities have been suggested so far; that the tests only get 
> approved after some timeout (i.e. review can be carried forward, but 
> we wait for N days before actually merging to give a chance for people 
> to review), or that we keep an out-of-band record of which PRs were 
> only reviewed within an organization and allow people to do explicit 
> post-hoc review. I don't know which of these options is better; or if 
> there are further options. I think the second has some advantages for 
> integrating with UA import systems since it keeps the code more 
> in-sync between the UA repo and the external repo. Both also require 
> some additional tooling.
>
> Comments?

Thanks very much for writing this James!

Do you see these two options as mutually exclusive? I ask because it 
seems like both could be "acceptable", although neither is really ideal.

I agree with you and Dom that the second option would be preferable, but 
in the event we can't get that level of review, perhaps the timeout 
option could be used (as a last resort), giving a review preference 
hierarchy, from highest to lowest, of:

* External Review - review is done by external people

* Internal Review - review is done within an org but not by the test creator

* Review by Timeout

If Review by Timeout is considered acceptable, I suppose there could be 
an inclination for groups to do no review at all but OTOH, hopefully 
folks consider test review sufficiently useful that the Timeout option 
would be rarely used.
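For what it's worth, the tooling the timeout option would need seems 
fairly small. Here is a minimal sketch of the core check: find open PRs 
that carry an "internally reviewed" marker and whose waiting period has 
elapsed without further objection. The label name (`vendor-reviewed`) 
and the 14-day period are purely illustrative assumptions, not proposed 
policy, and the PR records are shaped like the JSON the GitHub pull 
request listing API returns:

```python
# Sketch of "review by timeout" tooling (assumptions: label name and
# waiting period are hypothetical, not agreed policy).
from datetime import datetime, timedelta, timezone

REVIEW_LABEL = "vendor-reviewed"   # hypothetical label marking internal-only review
TIMEOUT_DAYS = 14                  # hypothetical waiting period before auto-merge

def ready_to_merge(prs, now=None):
    """Return numbers of PRs labelled as internally reviewed whose
    waiting period has elapsed, i.e. candidates for merging."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=TIMEOUT_DAYS)
    ready = []
    for pr in prs:
        labels = {label["name"] for label in pr.get("labels", [])}
        opened = datetime.fromisoformat(pr["created_at"].replace("Z", "+00:00"))
        # Eligible only if internally reviewed AND open for >= TIMEOUT_DAYS.
        if REVIEW_LABEL in labels and opened <= cutoff:
            ready.append(pr["number"])
    return ready

if __name__ == "__main__":
    sample = [
        {"number": 101, "created_at": "2013-11-01T00:00:00Z",
         "labels": [{"name": "vendor-reviewed"}]},
        {"number": 102, "created_at": "2013-11-20T00:00:00Z",
         "labels": [{"name": "vendor-reviewed"}]},
    ]
    now = datetime(2013, 11, 21, tzinfo=timezone.utc)
    print(ready_to_merge(sample, now=now))  # → [101]
```

A periodic job running a check like this (against the real PR list) 
could merge eligible PRs and leave a comment linking back to the 
original internal review, which would also produce the out-of-band 
record the second option asks for.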


Received on Thursday, 21 November 2013 14:34:27 UTC