Stability testing of PRs

Unstable tests are one of the biggest problems with running 
web-platform-tests, or any tests, on browser CI infrastructure. In 
particular, I have found that unreliable web-platform-tests are one of 
the biggest problems when importing tests into the Mozilla CI system. 
They often force me to perform multiple slow end-to-end runs, and to 
disable tests, before landing the change.

To alleviate this problem, I implemented a Travis job that checks that 
submitted tests produce stable results across 10 runs in the latest 
public versions of Firefox and Chrome. The code is in PR/3975 [1]. This 
PR is pending a release of the Firefox remote control library 
(marionette) and code review, but once those conditions are met I intend 
to turn it on as soon as possible. I would also like to add Edge and 
Safari; Edge seems possible using AppVeyor, and Safari may be possible 
on Travis. However, given the relative difficulty of testing those 
browsers locally, I don't intend to work on this immediately.
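
As a rough illustration of the idea behind the check (this is not the 
code in PR/3975, only a minimal sketch), a test can be treated as 
unstable if it does not report the same status in every run. The 
function name, the input format, and the MISSING marker below are all 
assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical marker for a test that produced no result in a given run.
MISSING = "MISSING"


def find_unstable(runs):
    """Return the tests whose status differs between runs.

    `runs` is a list with one dict per run, each mapping a test id to the
    status string reported in that run (e.g. "PASS", "FAIL", "TIMEOUT").
    The result maps each unstable test id to the sorted list of distinct
    statuses that were observed for it.
    """
    # Every test id seen in any run; a dict iterates over its keys.
    all_ids = set().union(*runs)

    seen = defaultdict(set)
    for run in runs:
        for test_id in all_ids:
            # A test absent from one run counts as a distinct status.
            seen[test_id].add(run.get(test_id, MISSING))

    return {t: sorted(s) for t, s in seen.items() if len(s) > 1}


if __name__ == "__main__":
    # Nine identical runs followed by one run where b.html times out.
    stable_run = {"a.html": "PASS", "b.html": "PASS"}
    flaky_run = {"a.html": "PASS", "b.html": "TIMEOUT"}
    print(find_unstable([stable_run] * 9 + [flaky_run]))
```

In this sketch a single differing result in 10 runs is enough to flag a 
test, which matches the intent of requiring stable results in every run.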

I expect there will be some cases where this job fails because of 
legitimate browser bugs causing instability. In those cases, I think a 
comment indicating that the test author has investigated the issue and 
concluded that it must be a browser bug should be enough for an admin to 
merge the PR.

Does anyone have any concerns about adding this check?

[1] https://github.com/w3c/web-platform-tests/pull/3975

Received on Monday, 17 October 2016 14:43:18 UTC