Re: "priority" of tests from Philip Jägenstedt on 2017-05-10 (public-test-infra@w3.org from April to June 2017)

From: Philip Jägenstedt <foolip@google.com>
Date: Wed, 10 May 2017 18:41:02 +0000
To: John Jansen <John.Jansen@microsoft.com>, "public-test-infra@w3.org" <public-test-infra@w3.org>, "jeffcarp@chromium.org" <jeffcarp@chromium.org>
Message-ID: <CAARdPYebWC4AnB60xL+7W9Fy789N4Xwdvk6bt3fkEahVtGAejA@mail.gmail.com>

When looking at an individual test it's probably easy to place it on the
real-world<->edge-case spectrum and make a judgement, but I assume you're
looking for something that will scale better.

We did a one-off triage of tests that fail in Chrome but pass in Firefox
and Edge <https://bugs.chromium.org/p/chromium/issues/detail?id=651572> back
in Sep 2016. I would say the signal-to-noise ratio was a bit on the low
side, but I'm still optimistic about this method of finding tests worth
investigating.

We have an upcoming wpt dashboard, which +Jeff Carpenter
<jeffcarp@chromium.org> is working on. A preview is public at
https://wptdashboard.appspot.com but the data isn't entirely fresh. Once
this is closer to done, filtering for tests that fail in one browser but
pass in 2 or 3 others should hopefully be a guide to which things to
investigate first. In particular, if only one browser is failing and the
test is on the real-world side of the spectrum, then that browser alone is
holding back interop and can solve the problem by fixing just one bug.
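
To make that filter concrete, here is a rough sketch in Python. It is not
how the dashboard works; the result format (a mapping from test path to
per-browser "PASS"/"FAIL" status), the function name lone_failures, and the
sample test paths are all made up for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical result format (an assumption, not the dashboard's schema):
# test path -> {browser name: "PASS" or "FAIL"}.
Results = Dict[str, Dict[str, str]]


def lone_failures(results: Results, min_passing: int = 2) -> List[Tuple[str, str]]:
    """Return (test, browser) pairs where exactly one browser fails
    and at least `min_passing` other browsers pass."""
    found = []
    for test, by_browser in sorted(results.items()):
        failing = [b for b, status in by_browser.items() if status == "FAIL"]
        passing = [b for b, status in by_browser.items() if status == "PASS"]
        if len(failing) == 1 and len(passing) >= min_passing:
            found.append((test, failing[0]))
    return found


# Made-up sample data; the test paths do not refer to real tests.
sample_results: Results = {
    "/dom/example-a.html": {"chrome": "FAIL", "edge": "PASS", "firefox": "PASS", "safari": "PASS"},
    "/dom/example-b.html": {"chrome": "FAIL", "edge": "FAIL", "firefox": "PASS", "safari": "PASS"},
    "/dom/example-c.html": {"chrome": "PASS", "edge": "FAIL", "firefox": "PASS", "safari": "FAIL"},
    "/dom/example-d.html": {"chrome": "PASS", "edge": "PASS", "firefox": "PASS", "safari": "PASS"},
    "/dom/example-e.html": {"chrome": "PASS", "edge": "FAIL", "firefox": "PASS", "safari": "PASS"},
}

for test, browser in lone_failures(sample_results):
    print(f"{test}: only {browser} fails")
```

This only finds the candidates; judging whether each one sits on the
real-world or edge-case side of the spectrum would still be a manual step.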

(I think one could learn something by inspecting just the repository and
its history, but I'd guess that something based on the test results is going to
be much more useful.)

Have you explored any other ideas?

On Wed, May 10, 2017 at 7:31 PM John Jansen <John.Jansen@microsoft.com>
wrote:

> Good morning,
>
> I'm trying to work out how to prioritize test failures seen with Web
> Platform Tests.
>
> We've had this discussion in the past, but I'm wondering if anyone on this
> list has had any inspired discovery or realization that might make things a
> bit better...
>
> I know for browser vendors this is incredibly challenging. Say we see 100
> failures in one test file: currently there is no way for me to know if
> those 100 failures are more or less important to the web than a single
> failure in some other test file. Of course, the priority for Edge cannot be
> determined by Chrome, so I am not asking for browser vendors to somehow
> dictate this. I'm wondering instead if there is a way we could have the
> people who write the tests or the people who write the specs (or both) come
> to some type of ranking.
>
> I am not sure how it looks. Perhaps "We've seen this construct actually
> used on sites" means it's HIGH priority. Or maybe, "No web dev would ever
> try to pass this invalid value in" means it's LOW priority.
>
> Maybe people have already had this conversation and I'm not in the loop.
>
> Anyone?
>
> -John
>
>

Received on Wednesday, 10 May 2017 18:41:49 UTC