Re: EME test reports

From: Mark Watson <watsonm@netflix.com>
Date: Thu, 1 Sep 2016 08:38:23 -0700
Message-ID: <CAEnTvdCEDiGaqnJ+E7NGkUHrxmZWvasFZ7K0bRtFJneG4VmLgQ@mail.gmail.com>
To: "Jerry Smith (WPT)" <jdsmith@microsoft.com>
Cc: David Dorwin <ddorwin@google.com>, "public-html-media@w3.org" <public-html-media@w3.org>, Philippe Le Hégaret <plh@w3.org>, Paul Cotton <Paul.Cotton@microsoft.com>, Matthew Wolenetz <wolenetz@google.com>
Another issue with the test reports is that at present the
persistent-usage-record tests require the use of Microsoft servers, but the
test scripts don't automatically switch servers.

I filed an issue for this:
https://github.com/w3c/web-platform-tests/issues/3626
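
For illustration, the switching could key the license server URL off the
key system under test. A minimal sketch in TypeScript (both URLs are
placeholders, not the real servers):

    // Sketch only: select a license server per key system rather than
    // hardcoding a single vendor's server.  Both URLs are placeholders.
    const LICENSE_SERVERS: { [keySystem: string]: string } = {
      'com.microsoft.playready': 'https://playready.example.test/rightsmanager.asmx',
      'com.widevine.alpha': 'https://widevine.example.test/license',
    };

    function licenseServerFor(keySystem: string): string {
      const url = LICENSE_SERVERS[keySystem];
      if (url === undefined) {
        throw new Error('No license server configured for ' + keySystem);
      }
      return url;
    }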

Also, we have two infrastructure tasks:
- https://github.com/w3c/web-platform-tests/issues/3624
- https://github.com/w3c/web-platform-tests/issues/3625

...Mark

On Mon, Aug 29, 2016 at 2:17 PM, Jerry Smith (WPT) <jdsmith@microsoft.com>
wrote:

> On *clearkey-mp4-playback-temporary-waitingforkey.html*:  I don’t know
> the history, but the test report tool is written to ignore tests that
> result in all timeouts.  In fact, in its normal version it ignores tests
> with only 1 pass/fail result.  I’ve worked around that for our currently
> published reports.
>
>
>
> If I count tests whose results are all timeouts as fails, the “complete
> failure” group jumps dramatically:
>
>
>
> - *The modified report is:*  *Completely failed files*: 54; *Completely
> failed subtests*: 45; *Failure level*: 45/294 (15.31%)
>
> - Vs. online, which is:  *Completely failed files*: 50; *Completely
> failed subtests*: 10; *Failure level*: 10/299 (3.34%)
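>
> A rough sketch of that counting change, assuming a simplified result
> shape for illustration (not the actual wptreport format):
>
> // Sketch: count a test whose results are all timeouts as a complete
> // failure instead of dropping it from the tally, as the report tool
> // normally does.
> interface SubtestResult { name: string; status: 'PASS' | 'FAIL' | 'TIMEOUT'; }
> interface TestResult { test: string; subtests: SubtestResult[]; }
>
> function completelyFailed(results: TestResult[]): TestResult[] {
>   return results.filter(t =>
>     t.subtests.length > 0 &&
>     t.subtests.every(s => s.status === 'FAIL' || s.status === 'TIMEOUT'));
> }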
>
>
>
> We clearly need to better understand what’s causing the timeouts.
>
>
>
> Jerry
>
>
>
> *From:* David Dorwin [mailto:ddorwin@google.com]
> *Sent:* Monday, August 29, 2016 1:18 PM
> *To:* Mark Watson <watsonm@netflix.com>
> *Cc:* Jerry Smith (WPT) <jdsmith@microsoft.com>; public-html-media@w3.org;
> Philippe Le Hégaret <plh@w3.org>; Paul Cotton <Paul.Cotton@microsoft.com>;
> Matthew Wolenetz <wolenetz@google.com>
> *Subject:* Re: EME test reports
>
>
>
> A couple notes on the results:
>
>    - clearkey-mp4-playback-temporary-waitingforkey.html times out on all
>    three browsers but doesn't appear in complete-fails.html or
>    less-than-2.html. Is this a bug in the report generation tool?
>    - drm-keystatuses.html fails or times out on all browsers. We may want
>    to review that test.
>
>
>
> On Mon, Aug 29, 2016 at 12:44 PM, Mark Watson <watsonm@netflix.com> wrote:
>
> I think we should replace the key system name in the test name with either
> 'drm' or 'clearKey'. Since each browser supports only a single DRM (that we
> can test), we know which key system is being tested in each case.
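>
> As a sketch, the renaming could be a one-line normalization (the helper
> name is made up for illustration; the key system strings are the real
> ones):
>
> // Sketch: fold concrete key system strings into generic labels so the
> // same subtest name is comparable across browsers.
> function genericName(subtest: string): string {
>   return subtest
>     .replace(/com\.microsoft\.playready|com\.widevine\.alpha/g, 'drm')
>     .replace(/org\.w3\.clearkey/g, 'clearKey');
> }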
>
>
>
> ...Mark
>
>
> On Aug 29, 2016, at 12:36 PM, Jerry Smith (WPT) <jdsmith@microsoft.com>
> wrote:
>
> One thing we should probably resolve: including key system strings in the
> subtest name creates misleading test gaps vs. our two-passing-implementations
> requirement.  The wptreport tool treats these as separate tests, so all
> com.microsoft.playready subtests are currently flagged as not having two
> passing implementations.
>
>
>
> As a simple experiment, I edited the JSONs to strip the key system names
> from the drm subtest names (a sketch of such a script appears after the
> numbers below).  That results in:
>
>
>
> - *Test files*: 105; *Total subtests*: 257
>
> - *Test files without 2 passes*: 29; *Subtests without 2 passes*: 49;
> *Failure level*: 49/257 (19.07%)
>
> - *Completely failed files*: 29; *Completely failed subtests*: 8;
> *Failure level*: 8/257 (3.11%)
>
>
>
> Vs. online (with key system names in subtest names):
>
>
>
> - *Test files*: 105; *Total subtests*: 299
>
> - *Test files without 2 passes*: 50; *Subtests without 2 passes*: 91;
> *Failure level*: 91/299 (30.43%)
>
> - *Completely failed files*: 50; *Completely failed subtests*: 10;
> *Failure level*: 10/299 (3.34%)
>
>
>
> That means:
>
>
>
> - Listing the key system in the subtest names results in 42 additional
> subtests.
>
> - Most of these presently have more than one passing implementation.
>
> - 2 subtests (drm-keystatuses-multiple-sessions) report as complete
> failures with the key system in the subtest name, but have one passing
> implementation when it is omitted.
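>
> A minimal sketch of the JSON edit described above (the file name and
> result shape are assumptions for illustration):
>
> // Sketch: strip key system strings from subtest names in a report JSON
> // so per-key-system subtests collapse into one logical subtest.
> import { readFileSync, writeFileSync } from 'fs';
>
> const report = JSON.parse(readFileSync('wptreport.json', 'utf8'));
> for (const test of report.results) {
>   for (const sub of test.subtests) {
>     sub.name = sub.name
>       .replace(/com\.microsoft\.playready|com\.widevine\.alpha/g, 'drm')
>       .replace(/org\.w3\.clearkey/g, 'clearKey');
>   }
> }
> writeFileSync('wptreport-generic.json', JSON.stringify(report, null, 2));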
>
>
>
> Options to handle this are:
>
>
>
> - Remove the keysystem names from the subtests, but lose visibility into
> keysystem-specific failures.
>
> - Leave them in and post those results to the website, but prepare
> manual tallies when we submit the PR for review.
>
>
>
> Jerry
>
>
>
>
>