RE: Additional EME tests

This is a good cut, Greg.  I do think we need to support multiple DRMs (though perhaps not all in the first cut of tests).  Personally, I would probably limit content to CENC (again, for the first pass).  I’d be interested in what others think about the importance of testing WebM as well.  I’d like the tests to consider other formats, but would not call that a first-sprint goal.

The DRMs we support could be chosen based on candidate browsers that we think will contribute to implementation reports.  I think that argues for Widevine, PlayReady and perhaps Adobe Primetime.  We’d want to be able to add other keySystems later.

Jerry

From: Greg Rutz [mailto:G.Rutz@cablelabs.com]
Sent: Tuesday, June 28, 2016 8:40 PM
To: public-html-media@w3.org
Subject: Re: Additional EME tests

Based on some of the discussions in this chain, I have put together a high-level list of development tasks that would be required.  Please feel free to comment on this basic approach.  I borrowed from Mark’s list of functional areas on which we should generalize the test suite.

  *   Develop a set of configuration properties that indicate the various features that the user-agent-under-test claims to support.  The test suite will take this set of properties as input.  Here are the likely configuration sets:

     *   Media format support

        *   Containers (WebM, MP4)
        *   Audio Codecs (hopefully we can narrow this down to 1 or 2 variations that are supported by all user agents to reduce the number of test vectors required)
        *   Video Codecs (same as audio)

     *   InitData Types (WebM, CENC)

        *   For CENC, we might want to test both CDM-proprietary and common PSSH formats

     *   Session Types (temporary, persistent-license, persistent-usage-record)
     *   Key Systems (Widevine, PlayReady, FairPlay, etc)

  *   Update the existing Blink tests to utilize the input configuration to perform tests appropriate for what the user-agent claims to support

     *   This will involve developing some multi-purpose EME and DRM-license request code (CastLabs DRMToday) that will likely be shared by several of the tests

  *   Develop any additional tests that are deemed essential but are not yet covered by the Blink tests

     *   Persistent sessions are one example

  *   Create all test vectors that will be used for test execution

     *   My hope is that only a few (short!) test vectors will be required.  With multi-DRM CENC, we should be able to exercise most of the EME APIs.
     *   Some EME APIs/events may be exercised not by changing the media or initData, but by making a custom license request that attaches specific “rights” to the key (like expiration time)
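
As a rough illustration of the configuration input described in the list above, here is a minimal sketch. Every type, field, and function name in it is hypothetical (nothing here is taken from the Blink tests or from WPT), and the codec strings and key system are only example values.

```typescript
// Hypothetical shape for the configuration input; every name here is
// illustrative and not taken from the Blink tests or from WPT.
type TestContainer = "webm" | "mp4";
type TestInitDataType = "webm" | "cenc";
type TestPsshFormat = "common" | "proprietary";
type TestSessionType =
  | "temporary"
  | "persistent-license"
  | "persistent-usage-record";

interface UserAgentTestConfig {
  containers: TestContainer[];
  audioCodecs: string[];
  videoCodecs: string[];
  initDataTypes: TestInitDataType[];
  psshFormats: TestPsshFormat[]; // only meaningful when "cenc" is listed
  sessionTypes: TestSessionType[];
  keySystems: string[];
}

// Returns a list of human-readable problems; an empty list means the
// configuration is usable as test-suite input.
function validateConfig(config: UserAgentTestConfig): string[] {
  const problems: string[] = [];
  if (config.containers.length === 0) {
    problems.push("no containers listed");
  }
  if (config.keySystems.length === 0) {
    problems.push("no key systems listed");
  }
  if (config.sessionTypes.length === 0) {
    problems.push("no session types listed");
  }
  if (config.initDataTypes.indexOf("cenc") !== -1 && config.psshFormats.length === 0) {
    problems.push("CENC listed without any PSSH format");
  }
  return problems;
}

// Example values only: a user agent claiming MP4/CENC with Clear Key.
const exampleConfig: UserAgentTestConfig = {
  containers: ["mp4"],
  audioCodecs: ["mp4a.40.2"],
  videoCodecs: ["avc1.42E01E"],
  initDataTypes: ["cenc"],
  psshFormats: ["common"],
  sessionTypes: ["temporary"],
  keySystems: ["org.w3.clearkey"]
};
```

Keeping the configuration as plain data like this would let the same object be fed to every test file, with each test skipping whatever the user agent does not claim.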


On 6/24/16, 6:45 PM, "Mark Watson" <watsonm@netflix.com<mailto:watsonm@netflix.com>> wrote:

All,

I'm away for the coming week, but here are some things I have been tinkering with that I hope may be useful:

The clearkeysuccess.html test here<https://github.com/mwatson2/web-platform-tests/tree/clearkey-success/encrypted-media> now passes for all session types on Mac Chrome and Firefox. I am not suggesting that we use this test as is, since this discussion is leaning toward more granular tests and a different approach to combinations, but it may be useful to have something working to compare with / steal from.

To make this work, I had to provide three further scripts:

  *   chrome-polyfill.js<https://github.com/mwatson2/web-platform-tests/blob/clearkey-success/encrypted-media/chrome-polyfill.js> fixes a small bug<https://bugs.chromium.org/p/chromium/issues/detail?id=622956> in Chrome related to a missing keystatuschange message
  *   firefox-polyfill.js<https://github.com/mwatson2/web-platform-tests/blob/clearkey-success/encrypted-media/firefox-polyfill.js> fixes a small bug<https://bugzilla.mozilla.org/show_bug.cgi?id=1282142> in Firefox related to requestMediaKeySystemAccess
  *   clearkey-polyfill.js<https://github.com/mwatson2/web-platform-tests/blob/clearkey-success/encrypted-media/clearkey-polyfill.js> provides support for persistent sessions for Clear Key on browsers that do not support them natively
I am not yet testing reloading of a persistent session after it has been closed, but I'd be happy to add that when I get back.

Hope this is useful.

...Mark



On Fri, Jun 24, 2016 at 7:39 AM, Paul Cotton <Paul.Cotton@microsoft.com<mailto:Paul.Cotton@microsoft.com>> wrote:
I am moving this EME Editor-only discussion to public-html-media@w3.org<mailto:public-html-media@w3.org> so that all HME WG members are aware of the discussions about an EME test suite.  Please continue the discussion on this thread.

FTR the original discussion can be found starting at:
https://lists.w3.org/Archives/Public/public-hme-editors/2016Jun/0100.html


/paulc
HME WG Chair


From: David Dorwin [mailto:ddorwin@google.com<mailto:ddorwin@google.com>]
Sent: Thursday, June 23, 2016 1:46 PM
To: Greg Rutz <G.Rutz@cablelabs.com<mailto:G.Rutz@cablelabs.com>>
Cc: Mark Watson <watsonm@netflix.com<mailto:watsonm@netflix.com>>; public-hme-editors@w3.org<mailto:public-hme-editors@w3.org>; Francois Daoust <fd@w3.org<mailto:fd@w3.org>>; Philippe Le Hégaret <plh@w3.org<mailto:plh@w3.org>>; John Rummell <jrummell@google.com<mailto:jrummell@google.com>>; Ralph Brown <r.brown@cablelabs.com<mailto:r.brown@cablelabs.com>>
Subject: Re: Additional EME tests

We have tools to generate WebM files, which are DRM-independent. There are two included in the Google directory: test-encrypted.webm and test-encrypted-different-av-keys.webm. The key IDs and keys are in encrypted-media-playback-two-videos.html and encrypted-media-playback-multiple-sessions.html, respectively. For simplicity, I suggest using these for the encryption of CENC files. (We also have some CENC files encrypted with these keys, but I believe they only have initData for the common format and Widevine.) We should move the key IDs and keys to a common location, though. (The initData values in encrypted-media-utils.js appear to contain different dummy values.)

Regarding Mark's comments about combinations, I don't think there is much allowance/expectation for variance in what combinations are supported, mainly because this would break the usability of the API. For example, if two containers are supported but only one supports a given session type, a configuration containing the combination would be rejected. initData types within media are also constrained, but implementations are required to support generateRequest() of all supported initDataTypes regardless of the actual media.

I agree that we need to detect what is supported, at least for the simple spec test case, using some utility function. The Blink tests use getSupportedInitDataType(), etc., though there is probably room for improvement. See also my comments inline in Francois's reply, which I've copied here to unfork the thread.

On Thu, Jun 23, 2016 at 1:48 AM, Francois Daoust <fd@w3.org<mailto:fd@w3.org>> wrote:
Hi David,

I've been wondering about the same things for the MSE test suite. Some comments inline.


Le 23/06/2016 09:01, David Dorwin a écrit :
For Blink, we tried to follow our understanding of the WPT style, which
was that each test case be a separate file. In some cases, especially
the syntax tests, there are multiple categories of tests together. I
think readability is also important, even if it means duplication. (Of
course, API changes or refactorings can be monotonous when they have to
be applied to many files, but that should be rarer now.) As to which
approach we take for new tests, I defer to the WPT experts.

I don't qualify as WPT expert, but my understanding is that it is somewhat up to the people who write and review the tests. In the MSE test suite, a given test file often checks a particular algorithm and contains multiple test cases to check the different steps. I personally find that approach useful and readable as well.
I think we probably do want individual tests for various media types,
etc. For example, downstream users (i.e. user agent vendors) should be
able to say "I know I don't support foo, so all the '-foo.html' tests
are expected to fail." For tests that aren't specifically about a type
(or key system), the tests should select a supported one and execute the
tests.

I quickly glanced at the HTML test suite for media elements to see how tests were written there:
https://github.com/w3c/web-platform-tests/tree/master/html/semantics/embedded-content/media-elements


Most test files seem to pick up a supported MIME type, using common functions defined in:
https://github.com/w3c/web-platform-tests/blob/master/common/media.js


There are exceptions to the rule, such as tests on the "canPlayType" method that contain test cases explicitly marked as "(optional)":
http://w3c-test.org/html/semantics/embedded-content/media-elements/mime-types/canPlayType.html


For MSE, most tests can be written without having to impose a particular MIME type (with a few exceptions as well, e.g. to test the "generate timestamps flag"), and it seems a good idea to keep the number of MIME-type specific tests minimal to improve the readability of the implementation report. Whenever possible, we need the MIME-agnostic version of the tests to assess the "at least two PASS" condition in the report.
Ideally, it would be possible to force such tests to run all supported
variants. For example, Chrome might want to run the tests with both MP4
and WebM. encrypted-media-syntax.html, for example, tries the WebM
and/or CENC types based on whether they are supported, requires all
supported to pass, and ensures that at least one was run. This has the
advantage of testing both paths when supported, though it's not
verifiable anywhere that both ran. I don't know whether it would be
useful to be able to say run all the tests with WebM then repeat with CENC.

I've been wondering about that as well for MSE tests. Passing a test for a given MIME type does not necessarily imply that the test also passes if another supported MIME type gets used. It would make tests harder to write though (more error-prone, harder to debug, and slightly harder for user agent vendors to tell what failed in practice). It's often easier to create one test case per variant.

After I sent this, I realized that example (encrypted-media-syntax.html) won't scale to larger tests, such as playback. It might be that this was the easiest way for us to add coverage without deciding on some larger infrastructure for running multiple variants.

In the end, what could perhaps work is to create a "createGenericAndVariantTests" method which takes a list of variants as input, replaces the usual calls to "test" or "async_test", and generates a generic test case that picks up the first supported variant together with a set of variant test cases marked as optional that test the same thing for each and every variant.
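
A minimal sketch of what such a "createGenericAndVariantTests" helper might look like follows. It is only an illustration under stated assumptions: the signature is invented here, variants are reduced to plain strings, and the `register` parameter is a stand-in for testharness.js `test` so that the sketch can run outside a browser.

```typescript
// Stand-in for testharness.js test(); injected so the sketch runs anywhere.
type RegisterTest = (body: () => void, testName: string) => void;

// Hypothetical helper: registers one generic test that uses the first
// supported variant, plus one "(optional)" test per listed variant.
function createGenericAndVariantTests(
  testName: string,
  variants: string[],
  isSupported: (variant: string) => boolean,
  body: (variant: string) => void,
  register: RegisterTest
): void {
  // Generic test: the result that would feed the implementation report.
  register(() => {
    const first = variants.filter(isSupported)[0];
    if (first === undefined) {
      throw new Error("no supported variant for " + testName);
    }
    body(first);
  }, testName);

  // Variant tests: same body, one per variant, flagged as optional so
  // they can be filtered out of the implementation report.
  for (const variant of variants) {
    register(() => body(variant), testName + " (" + variant + ") (optional)");
  }
}

// Usage with a fake registrar that records names and runs bodies at once.
const registeredNames: string[] = [];
const variantsRun: string[] = [];
createGenericAndVariantTests(
  "playback",
  ["webm", "cenc"],
  (variant) => variant === "cenc",
  (variant) => { variantsRun.push(variant); },
  (body, testName) => { registeredNames.push(testName); body(); }
);
```

In this sketch the optional tests are registered for every listed variant, supported or not, so an unsupported variant simply shows up as a failing optional test; restricting them to supported variants would be an equally plausible reading.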

I agree that some way to run the tests with variants is probably the ideal mechanism for general tests. (I'd still like to have -keyids.html, etc. tests where we are specifically testing those capabilities.) I also agree that the tests should be easy to write and maintain. Another option would be to make it possible for the "runner" to override the types that are automatically selected by the utility function that picks that type to test (e.g. getSupportedInitDataType()).
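
To make the runner-override idea concrete, here is a small sketch. The function name and signature are hypothetical and are not those of the Blink getSupportedInitDataType() utility; support detection is passed in as a predicate so the sketch stays independent of any browser API.

```typescript
// Hypothetical selection utility: picks the first supported candidate
// unless the runner supplies an explicit override.
function selectInitDataType(
  candidates: string[],
  isSupported: (initDataType: string) => boolean,
  override?: string
): string {
  if (override !== undefined) {
    // Assumption: an unsupported override is treated as a runner error
    // rather than silently falling back to automatic selection.
    if (!isSupported(override)) {
      throw new Error("override '" + override + "' is not supported");
    }
    return override;
  }
  for (const candidate of candidates) {
    if (isSupported(candidate)) {
      return candidate;
    }
  }
  throw new Error("no supported initDataType among candidates");
}

// Automatic selection picks the first supported entry...
const autoSelected = selectInitDataType(["webm", "cenc"], (t) => t === "cenc");
// ...while a runner could force a specific type for a repeat pass.
const forced = selectInitDataType(["webm", "cenc"], () => true, "cenc");
```

A runner could then execute the whole suite once per supported type by passing each type in turn as the override.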

The generic test case would give the result needed for the implementation report. The additional optional test cases could help user agent vendors detect additional issues with a particular variant and such tests should be easy to filter out from the implementation report as needed if they are consistently flagged with "(optional)".

Francois.

On Thu, Jun 23, 2016 at 10:08 AM, Greg Rutz <G.Rutz@cablelabs.com<mailto:G.Rutz@cablelabs.com>> wrote:
I have a toolchain that can generate MP4 CENC content with multiple DRMs using the CastLabs DRMToday<http://drmtoday.com/> service.  With these tools I can select my own key/keyID, encrypt the content and ingest the key into the DRMToday license server (as many of you know, CastLabs has graciously agreed to provide an account to facilitate the W3C EME testing platform).  In addition, I have a very simple proxy server (required by the DRMToday architecture to sign license requests on behalf of the account owner) which can assign “rights” to each key to provide a customized license.  The system is quite flexible and we would be able to customize the rights for each test with only a single piece of content.  This may be valuable if we need to test key expiration or other rights-related operations that would be exposed to applications through the EME APIs.

Please note that my toolchain has the following limitations and will require some development if we require more features:

  *   Only generates CENC initData (not DRM-specific variants).
  *   No WebM support
  *   ClearKey, PlayReady, and Widevine DRMs only.  Notable missing DRMs: Adobe Primetime and Apple FairPlay (both indicated as supported by CastLabs)
CableLabs has volunteered my time to support the integration/use of these tools if we think they will be valuable.

G

On 6/23/16, 10:35 AM, "Mark Watson" <watsonm@netflix.com<mailto:watsonm@netflix.com>> wrote:



On Thu, Jun 23, 2016 at 12:01 AM, David Dorwin <ddorwin@google.com<mailto:ddorwin@google.com>> wrote:
For Blink, we tried to follow our understanding of the WPT style, which was that each test case be a separate file. In some cases, especially the syntax tests, there are multiple categories of tests together. I think readability is also important, even if it means duplication. (Of course, API changes or refactorings can be monotonous when they have to be applied to many files, but that should be rarer now.) As to which approach we take for new tests, I defer to the WPT experts.

I think we probably do want individual tests for various media types, etc. For example, downstream users (i.e. user agent vendors) should be able to say "I know I don't support foo, so all the '-foo.html' tests are expected to fail." For tests that aren't specifically about a type (or key system), the tests should select a supported one and execute the tests.

Certainly, there need to be individual tests, but a single file can contain several tests. The test page reports for each file how many of the tests within passed and how many failed. In WebCrypto we have a file with 20,000 tests :-)

However, I do like that the Blink tests are small and easy to read. Another reason is that the WPT framework has a 60s timeout for any given file. Since it takes a few seconds to start and verify playback, we can't have too many tests in one file unless we can adjust this timeout.

Ideally, we need to generalize on at least 5 axes, either by generalizing the tests as they are, or by creating new files with the different versions of each test:
- test all the media types the browser claims to support
- test all the initData types the browser claims to support
- test all the session types the browser claims to support
- test all the key systems the browser claims to support
- for cenc, test both keysystem-specific and common format initData

We do not need to test every possible combination of the above and we don't need to run every one of the existing Blink tests for each of these combinations. But it is not straightforward to work out which combinations we do need and which tests need to run on multiple combinations.

We perhaps need a utility function which calculates which combinations of the above a browser claims to support (as a subset of the combinations the test framework supports). There would then be one test which looks at the supported combinations and checks it is non-empty :-)

The list of supported combinations would then be an input to at least some of the other tests, which would then test each combination individually.
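
Such a utility might look roughly like the sketch below. All names are hypothetical, only four of the axes are modelled for brevity, and the support check is an injected synchronous predicate. In a real browser the check would presumably be built on navigator.requestMediaKeySystemAccess(), which returns a promise, so the real utility would need to be asynchronous.

```typescript
interface Combination {
  keySystem: string;
  container: string;
  initDataType: string;
  sessionType: string;
}

interface Axes {
  keySystems: string[];
  containers: string[];
  initDataTypes: string[];
  sessionTypes: string[];
}

// Enumerates every combination the test framework knows about and keeps
// only those the user agent claims to support.
function supportedCombinations(
  axes: Axes,
  claimsSupport: (combination: Combination) => boolean
): Combination[] {
  const result: Combination[] = [];
  for (const keySystem of axes.keySystems) {
    for (const container of axes.containers) {
      for (const initDataType of axes.initDataTypes) {
        for (const sessionType of axes.sessionTypes) {
          const combination: Combination = {
            keySystem: keySystem,
            container: container,
            initDataType: initDataType,
            sessionType: sessionType
          };
          if (claimsSupport(combination)) {
            result.push(combination);
          }
        }
      }
    }
  }
  return result;
}

// Example with a made-up user agent: MP4 pairs with "cenc", WebM pairs
// with "webm", and only temporary sessions are claimed.
const frameworkAxes: Axes = {
  keySystems: ["org.w3.clearkey"],
  containers: ["mp4", "webm"],
  initDataTypes: ["cenc", "webm"],
  sessionTypes: ["temporary", "persistent-license"]
};
const claimed = supportedCombinations(frameworkAxes, (c) =>
  c.sessionType === "temporary" &&
  ((c.container === "mp4" && c.initDataType === "cenc") ||
    (c.container === "webm" && c.initDataType === "webm"))
);
```

The "is it non-empty" test would then just check `claimed.length > 0`, and the other tests would iterate over `claimed`.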


Ideally, it would be possible to force such tests to run all supported variants. For example, Chrome might want to run the tests with both MP4 and WebM. encrypted-media-syntax.html, for example, tries the WebM and/or CENC types based on whether they are supported, requires all supported to pass, and ensures that at least one was run. This has the advantage of testing both paths when supported, though it's not verifiable anywhere that both ran. I don't know whether it would be useful to be able to say run all the tests with WebM then repeat with CENC.

Regarding the test content, it would be nice to use a common set of keys across all the tests and formats. This will simplify utility functions, license servers, debugging, etc. Also, we may want to keep the test files small.

For our part, we don't have a workflow to easily package content with a specific key / key ID. There is test MP4 content, cropped to ~10 seconds, in the branch linked below. Do you have a way to create a WebM file with the same key / key ID? I guess we could then hard code all the Clear Key messages.

...Mark



David

On Tue, Jun 21, 2016 at 9:16 PM, Mark Watson <watsonm@netflix.com<mailto:watsonm@netflix.com>> wrote:
All,

I have uploaded some additional EME test cases here: https://github.com/mwatson2/web-platform-tests/tree/clearkey-success/encrypted-media


I have not created a pull request, because there is overlap with the Blink tests.

I have taken a slightly different approach, which is to define one function, eme_success, which can execute a variety of different test cases based on a config object passed in. There are currently only four: temporary / persistent-usage-record with different ordering of setMediaKeys and setting video.src, but it is easy to add more with different initData approaches, different media formats and different keysystems.

What approach do we want to take? The Blink approach of a different file for every individual case will bloat as we add different session types, initData types, media formats and keysystems.

On the other hand, each of the Blink test cases is very straightforward to follow, whereas the combined one is less so.

My branch also includes some mp4 test content, the key for which is in the clearkeysuccess.html file.

...Mark

Received on Tuesday, 5 July 2016 23:47:43 UTC