
Re: Additional EME tests

From: Mark Watson <watsonm@netflix.com>
Date: Fri, 24 Jun 2016 17:45:22 -0700
Message-ID: <CAEnTvdC3boNecpgj7ZavnM_mPY+fVohRq6gQTPc_1OVFKbkWaw@mail.gmail.com>
To: Paul Cotton <Paul.Cotton@microsoft.com>
Cc: David Dorwin <ddorwin@google.com>, Greg Rutz <G.Rutz@cablelabs.com>, "public-html-media@w3.org" <public-html-media@w3.org>, Francois Daoust <fd@w3.org>, Philippe Le Hégaret <plh@w3.org>, John Rummell <jrummell@google.com>, Ralph Brown <r.brown@cablelabs.com>, John Simmons <johnsim@microsoft.com>, Iraj Sodagar <irajs@microsoft.com>
All,

I'm away for the coming week, but here are some things I have been
tinkering with that I hope may be useful:

The clearkeysuccess.html test here
<https://github.com/mwatson2/web-platform-tests/tree/clearkey-success/encrypted-media>
now passes for all session types on Mac Chrome and Firefox. I am not suggesting
that we use this test as is, since this discussion is leaning toward more
granular tests and a different approach to combinations, but it may be
useful to have something working to compare with / steal from.

To make this work, I had to provide three further scripts:

   - chrome-polyfill.js
   <https://github.com/mwatson2/web-platform-tests/blob/clearkey-success/encrypted-media/chrome-polyfill.js>
   fixes a small bug
   <https://bugs.chromium.org/p/chromium/issues/detail?id=622956> in Chrome
   related to a missing keystatuseschange event
   - firefox-polyfill.js
   <https://github.com/mwatson2/web-platform-tests/blob/clearkey-success/encrypted-media/firefox-polyfill.js>
   fixes a small bug <https://bugzilla.mozilla.org/show_bug.cgi?id=1282142>
   in Firefox related to requestMediaKeySystemAccess
   - clearkey-polyfill.js
   <https://github.com/mwatson2/web-platform-tests/blob/clearkey-success/encrypted-media/clearkey-polyfill.js>
   provides support for persistent sessions for Clear Key on browsers that do
   not support them natively
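
The three scripts share a common polyfill pattern, sketched below: wrap the native entry point and patch the result before handing it back to the test. The helper and parameter names here are illustrative, not the actual contents of those scripts.

```javascript
// Sketch of the polyfill pattern: wrap requestMediaKeySystemAccess so a
// browser-specific fix can be applied to the resulting access object.
// (Illustrative only; the real scripts patch different entry points.)
function installPolyfill(navigatorLike, patchAccess) {
  const original = navigatorLike.requestMediaKeySystemAccess.bind(navigatorLike);
  navigatorLike.requestMediaKeySystemAccess = function (keySystem, configs) {
    return original(keySystem, configs).then(
      access => patchAccess(access, keySystem));
  };
}
```

Taking the navigator-like object as a parameter keeps the wrapper testable outside a browser.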

I am not yet testing reloading of a persistent session after it has been
closed, but I'd be happy to add that when I get back.

Hope this is useful.

...Mark



On Fri, Jun 24, 2016 at 7:39 AM, Paul Cotton <Paul.Cotton@microsoft.com>
wrote:

> I am moving this EME Editor-only discussion to public-html-media@w3.org
> so that all HME WG members are aware of the discussions about an EME test
> suite.  Please continue the discussion on this thread.
>
>
>
> FTR the original discussion can be found starting at:
>
> https://lists.w3.org/Archives/Public/public-hme-editors/2016Jun/0100.html
>
>
>
> /paulc
>
> HME WG Chair
>
>
>
>
>
> *From:* David Dorwin [mailto:ddorwin@google.com]
> *Sent:* Thursday, June 23, 2016 1:46 PM
> *To:* Greg Rutz <G.Rutz@cablelabs.com>
> *Cc:* Mark Watson <watsonm@netflix.com>; public-hme-editors@w3.org;
> Francois Daoust <fd@w3.org>; Philippe Le Hégaret <plh@w3.org>; John
> Rummell <jrummell@google.com>; Ralph Brown <r.brown@cablelabs.com>
> *Subject:* Re: Additional EME tests
>
>
>
> We have tools to generate WebM files, which are DRM-independent. There are
> two included in the Google directory: test-encrypted.webm
> and test-encrypted-different-av-keys.webm. The key IDs and keys are in
> encrypted-media-playback-two-videos.html
> and encrypted-media-playback-multiple-sessions.html, respectively. For
> simplicity, I suggest using these for the encryption of CENC files. (We
> also have some CENC files, encrypted with these keys but I believe they
> only have initData for the common format and Widevine.) We should move the
> key IDs and keys to a common location, though. (The initData values in
> encrypted-media-utils.js appear to contain different dummy values.)
>
>
>
> Regarding Mark's comments about combinations, I don't think there is much
> allowance/expectation for variance in what combinations are supported,
> mainly because this would break the usability of the API. For example, if
> two containers are supported but only one supports a session type, a
> configuration containing the combination would be rejected. initData types
> within media are also constrained, but implementations are required to
> support generateRequest() of all supported initDataTypes regardless of the
> actual media.
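
As an illustration of why the axes travel together: they are all members of a single MediaKeySystemConfiguration dictionary, so a combination mixing a supported container with an unsupported session type is skipped as a whole. (A sketch; the codecs string and key system below are just examples.)

```javascript
// One candidate configuration ties the axes together; if any member of
// the combination is unsupported, the whole dictionary is skipped and,
// if no dictionary survives, the promise rejects with NotSupportedError.
const candidateConfigs = [{
  initDataTypes: ['cenc'],
  videoCapabilities: [{ contentType: 'video/mp4; codecs="avc1.42E01E"' }],
  sessionTypes: ['temporary', 'persistent-license'],
}];
// In a browser:
// navigator.requestMediaKeySystemAccess('org.w3.clearkey', candidateConfigs)
//   .then(access => { /* combination supported */ },
//         err => { /* no configuration survived */ });
```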
>
>
>
> I agree that we need to detect what is supported, at least for the simple
> spec test case, using some utility function. The Blink tests
> use getSupportedInitDataType(), etc., though there is probably room for
> improvement. See also my comments inline in Francois's reply, which I've
> copied here to unfork the thread.
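
A utility in that spirit might probe candidate initData types one at a time and settle on the first one the UA accepts. This is a sketch of the idea, not the actual Blink helper; the entry point is passed in so the logic can be exercised outside a browser.

```javascript
// Probe candidate initDataTypes in order and resolve with the first one
// the user agent claims to support. requestMKSA is expected to behave
// like navigator.requestMediaKeySystemAccess (injected for testability).
async function getSupportedInitDataType(requestMKSA, keySystem, candidates) {
  for (const type of candidates) {
    try {
      await requestMKSA(keySystem, [{
        initDataTypes: [type],
        videoCapabilities: [{ contentType: 'video/webm; codecs="vp8"' }],
      }]);
      return type;  // first supported candidate wins
    } catch (e) {
      // Not supported; fall through to the next candidate.
    }
  }
  throw new Error('No supported initDataType among: ' + candidates.join(', '));
}
```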
>
>
>
> On Thu, Jun 23, 2016 at 1:48 AM, Francois Daoust <fd@w3.org> wrote:
>
> Hi David,
>
> I've been wondering about the same things for the MSE test suite. Some
> comments inline.
>
>
> Le 23/06/2016 09:01, David Dorwin a écrit :
>
> For Blink, we tried to follow our understanding of the WPT style, which
> was that each test case be a separate file. In some cases, especially
> the syntax tests, there are multiple categories of tests together. I
> think readability is also important, even if it means duplication. (Of
> course, API changes or refactorings can be monotonous when they have to
> be applied to many files, but that should be rarer now.) As to which
> approach we take for new tests, I defer to the WPT experts.
>
>
> I don't qualify as WPT expert, but my understanding is that it is somewhat
> up to the people who write and review the tests. In the MSE test suite, a
> given test file often checks a particular algorithm and contains multiple
> test cases to check the different steps. I personally find that approach
> useful and readable as well.
>
> I think we probably do want individual tests for various media types,
> etc. For example, downstream users (i.e. user agent vendors) should be
> able to say "I know I don't support foo, so all the "-foo.html" tests
> are expected to fail." For tests that aren't specifically about a type
> (or key system), the tests should select a supported one and execute the
> tests.
>
>
> I quickly glanced at the HTML test suite for media elements to see how
> tests were written there:
>
> https://github.com/w3c/web-platform-tests/tree/master/html/semantics/embedded-content/media-elements
>
> Most test files seem to pick up a supported MIME type, using common
> functions defined in:
> https://github.com/w3c/web-platform-tests/blob/master/common/media.js
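
Those common helpers essentially reduce to a canPlayType probe along the following lines (a sketch; the helper name is illustrative, not the actual media.js API).

```javascript
// Return the first candidate MIME type the media element does not reject
// outright; canPlayType returns '' for unsupported types and 'maybe' or
// 'probably' otherwise.
function pickSupportedType(mediaElement, candidates) {
  return candidates.find(t => mediaElement.canPlayType(t) !== '') || null;
}
```

In a browser this would be called with `document.createElement('video')` and a list of candidate types.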
>
> There are exceptions to the rule, such as tests on the "canPlayType"
> method that contain test cases explicitly marked as "(optional)":
>
> http://w3c-test.org/html/semantics/embedded-content/media-elements/mime-types/canPlayType.html
>
> For MSE, most tests can be written without having to impose a particular
> MIME type (with a few exceptions as well, e.g. to test the "generate
> timestamps flag"), and it seems a good idea to keep the number of MIME-type
> specific tests minimal to improve the readability of the implementation
> report. Whenever possible, we need the MIME-agnostic version of the tests
> to assess the "at least two PASS" condition in the report.
>
> Ideally, it would be possible to force such tests to run all supported
> variants. For example, Chrome might want to run the tests with both MP4
> and WebM. encrypted-media-syntax.html, for example, tries both WebM
> and/or CENC types based on whether they are supported, requires all
> supported to pass, and ensures that at least one was run. This has the
> advantage of testing both paths when supported, though it's not
> verifiable anywhere that both ran. I don't know whether it would be
> useful to be able to say run all the tests with WebM then repeat with CENC.
>
>
> I've been wondering about that as well for MSE tests. Passing a test for a
> given MIME type does not necessarily imply that the test also passes if
> another supported MIME type gets used. It would make tests harder to write
> though (more error-prone, harder to debug, and slightly harder for user
> agent vendors to tell what failed in practice). It's often easier to create
> one test case per variant.
>
>
>
> After I sent this, I realized that example (encrypted-media-syntax.html)
> won't scale to larger tests, such as playback. It might be that this was
> the easiest way for us to add coverage without deciding on some larger
> infrastructure for running multiple variants.
>
>
> In the end, what could perhaps work is to create a
> "createGenericAndVariantTests" method which takes a list of variants as
> input, replaces the usual calls to "test" or "async_test", and generates a
> generic test case that picks up the first supported variant together with a
> set of variant test cases marked as optional that test the same thing for
> each and every variant.
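
That proposal could look roughly like the sketch below: one required generic case bound to the first supported variant, plus one "(optional)" case per variant. The name and shape are assumptions, since no such helper exists yet; a real version would register the cases via test()/async_test() rather than return plain objects.

```javascript
// Sketch of a createGenericAndVariantTests helper: returns descriptors
// for one generic (required) case using the first supported variant,
// plus an "(optional)" case for each variant so the optional ones can
// be filtered out of the implementation report.
function createGenericAndVariantTests(name, variants, isSupported) {
  const supported = variants.filter(isSupported);
  const cases = [];
  if (supported.length > 0) {
    cases.push({ name: name, variant: supported[0], optional: false });
  }
  for (const v of variants) {
    cases.push({ name: name + ' - ' + v + ' (optional)', variant: v, optional: true });
  }
  return cases;
}
```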
>
>
>
> I agree that some way to run the tests with variants is probably the ideal
> mechanism for general tests. (I'd still like to have -keyids.html, etc.
> tests where we are specifically testing those capabilities.) I also agree
> that the tests should be easy to write and maintain. Another option would
> be to make it possible for the "runner" to override the types that are
> automatically selected by the utility function that picks that type to test
> (e.g. getSupportedInitDataType()).
>
>
> The generic test case would give the result needed for the implementation
> report. The additional optional test cases could help user agent vendors
> detect additional issues with a particular variant and such tests should be
> easy to filter out from the implementation report as needed if they are
> consistently flagged with "(optional)".
>
> Francois.
>
>
>
> On Thu, Jun 23, 2016 at 10:08 AM, Greg Rutz <G.Rutz@cablelabs.com> wrote:
>
> I have a toolchain that can generate MP4 CENC content with multiple DRMs
> using the CastLabs DRMToday <http://drmtoday.com/> service.  With these
> tools I can select my own key/keyID, encrypt the content and ingest the key
> into the DRMToday license server (as many of you know, CastLabs has
> graciously agreed to provide an account to facilitate the W3C EME testing
> platform).  In addition, I have a very simple proxy server (required by the
> DRMToday architecture to sign license requests on behalf of the account
> owner) which can assign "rights" to each key to provide a customized
> license.  The system is quite flexible and we would be able to customize
> the rights for each test with only a single piece of content.  This may be
> valuable if we need to test key expiration or other rights-related
> operations that would be exposed to applications through the EME APIs.
>
>
>
> Please note that my toolchain has the following limitations and will
> require some development if we require more features:
>
>    - Only generates CENC initData (not DRM-specific variants).
>    - No WebM support
>    - ClearKey, PlayReady, and Widevine DRMs only. Notably missing:
>    Adobe Primetime and Apple FairPlay (both indicated as supported by CastLabs)
>
> CableLabs has volunteered my time to support the integration/use of these
> tools if we think they will be valuable.
>
>
>
> G
>
>
>
> On 6/23/16, 10:35 AM, "Mark Watson" <watsonm@netflix.com> wrote:
>
>
>
>
>
>
>
> On Thu, Jun 23, 2016 at 12:01 AM, David Dorwin <ddorwin@google.com> wrote:
>
> For Blink, we tried to follow our understanding of the WPT style, which
> was that each test case be a separate file. In some cases, especially the
> syntax tests, there are multiple categories of tests together. I think
> readability is also important, even if it means duplication. (Of course,
> API changes or refactorings can be monotonous when they have to be applied
> to many files, but that should be rarer now.) As to which approach we take
> for new tests, I defer to the WPT experts.
>
>
>
> I think we probably do want individual tests for various media types, etc.
> For example, downstream users (i.e. user agent vendors) should be able to
> say "I know I don't support foo, so all the "-foo.html" tests are expected
> to fail." For tests that aren't specifically about a type (or key system),
> the tests should select a supported one and execute the tests.
>
>
>
> Certainly, there need to be individual tests, but a single file can
> contain several tests. The test page reports for each file how many of the
> tests within passed and how many failed.​ In WebCrypto we have a file with
> 20,000 tests :-)
>
>
>
> However, I do like that the blink tests are small and easy to read.
> Another reason is that the WPT framework has a 60s timeout for any given
> file. Since it takes a few seconds to start and verify playback, we can't
> have too many tests in one file unless we can adjust this timeout.
>
>
>
> Ideally, we need to generalize on at least 5 axes, either by generalizing
> the tests as they are, or by creating new files with the different versions
> of each test:
>
> - test all the media types the browser claims to support
>
> - test all the initData types the browser claims to support
>
> - test all the session types the browser claims to support
>
> - test all the key systems the browser claims to support
>
> - for cenc, test both keysystem-specific and common format initData
>
>
>
> We do not need to test every possible combination of the above and we
> don't need to run every one of the existing blink tests for each of these
> combinations. But it is not straightforward to work out which combinations
> we do need and which tests need to run on multiple combinations.
>
>
>
> We perhaps need a utility function which calculates which combinations of
> the above a browser claims to support (as a subset of the combinations the
> test framework supports). There would then be one test which looks at the
> supported combinations and checks it is non-empty :-)
>
>
>
> The list of supported combinations would then be an input to at least some
> of the other tests, which would then test each combination individually.
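
The utility described here could start from a plain cross-product of the axes, filtered by whatever support probe the framework exposes. A sketch (the axis names and predicate are illustrative):

```javascript
// Enumerate the cross-product of the test axes, keeping only the
// combinations the browser claims to support (per a caller-supplied
// predicate, e.g. one built on requestMediaKeySystemAccess).
function supportedCombinations(axes, isSupported) {
  let combos = [{}];
  for (const [axis, values] of Object.entries(axes)) {
    combos = combos.flatMap(c => values.map(v => ({ ...c, [axis]: v })));
  }
  return combos.filter(isSupported);
}
```

The "non-empty" test then reduces to asserting that this list has at least one entry, and the other tests iterate over it.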
>
>
>
>
>
> Ideally, it would be possible to force such tests to run all supported
> variants. For example, Chrome might want to run the tests with both MP4 and
> WebM. encrypted-media-syntax.html, for example, tries both WebM and/or CENC
> types based on whether they are supported, requires all supported to pass,
> and ensures that at least one was run. This has the advantage of testing
> both paths when supported, though it's not verifiable anywhere that both
> ran. I don't know whether it would be useful to be able to say run all the
> tests with WebM then repeat with CENC.
>
>
>
> Regarding the test content, it would be nice to use a common set of keys
> across all the tests and formats. This will simplify utility functions,
> license servers, debugging, etc. Also, we may want to keep the test files
> small.
>
>
>
> For our part, we don't have a workflow to easily package content with a
> specific key / key id. There is test mp4 content, cropped to ~10 seconds,
> in the branch linked below. Do you have a way to create a WebM file with
> the same key / key id? I guess we could then hard code all the Clear Key
> messages.
>
>
>
> ...Mark
>
>
>
>
>
>
>
> David
>
>
>
> On Tue, Jun 21, 2016 at 9:16 PM, Mark Watson <watsonm@netflix.com> wrote:
>
> All,
>
>
>
> I have uploaded some additional EME test cases here:
> https://github.com/mwatson2/web-platform-tests/tree/clearkey-success/encrypted-media
>
>
>
> I have not created a pull request, because there is overlap with the Blink
> tests.
>
>
>
> I have taken a slightly different approach, which is to define one
> function, eme_success, which can execute a variety of different test cases
> based on a config object passed in. There are currently only four:
> temporary / persistent-usage-record with different ordering of setMediaKeys
> and setting video.src, but it is easy to add more with different initData
> approaches, different media formats and different keysystems.
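
For reference, the four configurations described could be expressed roughly as follows (field names are assumptions based on the description above, not the actual contents of the branch):

```javascript
// Hypothetical config objects driving a single eme_success() routine:
// session type crossed with the ordering of setMediaKeys vs. video.src.
const emeSuccessConfigs = [
  { sessionType: 'temporary', setMediaKeysFirst: true },
  { sessionType: 'temporary', setMediaKeysFirst: false },
  { sessionType: 'persistent-usage-record', setMediaKeysFirst: true },
  { sessionType: 'persistent-usage-record', setMediaKeysFirst: false },
];
function describeConfig(c) {
  return c.sessionType + ', ' +
    (c.setMediaKeysFirst ? 'setMediaKeys before src' : 'src before setMediaKeys');
}
```

New axes (initData types, media formats, key systems) would then become additional fields on these objects rather than new test files.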
>
>
>
> What approach do we want to take? The Blink approach of a different file
> for every individual case will bloat as we add different session types,
> initData types, media formats and keysystems.
>
>
>
> On the other hand, each of the Blink test cases is very straightforward to
> follow, whereas the combined one is less so.
>
>
>
> My branch also includes some mp4 test content, the key for which is in the
> clearkeysuccess.html file.
>
>
>
> ...Mark
>
Received on Saturday, 25 June 2016 00:45:57 UTC
