Re: Additional EME tests

While reviewing some changes to the Blink tests, I developed an idea for
how we might handle tests that are not format-specific.

In the utils file, getSimpleConfiguration() returns a dictionary containing
all initDataTypes and an audioCapabilities array that includes at least one
entry supported on all user agents (i.e. WebM with Vorbis and MP4 with
common AAC).
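
A minimal sketch of what such a helper might look like. The contentType
strings and the exact list of initDataTypes are my assumptions for
illustration, not what is currently in the utils file:

```javascript
// Hypothetical sketch of getSimpleConfiguration(); the values below are
// illustrative assumptions, not copied from the existing utils file.
function getSimpleConfiguration() {
    return {
        // Every initDataType the tests know about. The user agent keeps
        // only the ones it supports in the configuration it resolves with.
        initDataTypes: ['webm', 'cenc', 'keyids'],
        // One entry per container, so at least one should be supported.
        audioCapabilities: [
            { contentType: 'audio/webm; codecs="vorbis"' },
            { contentType: 'audio/mp4; codecs="mp4a.40.2"' }
        ]
    };
}
```

Since requestMediaKeySystemAccess() takes a sequence of configurations, the
dictionary would be passed wrapped in an array.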

For many of the tests, we can follow this pattern:

navigator.requestMediaKeySystemAccess(keySystem,
    [getSimpleConfiguration()]).then(function(access) {

    // Use the first initDataType the user agent reported as supported.
    var initDataType = access.getConfiguration().initDataTypes[0];

    // Create objects and do something with initDataType...
});


We might need some other helpers, for example one that lets us request
exactly "keyids" for tests that rely on that type.

For tests that require media, we can have getPlaybackConfiguration[s]()
that returns configurations for which we have media files. We would grab
the first supported one.

If we wanted to force particular format(s), we would just need to override
or trigger something in these helper functions so that they return only a
single value.
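
As a rough sketch of how the playback helper and the override could fit
together: the table entries, the test-encrypted.mp4 file name, the codec
strings, the forcedContainer parameter and getFirstSupportedPlayback() are
all hypothetical, and requestAccess is injected only so the sketch can be
exercised outside a browser.

```javascript
// Hypothetical table of configurations for which media files exist.
// File names and codec strings are assumptions for illustration.
var PLAYBACK_CONFIGURATIONS = [
    { container: 'webm', initDataType: 'webm',
      videoType: 'video/webm; codecs="vp8"',
      media: 'test-encrypted.webm' },
    { container: 'mp4', initDataType: 'cenc',
      videoType: 'video/mp4; codecs="avc1.42E01E"',
      media: 'test-encrypted.mp4' }
];

// Returns all entries, or only those for forcedContainer when it is given.
function getPlaybackConfigurations(forcedContainer) {
    return PLAYBACK_CONFIGURATIONS.filter(function(entry) {
        return !forcedContainer || entry.container === forcedContainer;
    });
}

// Tries the entries in order and resolves with the first one the key
// system accepts; rejects if none is accepted.
function getFirstSupportedPlayback(keySystem, requestAccess, forcedContainer) {
    var entries = getPlaybackConfigurations(forcedContainer);
    return entries.reduce(function(chain, entry) {
        return chain.catch(function() {
            var config = {
                initDataTypes: [entry.initDataType],
                videoCapabilities: [{ contentType: entry.videoType }]
            };
            return requestAccess(keySystem, [config]).then(function(access) {
                return { access: access, entry: entry };
            });
        });
    }, Promise.reject(new Error('No playback configuration available')));
}
```

In a test page, requestAccess would be
navigator.requestMediaKeySystemAccess.bind(navigator).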


The remaining issue is supporting multiple key systems. Determining which
key system(s) are supported requires calling
navigator.requestMediaKeySystemAccess(<varies>, getMinimalConfiguration())
repeatedly, something that is necessarily asynchronous. If we wrap that in
a promise-returning getKeySystem() helper, we'd have:

getKeySystem().then(function(keySystem) {

    return navigator.requestMediaKeySystemAccess(keySystem,
        [getSimpleConfiguration()]).then(function(access) {

        // Use the first initDataType the user agent reported as supported.
        var initDataType = access.getConfiguration().initDataTypes[0];

        // Create objects and do something with initDataType...
    });
});


This will always work, but it's extra complexity in every test. It also
only allows testing the first key system supported. That seems fine for
spec tests since we shouldn't need multiple key systems from a single UA to
demonstrate interop. (Clear Key would be last in our list and Clear
Key-specific tests would not need this key system detection.)
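
For illustration, a getKeySystem() helper along these lines might look like
the sketch below. The candidate list and the extra parameters are
assumptions on my part; requestAccess and the configurations are passed in
only so that the sketch can be run outside a browser.

```javascript
// Hypothetical candidate list; Clear Key is deliberately last.
var KEY_SYSTEMS = [
    'com.widevine.alpha',
    'com.microsoft.playready',
    'org.w3.clearkey'
];

// Probes the candidates one at a time and resolves with the first key
// system for which requestAccess() succeeds; rejects if none succeeds.
function getKeySystem(requestAccess, configurations, candidates) {
    candidates = candidates || KEY_SYSTEMS;
    return candidates.reduce(function(chain, keySystem) {
        return chain.catch(function() {
            return requestAccess(keySystem, configurations).then(function() {
                return keySystem;
            });
        });
    }, Promise.reject(new Error('No key systems to probe')));
}
```

In a test page this would be called as
getKeySystem(navigator.requestMediaKeySystemAccess.bind(navigator),
[getMinimalConfiguration()]). Probing sequentially keeps the ordering
deterministic, at the cost of one round trip per unsupported key system.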

On Thu, Jun 23, 2016 at 10:45 AM, David Dorwin <ddorwin@google.com> wrote:

> We have tools to generate WebM files, which are DRM-independent. There are
> two included in the Google directory: test-encrypted.webm
> and test-encrypted-different-av-keys.webm. The key IDs and keys are in
> encrypted-media-playback-two-videos.html
> and encrypted-media-playback-multiple-sessions.html, respectively. For
> simplicity, I suggest using these for the encryption of CENC files. (We
> also have some CENC files, encrypted with these keys but I believe they
> only have initData for the common format and Widevine.) We should move the
> key IDs and keys to a common location, though. (The initData values in
> encrypted-media-utils.js appear to contain different dummy values.)
>
> Regarding Mark's comments about combinations, I don't think there is much
> allowance/expectation for variance in what combinations are supported,
> mainly because this would break the usability of the API. For example, if
> two containers are supported but only one supported a session type, a
> configuration containing the combination would be rejected. initData types
> within media are also constrained, but implementations are required to
> support generateRequest() of all supported initDataTypes regardless of the
> actual media.
>
> I agree that we need to detect what is supported, at least for the simple
> spec test case, using some utility function. The Blink tests
> use getSupportedInitDataType(), etc., though there is probably room for
> improvement. See also my comments inline in Francois's reply, which I've
> copied here to unfork the thread.
>
> On Thu, Jun 23, 2016 at 1:48 AM, Francois Daoust <fd@w3.org> wrote:
>
>> Hi David,
>>
>> I've been wondering about the same things for the MSE test suite. Some
>> comments inline.
>>
>>
>> Le 23/06/2016 09:01, David Dorwin a écrit :
>>
>>> For Blink, we tried to follow our understanding of the WPT style, which
>>> was that each test case be a separate file. In some cases, especially
>>> the syntax tests, there are multiple categories of tests together. I
>>> think readability is also important, even if it means duplication. (Of
>>> course, API changes or refactorings can be monotonous when they have to
>>> be applied to many files, but that should be rarer now.) As to which
>>> approach we take for new tests, I defer to the WPT experts.
>>>
>>
>> I don't qualify as WPT expert, but my understanding is that it is
>> somewhat up to the people who write and review the tests. In the MSE test
>> suite, a given test file often checks a particular algorithm and contains
>> multiple test cases to check the different steps. I personally find that
>> approach useful and readable as well.
>>
>>
>> I think we probably do want individual tests for various media types,
>>> etc. For example, downstream users (i.e. user agent vendors) should be
>>> able to say "I know I don't support foo, so all the "-foo.html" tests
>>> are expected to fail." For tests that aren't specifically about a type
>>> (or key system), the tests should select a supported one and execute the
>>> tests.
>>>
>>
>> I quickly glanced at the HTML test suite for media elements to see how
>> tests were written there:
>>
>> https://github.com/w3c/web-platform-tests/tree/master/html/semantics/embedded-content/media-elements
>>
>> Most test files seem to pick up a supported MIME type, using common
>> functions defined in:
>> https://github.com/w3c/web-platform-tests/blob/master/common/media.js
>>
>> There are exceptions to the rule, such as tests on the "canPlayType"
>> method that contain test cases explicitly marked as "(optional)":
>>
>> http://w3c-test.org/html/semantics/embedded-content/media-elements/mime-types/canPlayType.html
>>
>> For MSE, most tests can be written without having to impose a particular
>> MIME type (with a few exceptions as well, e.g. to test the "generate
>> timestamps flag"), and it seems a good idea to keep the number of MIME-type
>> specific tests minimal to improve the readability of the implementation
>> report. Whenever possible, we need the MIME-agnostic version of the tests
>> to assess the "at least two PASS" condition in the report.
>>
>>
>> Ideally, it would be possible to force such tests to run all supported
>>> variants. For example, Chrome might want to run the tests with both MP4
>>> and WebM. encrypted-media-syntax.html, for example, tries both WebM
>>> and/or CENC types based on whether they are supported, requires all
>>> supported to pass, and ensures that at least one was run. This has the
>>> advantage of testing both paths when supported, though it's not
>>> verifiable anywhere that both ran. I don't know whether it would be
>>> useful to be able to say run all the tests with WebM then repeat with
>>> CENC.
>>>
>>
>> I've been wondering about that as well for MSE tests. Passing a test for
>> a given MIME type does not necessarily imply that the test also passes if
>> another supported MIME type gets used. It would make tests harder to write
>> though (more error-prone, harder to debug, and slightly harder for user
>> agent vendors to tell what failed in practice). It's often easier to create
>> one test case per variant.
>>
>
> After I sent this, I realized that example (encrypted-media-syntax.html)
> won't scale to larger tests, such as playback. It might be that this was
> the easiest way for us to add coverage without deciding on some larger
> infrastructure for running multiple variants.
>
>>
>> In the end, what could perhaps work is to create a
>> "createGenericAndVariantTests" method which takes a list of variants as
>> input, replaces the usual calls to "test" or "async_test", and generates a
>> generic test case that picks up the first supported variant together with a
>> set of variant test cases marked as optional that test the same thing for
>> each and every variant.
>>
>
> I agree that some way to run the tests with variants is probably the ideal
> mechanism for general tests. (I'd still like to have -keyids.html, etc.
> tests where we are specifically testing those capabilities.) I also agree
> that the tests should be easy to write and maintain. Another option would
> be to make it possible for the "runner" to override the types that are
> automatically selected by the utility function that picks that type to test
> (e.g. getSupportedInitDataType()).
>
>>
>> The generic test case would give the result needed for the implementation
>> report. The additional optional test cases could help user agent vendors
>> detect additional issues with a particular variant and such tests should be
>> easy to filter out from the implementation report as needed if they are
>> consistently flagged with "(optional)".
>>
>> Francois.
>>
>
> On Thu, Jun 23, 2016 at 10:08 AM, Greg Rutz <G.Rutz@cablelabs.com> wrote:
>
>> I have a toolchain that can generate MP4 CENC content with multiple DRMs
>> using the CastLabs DRMToday <http://drmtoday.com/> service.  With these
>> tools I can select my own key/keyID, encrypt the content and ingest the key
>> into the DRMToday license server (as many of you know, CastLabs has
>> graciously agreed to provide an account to facilitate the W3C EME testing
>> platform).  In addition, I have a very simple proxy server (required by the
>> DRMToday architecture to sign license requests on behalf of the account
>> owner) which can assign "rights" to each key to provide a customized
>> license.  The system is quite flexible and we would be able to customize
>> the rights for each test with only a single piece of content.  This may be
>> valuable if we need to test key expiration or other rights-related
>> operations that would be exposed to applications through the EME APIs.
>>
>> Please note that my toolchain has the following limitations and will
>> require some development if we require more features:
>>
>>    - Only generates CENC initData (not DRM-specific variants).
>>    - No WebM support
>>    - ClearKey, PlayReady, and Widevine DRMs only. Notable missing DRMs:
>>    Adobe Primetime, Apple FairPlay (both indicated as supported by CastLabs)
>>
>> CableLabs has volunteered my time to support the integration/use of these
>> tools if we think they will be valuable.
>>
>> G
>>
>> On 6/23/16, 10:35 AM, "Mark Watson" <watsonm@netflix.com> wrote:
>>
>>
>>
>> On Thu, Jun 23, 2016 at 12:01 AM, David Dorwin <ddorwin@google.com>
>> wrote:
>>
>>> For Blink, we tried to follow our understanding of the WPT style, which
>>> was that each test case be a separate file. In some cases, especially the
>>> syntax tests, there are multiple categories of tests together. I think
>>> readability is also important, even if it means duplication. (Of course,
>>> API changes or refactorings can be monotonous when they have to be applied
>>> to many files, but that should be rarer now.) As to which approach we take
>>> for new tests, I defer to the WPT experts.
>>>
>>> I think we probably do want individual tests for various media types,
>>> etc. For example, downstream users (i.e. user agent vendors) should be able
>>> to say "I know I don't support foo, so all the "-foo.html" tests are
>>> expected to fail." For tests that aren't specifically about a type (or key
>>> system), the tests should select a supported one and execute the tests.
>>>
>>
>> Certainly, there need to be individual tests, but a single file can
>> contain several tests. The test page reports for each file how many of the
>> tests within passed and how many failed. In WebCrypto we have a file with
>> 20,000 tests :-)
>>
>> However, I do like that the blink tests are small and easy to read.
>> Another reason is that the WPT framework has a 60s timeout for any given
>> file. Since it takes a few seconds to start and verify playback, we can't
>> have too many tests in one file unless we can adjust this timeout.
>>
>> Ideally, we need to generalize on at least 5 axes, either by generalizing
>> the tests as they are, or by creating new files with the different versions
>> of each test:
>> - test all the media types the browser claims to support
>> - test all the initData types the browser claims to support
>> - test all the session types the browser claims to support
>> - test all the key systems the browser claims to support
>> - for cenc, test both keysystem-specific and common format initData
>>
>> We do not need to test every possible combination of the above and we
>> don't need to run every one of the existing blink tests for each of these
>> combinations. But it is not straightforward to work out which combinations
>> we do need and which tests need to run on multiple combinations.
>>
>> We perhaps need a utility function which calculates which combinations of
>> the above a browser claims to support (as a subset of the combinations the
>> test framework supports). There would then be one test which looks at the
>> supported combinations and checks it is non-empty :-)
>>
>> The list of supported combinations would then be an input to at least
>> some of the other tests, which would then test each combination
>> individually.
>>
>>
>>>
>>> Ideally, it would be possible to force such tests to run all supported
>>> variants. For example, Chrome might want to run the tests with both MP4 and
>>> WebM. encrypted-media-syntax.html, for example, tries both WebM and/or CENC
>>> types based on whether they are supported, requires all supported to pass,
>>> and ensures that at least one was run. This has the advantage of testing
>>> both paths when supported, though it's not verifiable anywhere that both
>>> ran. I don't know whether it would be useful to be able to say run all the
>>> tests with WebM then repeat with CENC.
>>>
>>> Regarding the test content, it would be nice to use a common set of keys
>>> across all the tests and formats. This will simplify utility functions,
>>> license servers, debugging, etc. Also, we may want to keep the test files
>>> small.
>>>
>>
>> For our part, we don't have a workflow to easily package content with a
>> specific key / key id. There is test mp4 content, cropped to ~10 seconds,
>> in the branch linked below. Do you have a way to create a WebM file with
>> the same key / key id ? I guess we could then hard code all the Clear Key
>> messages.
>>
>> ...Mark
>>
>>
>>
>>>
>>> David
>>>
>>> On Tue, Jun 21, 2016 at 9:16 PM, Mark Watson <watsonm@netflix.com>
>>> wrote:
>>>
>>>> All,
>>>>
>>>> I have uploaded some additional EME test cases here:
>>>> https://github.com/mwatson2/web-platform-tests/tree/clearkey-success/encrypted-media
>>>>
>>>> I have not created a pull request, because there is overlap with the
>>>> Blink tests.
>>>>
>>>> I have taken a slightly different approach, which is to define one
>>>> function, eme_success, which can execute a variety of different test cases
>>>> based on a config object passed in. There are currently only four:
>>>> temporary / persistent-usage-record with different ordering of setMediaKeys
>>>> and setting video.src, but it is easy to add more with different initData
>>>> approaches, different media formats and different keysystems.
>>>>
>>>> What approach do we want to take ? The Blink approach of a different
>>>> file for every individual case will bloat as we add different session
>>>> types, initData types, media formats and keysystems.
>>>>
>>>> On the other hand, each of the Blink test cases is very straightforward
>>>> to follow, whereas the combined one is less so.
>>>>
>>>> My branch also includes some mp4 test content, the key for which is in
>>>> the clearkeysuccess.html file.
>>>>
>>>> ...Mark
>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>
>

Received on Friday, 24 June 2016 03:05:43 UTC