RE: EME Initialization Data Correlation

Hi!

I will first try to lay out my interpretation of the current EME draft and how it applies to this scenario, then offer some proposals for improvement.

The core question in this case appears to be what exactly should trigger a license request. EME provides us some mechanisms:

·         HTMLMediaElement.encrypted event is raised when Initialization Data is encountered in the stream.

·         MediaKeySession.keyStatuses provides information about keys known to the CDM.

·         MediaKeySession.generateRequest() enables the application to trigger a license request based on Initialization Data.

The application can use keyStatuses to determine whether a license request is needed, thus avoiding unnecessary (and potentially failing) requests, assuming keyStatuses is even implemented (I seem to recall it being a recent addition).
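
To make that check concrete, here is a minimal sketch in TypeScript. It is not taken from the spec: KeyStatusLookup and needsLicenseRequest are names invented for illustration, the interface models only the single lookup used here, and the sketch assumes the application already knows which key IDs it needs.

```typescript
// Minimal stand-in for the map exposed as MediaKeySession.keyStatuses.
// Only the lookup used below is modelled (hypothetical interface name).
interface KeyStatusLookup {
  get(keyId: Uint8Array): string | undefined;
}

// Returns true if at least one required key is not reported as "usable".
// Assumption: the application already knows which key IDs it needs.
function needsLicenseRequest(
  keyStatuses: KeyStatusLookup,
  requiredKeyIds: Uint8Array[],
): boolean {
  return requiredKeyIds.some((keyId) => keyStatuses.get(keyId) !== "usable");
}
```

An application would call this before generateRequest() and skip the request when it returns false.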

However, there does appear to be one significant factor missing: nothing actually ensures that the application is informed of what keys are required to play back the content. Certainly, we cannot assume that the application is able to parse Initialization Data. The only real alternative that currently exists is to assume that we are dealing with DASH and that there is a ContentProtection element of the mp4protection type that provides a default_KID attribute.

This assumption seems like a stretch, and nothing mandates that the information is actually present. Furthermore, if multiple keys are needed, one would need to correlate which key goes with which Initialization Data and whoa, that is going down a very slippery slope. The application should deal with business logic, not with DRM details.

The above capabilities also fail to account for key rotation – what if the key changes mid-stream and a new license request is required? Okay, if the key information is sufficiently well encoded in the DASH manifest (new period at key change, with new ContentProtection default_KID), this is potentially doable (though the question of key ID and Initialization Data synchronization remains), but having EME depend on something totally external and nonmandatory like that seems undesirable.
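
For reference, the manifest-based workaround described above could look roughly like the following sketch. It assumes the manifest is available as text and that the mp4protection ContentProtection element (schemeIdUri "urn:mpeg:dash:mp4protection:2011") carries a default_KID attribute; extractDefaultKids is a made-up name, and the regular expressions stand in for a real XML parser only to keep the sketch dependency-free.

```typescript
// Collects the default_KID values found on mp4protection ContentProtection
// elements in a DASH manifest given as text. A regex is used only to keep
// the sketch free of dependencies; a real player would use an XML parser.
function extractDefaultKids(manifestXml: string): string[] {
  const kids: string[] = [];
  const elements = manifestXml.match(/<ContentProtection\b[^>]*>/g) || [];
  for (const element of elements) {
    // Skip DRM-specific ContentProtection elements.
    if (element.indexOf("urn:mpeg:dash:mp4protection:2011") === -1) continue;
    const kid = /default_KID="([^"]+)"/.exec(element);
    if (kid) {
      const normalized = kid[1].toLowerCase();
      if (kids.indexOf(normalized) === -1) kids.push(normalized);
    }
  }
  return kids;
}
```

Note that this returns an empty list whenever the attribute is absent, which is exactly the fragility discussed above.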

All in all, I can definitely see how the current workflow makes life rather difficult for players. Therefore, I propose that we move part of this responsibility to the CDM. Looking back in history, I see that HTMLMediaElement.encrypted used to be called HTMLMediaElement.needkey, so I wonder if this topic has already been covered in the past? Would be very interested to learn of the history that brought about the current workflow.

The mainstream scenarios that I see as representative, as far as license acquisition is concerned, would be:

1.     Reactive license acquisition – the application wishes to acquire a license when the CDM informs it that one is required (presumably as a response to receiving Initialization Data and determining that it does not possess one or more of the required keys).

2.     Proactive license acquisition – the application wishes to acquire a license regardless of the CDM’s opinion (e.g. perhaps business logic states that a new 48-hour license is to be acquired every 24 hours, leaving a 24 hour overlap and an extra 24 hours for use when not connected to the license server).

3.     Key rotation – the key changes in the middle of the stream and a new license may be required (or not). Reactive acquisition is the main point of interest here, with the assumption being that proactive acquisition works the same way as in the previous scenario.

My proposal to enable this in a way that makes player development simple would be:

1.     Enable all Initialization Data to be directly fed to the CDM from within the stream and from outside it (e.g. the DASH manifest), for unspecified processing by the CDM.

a.    MediaKeySession.loadInitData(initData, initDataType) would enable it to be fed from the manifest.

b.    HTMLMediaElement.encrypted event -> MediaKeySession.loadInitData(initData, initDataType) would enable it to be fed from the stream (potentially a direct feed could be mandated, one that would not require application action?).

2.     Enable the CDM to signal that a key is required, at any time. Presumably this would be in response to loadInitData() but this should not be mandated to allow flexible CDM implementation.

a.    MediaKeySession.OnKeyNeeded(keyId, initData) or similar – Initialization Data must be provided by the CDM even if the event is not in response to loadInitData, because generateRequest() requires Initialization Data. This would mean the CDM must synthesize the Initialization Data for any such case.

3.     Remove the requirement for Initialization Data to be static.

a.    Seems unnecessary in general, plus I don’t quite see how that would work with key rotation.
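
To give a feel for the shape of this proposal, here is a rough TypeScript sketch. Everything in it is hypothetical: loadInitData() and the key-needed event do not exist in the EME draft, the names ProposedSession, KeyNeededEvent and attachReactiveAcquisition are invented for illustration, and carrying initDataType on the event is my own assumption so that the handler has what generateRequest() needs.

```typescript
// Hypothetical event raised by the CDM when it decides a key is required.
interface KeyNeededEvent {
  keyId: Uint8Array;
  // Synthesized by the CDM when the event is not a response to loadInitData().
  initData: Uint8Array;
  // Assumption: included so the handler can pass it on to generateRequest().
  initDataType: string;
}

// Hypothetical additions to MediaKeySession (points 1 and 2 above).
interface ProposedSession {
  loadInitData(initData: Uint8Array, initDataType: string): void;
  onkeyneeded: ((event: KeyNeededEvent) => void) | null;
}

// Wires up reactive acquisition and returns a function that forwards
// Initialization Data to the CDM without the application parsing it.
function attachReactiveAcquisition(
  session: ProposedSession,
  acquireLicense: (event: KeyNeededEvent) => void,
): (initData: Uint8Array, initDataType: string) => void {
  session.onkeyneeded = acquireLicense;
  // The returned function is meant to be called both for manifest data and
  // from the media element's "encrypted" event handler.
  return (initData, initDataType) => session.loadInitData(initData, initDataType);
}
```

The application never looks inside the Initialization Data here; it only forwards it and reacts when the CDM asks for a key.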

What would this achieve?

·         The application can be totally “dumb” when processing Initialization Data, simply feeding it to the CDM from the manifest and whenever any HTMLMediaElement.encrypted event is raised.

·         The application can easily implement reactive license acquisition by waiting for the CDM to detect that a key is required and then performing acquisition.

o    Some application-side bookkeeping is required, since the same key may be required for multiple streams, so presumably multiple keyNeeded events for the same key would be raised, which could all be satisfied by a single license request.

·         The application can still request licenses proactively, providing the Initialization Data and key ID (which it has to source itself but that’s expected for the proactive case).

·         Reactive acquisition in case of key rotation is enabled, as the CDM will simply detect and signal the key change (either in response to Initialization Data or its own logic). Proactive acquisition is not affected by key rotation, so all is well with that case.
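
The application-side bookkeeping mentioned above could be quite small. A minimal sketch, assuming key IDs arrive as byte arrays with the hypothetical key-needed event; LicenseRequestTracker is an invented name:

```typescript
// Remembers which key IDs already have a license request in flight or
// satisfied, so repeated key-needed signals trigger one request per key.
class LicenseRequestTracker {
  private readonly handled = new Set<string>();

  // Returns true only the first time a given key ID is seen.
  shouldRequest(keyId: Uint8Array): boolean {
    const id = keyId.join(",");
    if (this.handled.has(id)) return false;
    this.handled.add(id);
    return true;
  }

  // Call when a request fails or a key expires so it can be requested again.
  forget(keyId: Uint8Array): void {
    this.handled.delete(keyId.join(","));
  }
}
```

The key-needed handler would call shouldRequest(keyId) first and only perform a license request when it returns true.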

This appears to greatly simplify application development and at the same time provide a clearer pattern for situations such as key rotation.


Cheers,

Sander Saares | Advisor
Axinom | Soola 8 | 51013 Tartu | Estonia
phone: +49 911 80109-54 | saares@axinom.com

Managing Directors: Sergei Gussev, Oleg Knut | District Court Tartu, Reg.11046287

From: Greg Rutz [mailto:G.Rutz@cablelabs.com]
Sent: Friday, October 9, 2015 7:32 PM
To: public-html-media@w3.org
Subject: EME Initialization Data Correlation

I’m the maintainer for the DRM support system in the dash.js<https://github.com/Dash-Industry-Forum/dash.js> open source, MSE/EME-based, MPEG-DASH player library.  A DRM-related bug<https://github.com/Dash-Industry-Forum/dash.js/issues/709> was filed recently against the player and I have some doubts about how to solve the problem.  The behavior of EME is somewhat responsible for the symptoms behind the problem and so I’m looking to this community to provide some guidance.

Initialization Data (PSSH) may be carried in both the DASH manifest and in the media stream, and for certain DRM systems, they may not be byte-for-byte copies of each other even though they represent the exact same keys/rights.  If the application can not discern the difference between two instances of Initialization Data, it can not prevent redundant requests to the license server.
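
As an illustration of the limitation described above, the only comparison an application can make without understanding the blob is a byte-wise one. A minimal sketch (sameInitData is a made-up helper name, not something from dash.js or EME):

```typescript
// Byte-for-byte comparison of two Initialization Data blobs.
function sameInitData(a: Uint8Array, b: Uint8Array): boolean {
  if (a.length !== b.length) return false;
  for (let i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) return false;
  }
  return true;
}
```

Two blobs that describe the same keys but differ in any byte compare as different, so a check like this cannot suppress the redundant license request on its own.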

Since the PSSH is almost always an opaque blob of data recognizable only by the CDM and license server, should the EME spec mandate that the CDM only send ‘encrypted’ events when the CDM actually needs the application to take action?  Or is this something that the InitializationData-specific format specs need to provide? (how to exactly correlate Initialization data so the app knows when to discard ‘encrypted’ events)

Also, should the EME spec call out that CDMs should not produce an error if they are updated with key information that they already have?  I mention this because one of the major browsers does send such an error (which leads to the bug report described above).

=-=-=-=-=-=-=-=-=-=-=-=-=
Greg Rutz
Lead Architect
CableLabs
303-931-6769 (m)
303-661-3796 (o)

Received on Monday, 12 October 2015 06:35:13 UTC