Re: [EME] Switching decoders when the key system is specified from Steven Robertson on 2012-10-03 (public-html-media@w3.org from October 2012)

From: Steven Robertson <strobe@google.com>
Date: Wed, 3 Oct 2012 16:09:29 -0700
To: Aaron Colwell <acolwell@google.com>
Cc: David Dorwin <ddorwin@google.com>, "<public-html-media@w3.org>" <public-html-media@w3.org>
Message-ID: <CAJtuSCvGBBUZGbJiX-uJTOxbNGu2DFBNZK0eo0bXwhGH+duE-A@mail.gmail.com>

Comments inline. The gist of it is that I like the conceptual purity of making sure
applications have as few requirements to worry about as possible with
regard to changing key systems, but I feel that the tradeoff in terms of
ubiquity and compatibility of implementations is too large to allow this.

EME with this restriction (along with MSE) is attainable for CE systems
integrators in this and the next few generations because it's mostly a
browser-level change on current devices. Doing EME to spec without any of
the proposed restrictions would require end-to-end integration for device
makers, which for them is four or five suppliers deep.

On Wed, Oct 3, 2012 at 1:09 PM, Aaron Colwell <acolwell@google.com> wrote:

> I don't think a switch in the middle of a GOP would be a problem. In
> the case of MSE at least, encryption changes requires an init segment
> and then a new media segment to get appended. Since media segments are
> required to start at a random access point you'll always have a
> keyframe so initializing a new decoder shouldn't be a problem.
>

I think this might be a little optimistic. On CE devices, a key system
switch can entail things like switching the media processor used (yep,
"processor": some implementations run trusted media decode on a separate
core from normal media decode, which may be on a separate package from a
different supplier), renegotiating display connection for HDCP and
disabling analog outputs, disabling audio mixing so that audio can be
forwarded with encryption, and tearing down the read-write graphics stack
and replacing it with one that treats all operations as write-only.

Most devices have only one CDM, but those CDMs usually have more limited
format support and impose restrictions on the platform when in use, so
running media through the key system all the time would have a high cost
too.


> >> Some options:
> >>
> >> 1. Require the key system to be specified before loading and/or decoding
> >> starts. If it is not specified by this time, it cannot be set later, meaning
> >> decryption would not be possible. This would likely reduce the utility of
> >> the needkey event.
>
> I think this is too restrictive and I don't really support it.
>

This restriction, or option 5 or 6, is likely a necessary compromise to earn broad
adoption on TVs in this upcoming product cycle.


> >> 5. Switch immediately and drop frames if necessary.
>
> I don't really understand what this is proposing, but it feels like a
> quality of implementation issue.
>
> >> 6. Suggest the above to applications and make it a quality of implementation
> >> issue for applications.
>
> I think switches between encrypted & non-encrypted content should be
> allowed. How well different scenarios work can be a quality of
> implementation issue. For example I think the two following situations
> should be allowed:
>
> Scenario 1: Early notification of encryption
> 1. append init segment that signals a key system
> 2. append init segment that signals unencrypted content
> 3. append media segments with unencrypted content
> 4. append init segment that signals a key system
> 5. append media segments with encrypted content
>
> Scenario 2: Just-in-time notification of encryption
> 1. append init segment that signals unencrypted content
> 2. append media segments with unencrypted content
> 3. append init segment that signals a key system
> 4. append media segments with encrypted content
>
> I could see some implementations supporting Scenario 1 slightly better
> than Scenario 2  because it allows the UA to start the "needkey" dance
> earlier and, depending on the UA's media pipeline implementation,
> could avoid a decoder reinitialization. I don't think this means that
> we should prevent Scenario 2 though.
>

YouTube will actually require support for clear-start with Media Source, in
order to cover up key exchange latency. However, we're guaranteeing that
our app will provide an indication of the selected key system before the
transition to NETWORK_LOADING.
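
To make the clear-start arrangement concrete, here is a rough sketch of the rule as I read it: the app may name a key system only before loading begins, and after that it can append clear or encrypted content freely. This is a toy model, not browser code. `ClearStartSession`, `declareKeySystem`, `beginLoading`, `append`, and the key system string `com.example.drm` are all names made up for illustration; none of them come from the EME or MSE drafts.

```typescript
// Hypothetical model of "declare the key system before loading starts".
type Segment =
  | { kind: "init"; keySystem: string | null } // null means unencrypted
  | { kind: "media" };

class ClearStartSession {
  private declaredKeySystem: string | null = null;
  private loading = false;

  // The app names its key system; only honored before loading begins.
  declareKeySystem(keySystem: string): boolean {
    if (this.loading) return false;
    this.declaredKeySystem = keySystem;
    return true;
  }

  // Stands in for the point where loading starts.
  beginLoading(): void {
    this.loading = true;
  }

  // Returns true if the segment is acceptable under the restriction.
  append(segment: Segment): boolean {
    if (!this.loading) return false;
    if (segment.kind === "media") return true;
    return (
      segment.keySystem === null ||
      segment.keySystem === this.declaredKeySystem
    );
  }
}

// Clear content first, then encrypted content (clear-start).
const clearThenEncrypted: Segment[] = [
  { kind: "init", keySystem: null },
  { kind: "media" },
  { kind: "init", keySystem: "com.example.drm" },
  { kind: "media" },
];

// Key system declared up front: every append is accepted.
const early = new ClearStartSession();
early.declareKeySystem("com.example.drm");
early.beginLoading();
const earlyOk = clearThenEncrypted.every((s) => early.append(s));

// Key system declared too late: the encrypted init segment is refused.
const late = new ClearStartSession();
late.beginLoading();
const lateDeclared = late.declareKeySystem("com.example.drm");
const lateOk = clearThenEncrypted.every((s) => late.append(s));
```

In this model the first session accepts the whole sequence, while the second refuses both the late declaration and the encrypted init segment, so the platform learns of the protected path before any decoder is set up.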

> If an application is generating some sort of dynamic playlist, it may
> not know whether encryption will eventually be part of the
> presentation at the start. It may be appending content far enough
> ahead of the current playback position though so there will be plenty
> of time for the "needkey" handshake to happen before the content is
> actually played. If this isn't done quick enough then playback should
> stall until the UA has the keys it needs to continue. This behavior
> should be incentive enough for the application to notify the UA of
> encryption initialization segments as early as possible.
>

Not sure how much of a use-case there is for third-party remixing of
protected content; seems to me that those concepts are mostly orthogonal
due to licensing restrictions. If a content provider has the licenses to
enable remixing of protected content, they can probably distribute that
content with support for the same key system.

> >> 7. Leave the behavior undefined, making it a quality of implementation issue
> >> for user agents.
>
> I don't think this should be left completely undefined since that
> would likely cause interoperability problems. I think a reasonable
> compromise is to allow switches between unencrypted/encrypted content,
> but limit it to a single key system.
>

Agreed heartily that this shouldn't be left undefined.
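
For what it's worth, the compromise Aaron describes (switching between unencrypted and encrypted content is fine, but only one key system per presentation) is cheap to check. A minimal sketch, assuming each init segment can be reduced to either a key system name or "unencrypted"; `InitSignal`, `firstKeySystemConflict`, and the `com.example.*` names are hypothetical and not part of any spec:

```typescript
// null stands for an init segment that signals unencrypted content.
type InitSignal = string | null;

// Returns the index of the first init segment that names a second key
// system, or -1 if the whole sequence stays within a single key system.
function firstKeySystemConflict(inits: InitSignal[]): number {
  let chosen: string | null = null;
  for (let i = 0; i < inits.length; i++) {
    const ks = inits[i];
    if (ks === null || ks === undefined) continue; // clear content is always allowed
    if (chosen === null) {
      chosen = ks; // the first key system seen becomes the only one allowed
    } else if (ks !== chosen) {
      return i;
    }
  }
  return -1;
}

// Scenario 1: early notification of encryption.
const scenario1 = firstKeySystemConflict([
  "com.example.a",
  null,
  "com.example.a",
]);

// Scenario 2: just-in-time notification of encryption.
const scenario2 = firstKeySystemConflict([null, "com.example.a"]);

// A switch to a second key system, which the compromise would disallow.
const mixed = firstKeySystemConflict(["com.example.a", null, "com.example.b"]);
```

Both of Aaron's scenarios pass this check (result -1), and only the sequence that introduces a second key system is flagged, at index 2.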

Thanks,
Steve
Received on Wednesday, 3 October 2012 23:10:37 UTC