Re: [EME] Separating key session creation from the media element (using a MediaKeySession constructor) from Lewis, Jason on 2012-06-29 (public-html-media@w3.org from June 2012)

From: Lewis, Jason <Jason.Lewis@disney.com>
Date: Thu, 28 Jun 2012 21:39:53 -0700
To: Mark Watson <watsonm@netflix.com>, David Dorwin <ddorwin@google.com>, "public-html-media@w3.org" <public-html-media@w3.org>
Message-ID: <CC127EF5.4DAB0%jason.lewis@disney.com>
David, I'd like to voice my support for this new approach as well.  The second variant seems cleaner, and it is the one that would make more sense to me if I were writing an application against this API.
-- jason lewis

From: Mark Watson <watsonm@netflix.com>
Date: Thu, 28 Jun 2012 13:13:29 -0700
To: David Dorwin <ddorwin@google.com>, "public-html-media@w3.org" <public-html-media@w3.org>
Subject: RE: [EME] Separating key session creation from the media element (using a MediaKeySession constructor)

Hi David,

Thanks for writing this up and the detailed analysis.

I like this approach, specifically the second variant (which I think aligns with the conclusion that a session should receive a single initData value).

Regarding the relationship with the key release parts, I agree it can simplify that as well. The MediaKeySession object would need a release() method, which would cause it to generate the 'proof of key release' message and wait for the 'server ack of proof of key release' message to come back. (Obviously, if it had previously received the ack then it wouldn't send the 'proof of key release'.) We could use the existing keymessage and addKey() mechanisms for message passing (I am still hoping for a better name than addKey() ...), but according to our earlier discussion I think we need a synchronous version so this can be done from unload events.

The one additional thing, though, is that the CDM is expected to hang on to 'proofs of key release' until they are ack-ed - even for days or weeks. So there would be a need for the app to get MediaKeySession objects for all the un-acked proofs of key release. An app will generally do this on startup, to clear out any that are left from previous sessions that did not close gracefully. It would be given MediaKeySession objects for all the sessions created by the same origin that have not had their proof of key release acked.
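
To make the bookkeeping above concrete, here is a minimal in-memory sketch. It is an illustration only, not part of the proposal: ReleaseStore, SketchSession, pendingFor() and the message strings are hypothetical names, and using addKey() to deliver the server ack is just the assumption floated above.

```javascript
// Hypothetical model of the CDM-side store of un-acked proofs of key release.
class ReleaseStore {
  constructor() {
    this.unacked = new Map(); // sessionId -> { origin, proof }
  }
  // Remember a proof until the server acknowledges it.
  add(sessionId, origin, proof) {
    this.unacked.set(sessionId, { origin, proof });
  }
  // Forget the proof once the ack has arrived.
  ack(sessionId) {
    return this.unacked.delete(sessionId);
  }
  // Session ids for one origin whose proofs are still waiting for an ack.
  pendingFor(origin) {
    return [...this.unacked]
      .filter(([, entry]) => entry.origin === origin)
      .map(([sessionId]) => sessionId);
  }
}

// Hypothetical stand-in for a MediaKeySession with a release() method.
class SketchSession {
  constructor(store, sessionId, origin) {
    this.store = store;
    this.sessionId = sessionId;
    this.origin = origin;
    this.acked = false;
    this.onkeymessage = null;
  }
  // Generate the proof of key release, unless the ack was already received.
  release() {
    if (this.acked) return false;
    const proof = "proof-of-release:" + this.sessionId;
    this.store.add(this.sessionId, this.origin, proof);
    if (this.onkeymessage) this.onkeymessage({ message: proof });
    return true;
  }
  // Assumption: the server ack is passed back through addKey().
  addKey(_serverAck) {
    this.store.ack(this.sessionId);
    this.acked = true;
  }
}
```

On startup, an app in this model would call pendingFor() with its own origin and resend the proof for each session id that comes back.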

...Mark

________________________________
From: David Dorwin [ddorwin@google.com]
Sent: Tuesday, June 26, 2012 6:29 PM
To: public-html-media@w3.org
Subject: [EME] Separating key session creation from the media element (using a MediaKeySession constructor)

In today's teleconference [1], we decided to create a session object [2] with video.generateKeyRequest() rather than video.createKeySession() or something similar. While working on the change proposal, I realized that there are good reasons to consider separating object creation from the media element (something we had not discussed). I think the potential benefits are compelling enough to revisit this issue. Hopefully we can have a quick discussion in email and move forward with one of these solutions. If you are interested in the API design, please provide feedback and/or state your preference soon. In parallel, I'll continue working on the change proposal (ETA is now tomorrow).

The new option is to create session objects using "new MediaKeySession(keySystem, mimeType)". For example:
  function handleNeedKey(event) {
    var session = new MediaKeySession(keySystem, mimeType);
    if (session) {
      session.onkeymessage = handleKeyMessage;
      session.onkeyerror = handleKeyError;
      session.generateKeyRequest(initData);
      var video = event.target;
      video.addKeySession(session);
    }
  }

Another variant would be to add initData to the MediaKeySession constructor and specify that the constructor generates a key request. This would eliminate the need for generateKeyRequest() and have most of the benefits of the video.generateKeyRequest() solution discussed in the teleconference. For example:
  function handleNeedKey(event) {
    var session = new MediaKeySession(keySystem, mimeType, initData);
    if (session) {
      session.onkeymessage = handleKeyMessage;
      session.onkeyerror = handleKeyError;
      var video = event.target;
      video.addKeySession(session);
    }
  }

Advantages of this approach

 1.  This might make more sense if we eventually decide to support sharing sessions between media elements (https://www.w3.org/Bugs/Public/show_bug.cgi?id=16615 and https://www.w3.org/Bugs/Public/show_bug.cgi?id=17202).
    *   Any of the "session = video.foo()" solutions result in an implicit relationship.
 2.  It might enable initiating key exchange before the media element starts loading. [3]
 3.  The resulting object could be used for Key Release (https://www.w3.org/Bugs/Public/show_bug.cgi?id=17199) without needing to create a "dummy" media element (http://lists.w3.org/Archives/Public/public-html-media/2012Jun/0107.html) or defining a separate object just for key release (http://dvcs.w3.org/hg/html-media/raw-file/tip/encrypted-media/encrypted-media.html#key-release-manager).
 4.  MediaSource is using this model ("new MediaSource"; http://lists.w3.org/Archives/Public/public-html-media/2012Jun/0071.html), and it would be nice to have consistency.
 5.  In the first variant only, explicit separation of creation from actions.

Effects of this approach
The primary effect is that the session object is not implicitly associated with a media element.

Because initData should contain all the information (i.e. the appropriate ISO CENC PSSH) necessary to obtain a license and verify that the specified key system is supported, the session object should not need to be associated with a specific element (or source file/stream) until its keys are needed to decrypt content.

However, the separation does require the following:

 *   The MIME type must be explicitly specified to the session object.
    *   This is similar to the MediaSource constructor, which takes a type string.
    *   For the case where the object is created in response to a needkey event, we could even provide the current MIME type as an attribute of the event.
    *   Since the MIME type is specified separately, the media element would need to fail addKeySession() or only use a session if the types were compatible.
    *   (This would be required anyway if we wanted to address advantage #2 above. [3])
 *   Session objects must be implicitly or explicitly associated with media element(s).
    *   This is similar to MediaSource, which provides a URL to video.src.
    *   Implicit example: All media elements may use all sessions.
    *   Explicit example: video.addKeySession(session);
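
As a rough sketch of the explicit association and the type check described above, the following uses hypothetical stand-in classes (SketchKeySession, SketchMediaElement) rather than real browser objects. Treating two types as "compatible" when their base MIME types match is an assumption made only for this example; the proposal does not define compatibility.

```javascript
// Strip parameters such as codecs and normalize case: 'video/mp4; codecs="..."' -> 'video/mp4'.
function baseType(mimeType) {
  return mimeType.split(";")[0].trim().toLowerCase();
}

// Hypothetical stand-in for the proposed session object.
class SketchKeySession {
  constructor(keySystem, mimeType) {
    this.keySystem = keySystem;
    this.mimeType = mimeType;
  }
}

// Hypothetical stand-in for a media element that knows its current MIME type.
class SketchMediaElement {
  constructor(mimeType) {
    this.mimeType = mimeType;
    this.sessions = [];
  }
  // Fail addKeySession() when the session was created for an incompatible type.
  addKeySession(session) {
    if (baseType(session.mimeType) !== baseType(this.mimeType)) {
      throw new TypeError(
        "session type " + session.mimeType + " is not compatible with " + this.mimeType
      );
    }
    this.sessions.push(session);
  }
}
```

The alternative mentioned above, where the element accepts the session but simply never uses it for an incompatible source, would replace the throw with a check at decryption time.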

Compared to the video.generateKeyRequest() solution:

 *   The second variant (constructor generates a key request) is very similar.
 *   The first variant (constructor does NOT generate a key request):
    *   Requires one or two more lines of application code during initialization.
    *   Requires that a session object be created first for each key system in the use case where generateKeyRequest(keySystem, initData) can be called repeatedly until a supported combination is found.
       *   This seems acceptable. [4]
    *   No longer implicitly enforces that generateKeyRequest() is called before addKey().
       *   We could just choose not to enforce this but instead require that all key systems support generateKeyRequest() by returning a keymessage so that applications can always follow this pattern. That was the main goal of this requirement anyway.
    *   No longer implicitly enforces that a session object represents exactly one initData value.
       *   Instead, subsequent generateKeyRequest() calls should explicitly fail, at least if a previous call was successful.
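
The last point, where a session explicitly rejects a second generateKeyRequest() after a successful one, could look roughly like this. SingleRequestSession is a hypothetical stand-in, and both the error types and the "non-empty Uint8Array" validity check are assumptions made for the sketch.

```javascript
// Hypothetical session that represents at most one initData value.
class SingleRequestSession {
  constructor(keySystem, mimeType) {
    this.keySystem = keySystem;
    this.mimeType = mimeType;
    this.initData = null; // set only after a successful generateKeyRequest()
  }

  generateKeyRequest(initData) {
    // Explicit failure: a previous call already succeeded.
    if (this.initData !== null) {
      throw new Error("generateKeyRequest() already succeeded for this session");
    }
    // Assumed validity check; a failed call leaves the session reusable.
    if (!(initData instanceof Uint8Array) || initData.length === 0) {
      throw new TypeError("initData must be a non-empty Uint8Array");
    }
    this.initData = initData;
  }
}
```

Because the state is recorded only on success, a call that fails validation can be retried, which matches the "at least if a previous call was successful" wording.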


[1] http://www.w3.org/2012/06/26-html-media-minutes.html

[2] See http://lists.w3.org/Archives/Public/public-html-media/2012Jun/0054.html and https://www.w3.org/Bugs/Public/show_bug.cgi?id=16613

[3] Currently, the media element must have started loading before generateKeyRequest() can be called. WebKit, at least, does not create the underlying media engine until loading starts. Even if we could work around this, the media element would need to somehow know the MIME type it will be provided so that the user agent or the CDM can parse the initData.  If we separate the CDM from the media element, implementations should be able to create the underlying objects immediately regardless of the media implementation.

[4] If the combinations passed to generateKeyRequest() fail

 *   Synchronously:
    *   Creating an object to call this function doesn't seem too bad for the application. It should just be an extra line of code.
    *   Note: Failing synchronously in any design may require the user agent to be able to parse at least part of the initData and determine whether it contains data for one of the key systems it supports.
    *   Assuming that creation of the session object causes the related CDM to spin up, CDMs may be spun up unnecessarily in the case that the generateKeyRequest() call fails. Fortunately, this would happen asynchronously to the application and should not affect performance.
 *   Asynchronously:
    *   Creating an object is likely to be a very minor part of the logic required to support this.
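
For the synchronous case in [4], the "try each key system until one works" loop might be sketched as follows. The createSession factory parameter is hypothetical; it stands in for "new MediaKeySession(keySystem, mimeType)" so the sketch does not depend on an object that exists only in the proposal. It also assumes that an unsupported combination makes generateKeyRequest() throw.

```javascript
// Return the first session whose generateKeyRequest(initData) succeeds, or null.
// createSession(keySystem, mimeType) is a caller-supplied factory (hypothetical).
function createFirstSupportedSession(createSession, keySystems, mimeType, initData) {
  for (const keySystem of keySystems) {
    // One extra object per attempt, as noted in [4].
    const session = createSession(keySystem, mimeType);
    try {
      session.generateKeyRequest(initData);
      return session;
    } catch (err) {
      // Unsupported combination: fall through and try the next key system.
    }
  }
  return null;
}
```

With the second variant (initData passed to the constructor), the same loop would wrap the constructor call itself in the try block instead.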
Received on Friday, 29 June 2012 04:40:16 UTC