Re: [Bug 17660] Alternate mechanism -- (was Re: Request to add parameters to createSession) from Joe Steele on 2012-11-01 (public-html-media@w3.org from November 2012)

From: Joe Steele <steele@adobe.com>
Date: Thu, 1 Nov 2012 11:44:46 -0700
To: David Dorwin <ddorwin@google.com>
CC: Mark Watson <watsonm@netflix.com>, Martin Soukup <martin.soukup@irdeto.com>, Steven Robertson <strobe@google.com>, "<public-html-media@w3.org>" <public-html-media@w3.org>
Message-ID: <9CB571F7-980F-436F-BA7D-B23903353752@adobe.com>

On Oct 31, 2012, at 3:53 PM, David Dorwin wrote:

On Wed, Oct 31, 2012 at 6:28 PM, Joe Steele <steele@adobe.com> wrote:
On Oct 31, 2012, at 9:36 AM, Mark Watson wrote:

On Oct 31, 2012, at 5:17 PM, "Joe Steele" <steele@adobe.com> wrote:

On Oct 31, 2012, at 8:39 AM, Mark Watson wrote:

On Oct 31, 2012, at 4:09 PM, Joe Steele wrote:

Nothing _has_ to be changed for this to work.

However for the reasons I stated below (uniformity, error handling) it would be useful to give some guidance about how to do this in the spec. This could be handled by a question/answer in the FAQ and some example code. This would indicate to developers that the app is allowed to intercept, modify and respond to the needKey requests in the manner described.

I think we'd have to see the text.

[steele] I will write something up and send it out for comment.

It is of course up to the application what to do with the keymessage. There's a desire for keysystem-independence in the client code, so that I can write my client application to behave the same for all keysystems. I'm not sure I understand how I would use a keysystem of the kind you described together with other keysystems?

The way we imagine our application working is that the user would already be "logged in" to our application and we would have cookies, tokens or whatever that ensure that login is maintained through all server transactions. We'll just send the keymessage to our application servers and ask the appropriate keysystem-specific back-end to process it and give us a message to pass back.

If I understand rightly, you're considering a scenario where user login and authentication does not take place until playback is requested. The app needs to handle that before sending the keymessage to the application server - at least it does so for all other key systems, so it would probably like to do the same thing for this one.

[steele] Yes - that is essentially the scenario. In this scenario, we do not know what kind of authentication is needed if any, prior to starting the key request cycle with the CDM. Furthermore, any authentication that the application has done with the application server may be irrelevant if the license server uses a different authentication mechanism (which is not uncommon).

If the keysystem needs this authentication information, wouldn't it be possible for the back-end keysystem-specific system to place it in the message that it returns?

[steele] Yes -- the request for additional information (authentication is one example) could come in the initData or as a result of an addKey. This is where the problem comes in. The only mechanism for the CDM to signal it needs additional information is via the needkey event. And the implication from the documentation is that the appropriate response to a needkey event is _always_ to send a network message to the destinationURI.

destinationURL is informative. The draft says "An application may override this."

[steele] This text is not clear to me. I read this as "the application may replace the URI with its own", which I don't believe is what you intend. I think what you should say here is - "the application may choose how to use this value. For example, this value can be used as part of a network request."

The response to a needkey is to create a key session with the initData. That does not require server interaction.

[steele] You are correct -- I should have said the "keymessage" event, not the "needkey" event. It is the keymessage event that is currently implied to need network interaction.

Sending it to a server is the most common case, but it's not required. For instance, Example 8.1 doesn't use the network at all. There are expected/common use cases, but the application behavior is generally not defined. As Mark has mentioned, following these use cases will make it easier for application developers and content providers, but they are not the only way to use the APIs.

[steele] You are absolutely right. I have ignored that example multiple times because of the text in the title "(Clear Key Encryption)". Can we move that into the body of the example? That example completely ignores destinationURI but it does provide the direct response I am looking for. I can live without an explicit example of the application parsing the destinationURI directly.

My point is that there are cases where the application can and should provide the information directly - not as a result of a network request. And I would like to make that clear in the spec.

We don't specify how application functions are distributed. When you do get to the key message stage, this message needs to be passed to some key-system-specific application function. If you have an objective of keysystem-independent client code, that function is necessarily on the server. But if you accept keysystem-specific functionality on the client side, then that function can be on the client.

I think what you are describing is something like a keysystem-aware client-side keymessage proxy that will pass some keymessages up to the server and handle others locally.

[steele] Exactly
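
[Editorial sketch, not part of the original message] The "keysystem-aware client-side keymessage proxy" described above could look roughly like the following. This is a minimal sketch under stated assumptions: `routeKeyMessage`, `LocalHandler` and `Routed` are hypothetical names, not anything defined by the draft, and the actual call to addKey or the network request is left to the caller.

```typescript
// Hypothetical types and names; nothing here is defined by the EME draft.
// A local handler builds the addKey response on the client, without a server.
type LocalHandler = (message: Uint8Array) => Uint8Array;

// Outcome of routing one keymessage.
type Routed =
  | { kind: "local"; response: Uint8Array }
  | { kind: "server"; destinationURL: string; message: Uint8Array };

// Decide whether a keymessage is answered locally or passed to a server.
// The caller would pass `response` to addKey in the "local" case, or send
// `message` to `destinationURL` in the "server" case.
function routeKeyMessage(
  keySystem: string,
  message: Uint8Array,
  destinationURL: string,
  localHandlers: Map<string, LocalHandler>
): Routed {
  const handler = localHandlers.get(keySystem);
  if (handler !== undefined) {
    return { kind: "local", response: handler(message) };
  }
  return { kind: "server", destinationURL, message };
}
```

Key systems without a registered local handler fall through to the server path, so keysystem-independent client code keeps working unchanged.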

There's nothing we need to say in the spec to enable this and we could certainly clarify the point above that we don't specify the functional distribution of applications (we can't).

[steele] Clarification is exactly what I am asking for. Some text along these lines would work for me --

"The key acquisition process may involve the web page handling keymessage<http://dvcs.w3.org/hg/html-media/raw-file/tip/encrypted-media/encrypted-media.html#dom-keymessage> events ++ by calling addKey with a response immediately, or ++ by sending the message to a Key System-specific service and calling addKey<http://dvcs.w3.org/hg/html-media/raw-file/tip/encrypted-media/encrypted-media.html#dom-addkey> with the response message."

In a related issue -- I would also like to see this note removed from the createSession() method --

"Note: cdm must not use any data, including media data<http://dev.w3.org/html5/spec/video.html#media-data>, not provided via initData.".

This is too restrictive, since the CDM may do things like signing the key request with an embedded key, which would add data not provided by initData. I can enter a separate bug for this, but I thought I would bring it up here because this thread gives you the right context to think about why.

As mentioned in issue 19805, "data" was intended to refer to stream-specific data. We can try to find some less-limiting wording.

But I would caution against designing a keysystem that *requires* client-side keysystem-specific code, because you are asking customers of that keysystem to lock themselves in to that keysystem.

...Mark

Or are you considering a case where the application code and server architecture is specific to one keysystem?

Sorry if this is repeating some of the earlier discussion, but I'm not sure I understand how these keysystem-specific client behaviors would sit beside the normal keysystem-independent ones.

…Mark

Joe Steele
steele@adobe.com

On Oct 30, 2012, at 2:02 PM, Mark Watson wrote:

On Oct 30, 2012, at 9:16 PM, "Joe Steele" <steele@adobe.com> wrote:

Let me start a side thread on this with an alternate proposal.

What if instead of adding a new parameter to createSession to allow for application data to be passed to the CDM, we explicitly allow for the CDM to request data directly from the application via an established URI scheme?

Take this example:
A media source is loaded and the media stack decides it needs a key. The application decides that it wants to use the "com.foo.keysystem" keysystem. It calls createSession, passing the initData from the media stack. The CDM then examines the initData and determines that an in-band authentication needs to occur. The CDM then fires a needkey event setting the destinationURI to be something like "app://com.foo.keysystem?username&password". The app is watching for destinationURIs beginning with "app://", which it then handles directly rather than sending a network request. The app parses the URI into the keysystem and the request portions. The app then decides how it wants to respond to the key request and returns the information requested via the addKey method.
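
[Editorial sketch, not part of the original message] The URI-parsing step in this example could be sketched as below. It assumes the "app://" scheme proposed here and the query layout shown in the sample URI (field names separated by "&"); `parseAppDestination` and `AppKeyRequest` are hypothetical names, and no such scheme is defined by the draft.

```typescript
// Hypothetical helper; the "app://" scheme is only a proposal in this thread.
interface AppKeyRequest {
  keySystem: string; // e.g. "com.foo.keysystem"
  fields: string[];  // names of the items the CDM is asking the app for
}

const APP_SCHEME = "app://";

// Returns null when the URI is not addressed to the application, in which
// case the app would treat it as an ordinary network destination.
function parseAppDestination(destinationURI: string): AppKeyRequest | null {
  if (!destinationURI.startsWith(APP_SCHEME)) {
    return null;
  }
  const rest = destinationURI.slice(APP_SCHEME.length);
  const queryStart = rest.indexOf("?");
  const keySystem = queryStart === -1 ? rest : rest.slice(0, queryStart);
  const query = queryStart === -1 ? "" : rest.slice(queryStart + 1);
  // Drop empty segments so "a&&b" or a trailing "&" do not yield blank names.
  const fields = query.split("&").filter((f) => f.length > 0);
  return { keySystem, fields };
}
```

An app could compare `keySystem` against the key system it selected and show an error message if the CDM asks for a field it does not know how to supply.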

This would put some of the burden back on the app developers, in terms of possibly needing to encode multiple pieces of information into the single key parameter. However it would be better than my current proposal in the sense that the app needs to have less information when createSession is called. And I think it would satisfy the concerns about fragmentation since only the CDMs that want this behavior would generate key requests like this. I would prefer codifying the scheme that will trigger app handling of the URI (e.g. "app://" followed by the keysystem) to enforce some level of uniformity and allow for app developers to display reasonable error messages should the CDM do something they are not expecting.
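
[Editorial sketch, not part of the original message] The burden of "encoding multiple pieces of information into the single key parameter" might be handled as follows. The wire format here (percent-encoded `name=value` pairs joined by "&", stored as ASCII bytes) is purely an assumption for illustration, and `packFields` and `unpackFields` are hypothetical names; a real CDM would define its own format.

```typescript
// Hypothetical encoding; the thread does not define any format for this.
// Pack several named values into one byte array suitable for a single
// key parameter. Keys are sorted so the output is deterministic.
function packFields(fields: Record<string, string>): Uint8Array {
  const text = Object.keys(fields)
    .sort()
    .map((k) => `${encodeURIComponent(k)}=${encodeURIComponent(fields[k] ?? "")}`)
    .join("&");
  // After percent-encoding the text is pure ASCII, so one char is one byte.
  return new Uint8Array(Array.from(text, (c) => c.charCodeAt(0)));
}

// Inverse of packFields, e.g. for the receiving side or for tests.
function unpackFields(bytes: Uint8Array): Record<string, string> {
  const text = Array.from(bytes, (b) => String.fromCharCode(b)).join("");
  const out: Record<string, string> = {};
  for (const pair of text.split("&")) {
    const eq = pair.indexOf("=");
    if (eq === -1) continue; // skip empty or malformed segments
    out[decodeURIComponent(pair.slice(0, eq))] = decodeURIComponent(pair.slice(eq + 1));
  }
  return out;
}
```

Percent-encoding keeps "&" and "=" inside values from colliding with the separators, which is the main hazard when several values share one parameter.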

This would address the issue I am concerned about. Would this be an acceptable alternative?

Assuming this is acceptable -- how can we reflect this in the spec?

Does anything need to be changed in the spec to support what you describe above?

...Mark

Joe Steele
steele@adobe.com

Received on Thursday, 1 November 2012 18:46:08 UTC