Re: [SUMMARY] Do we need loadSession? from Joe Steele on 2014-08-20 (public-html-media@w3.org from August 2014)

From: Joe Steele <steele@adobe.com>
Date: Wed, 20 Aug 2014 22:46:07 +0000
To: Mark Watson <watsonm@netflix.com>
CC: David Dorwin <ddorwin@google.com>, "<public-html-media@w3.org>" <public-html-media@w3.org>, "Jerry Smith (WINDOWS)" <jdsmith@microsoft.com>
Message-ID: <9B3E185A-B767-4468-AC7B-00AC238717F1@adobe.com>
On Aug 19, 2014, at 4:58 PM, Mark Watson <watsonm@netflix.com> wrote:

> 
> On Tue, Aug 19, 2014 at 3:26 PM, Joe Steele <steele@adobe.com> wrote:
> Inline — 
> 
> Joe
> 
> On Aug 18, 2014, at 6:50 PM, David Dorwin <ddorwin@google.com> wrote:
> 
>> Thanks, Joe. This is helpful.
>> 
>> In all of these cases, it seems the application is involved in making this decision. In #1, the user (via the app) "pins" the content. In #2 (not combined with #3), the app may proactively acquire licenses on the user's behalf. In #3, the app could proactively request a license containing all the keys.
> 
> For #1 you are correct, this is typically something driven by the application, since only the user knows when an offline period is imminent. The exception is when apps do this all the time.
> 
> For #2 and #3 the application is typically not involved in the decision at all. The content publisher knows the characteristics of the content and the expected server load, and makes the decision about how to issue keys for this content before the application is ever launched. 
> 
>> 
>> From an EME POV, #3 is probably most naturally accomplished by requesting separate licenses for each title, but I can see how that might be counterproductive to the goal of reducing license server load. Creating a session with initData containing key IDs (as instructed by the page/server) for each title might reduce server load (in terms of raw number of requests). A similar effect could be achieved on the license server, but that seems like the wrong place, especially if the service wants to optimize across clients.
>> 
>> If it is reasonable for the application to request (or at least be aware that it is requesting) persisted licenses, the questions are:
>> How should the application make the request?
> 
> I think a signal in the createSession() call that tells the CDM that the application will allow (or not allow) persistent licenses would work well. That would allow for the license server to provide persistent licenses and the CDM to make the call as to whether to persist them or not based on license server policy and input from the application. 
>> How should the application later retrieve (and eventually remove) such licenses?
> 
> I think the natural way for applications to use licenses (not sure what you mean by retrieve) is for the application to provide the metadata for the content it is trying to play and the CDM to make the determination of whether it has the licenses on hand or needs to request new ones. It would be good to also allow the application to constrain the CDM via createSession params to only using licenses on hand or only requesting new licenses.
> 
> I think the natural way for applications to remove licenses in use during a session is to make a removeLicenses() call. This should have the effect of removing any licenses (persistent or otherwise) used for that content stream. In order to handle the case where the licenses were not properly removed (e.g. when a session was ended abruptly), the application can simply call createSession() with a signal to the CDM not to request new licenses, then call removeLicenses() on the new session, which should remove any licenses that were still valid.
> 
> I think I have seen multiple incompatible alternatives described as "natural" in this debate. I'm not sure it's a very strong argument, as what's probably meant is "natural consequence of a bunch of unstated assumptions / expectations / objectives / implementation constraints". Or at least, what's "natural" to one person may be "unnatural" to another.

[steele] Point taken.

> 
> On this question, like pretty much anything in this API, we are plotting a course between two extremes: on the one hand we could have had an entirely context-free, semantics-free, CDM-specific message passing API, rather like what was done in the Hbb specs and similar. You'd need a bunch of CDM-specific application code to do anything and no two CDMs would work the same way. On the other hand, we could completely define CDM functionality and behaviour, rigorously define what is in a license and the functions it drives and only leave out the actual technology used to secure the protocol and encode the messages.
> 
> I'd suggest that the only reason for not following the latter approach is a desire to retain support for this effort from multiple UA / DRM vendors. That is, to accommodate existing DRMs without significant modification.

[steele] There is some truth to this, but it is not the only reason. In order to more rigidly define the CDM, the license contents, the license request protocol, etc. we would need to come to agreement on a good (or good enough) definition for these things. See your comment above on what is “natural”. Happy to discuss more on a different thread though. 

> 
> The specific question at hand is the extent to which we explicitly model the handling of persisted information - both licenses themselves and the evidence of key removal: should that aspect of CDM functionality be opaque to the application, happening automatically as a result of CDM interaction with its server peer, or should it be explicitly visible to and controlled by the application (of course with CDM policing things) ?
> 
> [Note that we may have a disconnect as to what we mean by "application", since I was completely thrown by a reference to things happening 'before the application is even launched'. I don't see how that makes sense at all. I consider the application to be launched when you visit netflix.com - before this there is no CDM loaded at all.]

[steele] By “before the application is launched” I did mean before the CDM is launched. The content publisher makes decisions about how to package the content for delivery prior to anyone actually discovering it via the site. For example, content policies may have been decided, content may have been encrypted, content keys may have been pushed to a key server, etc. These decisions will impact the expected server load and the playback start time.

> 
> It seems to me that the application should be able to track or easily determine what content it has / had keys for and what it does not. For example, the application may want to pre-fetch keys for content items it identifies as likely to be played, but that is not necessary if the keys are already available. It would be nice to know this without async requests, but if requests are required then they should return an answer quickly in both cases. That suggests we should have a deterministic way to discover whether or not a key exchange is necessary for a given piece of content. But both models provide that: loadSession() will fail if the session is not in fact still available and a new session created with createSession() will either immediately indicate usable key ids, fire a keymessage or fire a keys change.

[steele] I agree the application may need to be able to determine whether it has keys available already. It does not follow in my mind that it needs to be done by loading old sessions. Instead of the application tracking session IDs (which the underlying DRM is not required to understand), the application should be tracking key IDs and/or initData (which the underlying DRM is required to understand). That is essentially my proposal above. With this set of changes loadSession() is not required, and it removes the issue of figuring out what is in a “session” and what is persisted or cleared when a “session” is released.
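
To make the "track initData, not session IDs" idea concrete, here is a rough TypeScript sketch of the application-side bookkeeping. It is only an illustration under stated assumptions: KeyUsePolicy, TrackedLicense, LicenseTracker, toHex and choosePolicy are hypothetical names, and the policy values merely mirror the options discussed in this thread. None of them exist in the EME draft.

```typescript
// Hypothetical policy values mirroring the options discussed in the thread.
// These are NOT part of the EME draft.
type KeyUsePolicy = "existing-only" | "new-only" | "existing-or-new";

// Illustrative shape for whatever the application wants to remember
// alongside a license request.
interface TrackedLicense {
  title: string;
  requestedAt: number; // milliseconds since epoch
}

// Encode initData bytes as a hex string so they can be used as a Map key.
function toHex(bytes: Uint8Array): string {
  return Array.from(bytes, (b) => b.toString(16).padStart(2, "0")).join("");
}

// Application-side record of outstanding licenses, keyed by initData
// rather than by a CDM-issued session ID.
class LicenseTracker {
  private readonly records = new Map<string, TrackedLicense>();

  // Remember that a license was requested for this initData.
  track(initData: Uint8Array, record: TrackedLicense): void {
    this.records.set(toHex(initData), record);
  }

  // Drop the record, e.g. after the licenses were removed.
  forget(initData: Uint8Array): boolean {
    return this.records.delete(toHex(initData));
  }

  // True when the application has no record of a license for this content,
  // so pre-fetching may be worthwhile.
  needsPrefetch(initData: Uint8Array): boolean {
    return !this.records.has(toHex(initData));
  }
}

// Pick the value to pass to a (hypothetical) createSession option:
// offline, only keys already on hand can work; otherwise let the CDM decide.
function choosePolicy(online: boolean): KeyUsePolicy {
  return online ? "existing-or-new" : "existing-only";
}
```

The tracker is only a hint for the application; under this model the CDM remains the authority on which keys are actually on hand.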

> 
> So, here is an off-the-wall idea for how to encapsulate the different existing DRM approaches behind a common API. This idea may not work, but then it may inspire others: Suppose we remove loadSession() and sessionID and instead optionally allow the MediaKeySession object to support structured clone. These objects could then be stored in IndexedDB for persistence. In implementations that support explicit management of sessions by sessionId, the data backing the stored MediaKeySession would basically be the sessionId and when you retrieved the object from IndexedDB the UA would interact with the CDM in the same way that loadSession() does. On an implementation which did not support explicit management of sessions on the CDM interface, the stored MediaKeySession would be backed by the initData and retrieving the object from IndexedDB would amount to interacting with the CDM in the same way that createSession() does.

[steele] I think that applications should not be managing “sessions”. I believe that sessions should be ephemeral constructs.
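
The cleanup flow described earlier in this thread (createSession() with a "do not request new licenses" signal, followed by removeLicenses()) fits that ephemeral model. Below is a rough TypeScript sketch of it against an in-memory stand-in for a CDM. Everything in it is an assumption made for illustration: removeLicenses(), the allowNewLicenses option, CdmLike, FakeCdm and cleanUpLeftovers are hypothetical and do not exist in the EME draft.

```typescript
// Hypothetical session surface: the EME draft has no removeLicenses().
interface EphemeralSession {
  // Returns how many licenses were removed (an assumption of this sketch).
  removeLicenses(): number;
}

// Hypothetical CDM surface with the proposed createSession() signal.
interface CdmLike {
  createSession(
    initData: string,
    opts: { allowNewLicenses: boolean }
  ): EphemeralSession;
}

// In-memory stand-in for a CDM license store, keyed by initData.
class FakeCdm implements CdmLike {
  private readonly stored: Set<string>;

  constructor(persisted: string[]) {
    this.stored = new Set<string>(persisted);
  }

  createSession(
    initData: string,
    opts: { allowNewLicenses: boolean }
  ): EphemeralSession {
    // Pretend a license is acquired only when new requests are allowed.
    if (opts.allowNewLicenses) this.stored.add(initData);
    const stored = this.stored;
    return {
      removeLicenses: () => (stored.delete(initData) ? 1 : 0),
    };
  }

  has(initData: string): boolean {
    return this.stored.has(initData);
  }
}

// Recovery after an abrupt shutdown: open a session that may not fetch
// anything new, then remove whatever licenses it found on hand.
function cleanUpLeftovers(cdm: CdmLike, trackedInitData: string[]): number {
  let removed = 0;
  for (const initData of trackedInitData) {
    const session = cdm.createSession(initData, { allowNewLicenses: false });
    removed += session.removeLicenses();
  }
  return removed;
}
```

No session ID is stored or looked up here; the application only needs the initData it retained.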

> 
> I'm short of a proposal as to how the second kind of implementation would handle secure proof of key release.
> 
> Comments ?
> 
> ...Mark
> 
> 
> 
> 
>  
> 
> If the above was implemented - I believe there would be no need for loadSession() nor for any concept of “persistent sessions”. Applications could track their outstanding licenses by retaining the initData used to request them along with any other data desired.
> 
> 
>> EME currently has sessionType and loadSession(sessionId), which have already been used to implement #1 and appear to address Mark's key release model. #2 and #3 might be more naturally implemented using some type of key repository, but I don't believe that is consistent with the EME design and the other application models it supports.
>> 
>> David
>> 
>> On Mon, Aug 18, 2014 at 3:00 PM, Joe Steele <steele@adobe.com> wrote:
>> In the last telco I was asked to provide some use cases for persistent keys. My earlier comments were entangled with other issues, so I am restating here rather than linking to them. 
>> 
>> Here are the most common use cases for persistent licenses that we run into:
>> 
>> 1) Offline playback
>> Customer wants to download a movie and acquire licenses for playback offline, typically while traveling. In this case a license is issued for that user on that particular device with a limited lifetime. Possibly with a limit on the number of active licenses as well.
>> 
>> 2) Accelerated playback of live streams
>> Customer wants to be able to play back a live stream (e.g. a sporting event) without having to acquire a license right then. In this case the user periodically authenticates to acquire a limited lifetime license (e.g. 24 hours) which allows them to play back content anytime they come back to that page until the license expires.
>> 
>> 3) Reduced license server load
>> Content publisher wants to lower their cost of deployment for license servers. They can do this by issuing a bundle of licenses  with limited lifetimes for content which the user _might_ play. This results in fewer trips to the license server, since when a title is selected the user may already have the required license available.  
>> 
>> I will point out that very often 2 and 3 are related. For example during popular sporting events, a large number of people will try to get a license to view that event moments before it starts. Not only can that make acquiring the license take longer for each individual customer, the “license storm” of simultaneous requests can take down the server.
>> 
>> Joe
>> 
>> On Aug 4, 2014, at 10:37 PM, Joe Steele <steele@adobe.com> wrote:
>> 
>>> I would like feedback from the rest of the group on whether they are or will be loading persistent keys using loadSession the way the spec describes.
>>> 
>>> more replies inline — 
>>> 
>>> Joe
>>> 
>>> On Jul 31, 2014, at 1:10 PM, David Dorwin <ddorwin@google.com> wrote:
>>> 
>>>> The current design is based on many discussions and consideration of a variety of use cases that involve persisting data related to sessions. It should provide a single solution for all such use cases. We were aware that the license policy determines whether keys may be persisted and that DRM systems have traditionally relied on this alone. Such systems are supported by the current normative algorithms.
>>>> 
>>>> The group can work on alternative interoperable solutions that address these use cases, but they need to meet the same requirements.
>>> 
>>> Let’s spell out those requirements. Given that a new one just came out (i.e. that applications need to be able to associate local information to send along with the key release receipts), I think there is a good chance there is still learning to be done here. 
>>> 
>>> As examples of persistent key requirements not being met — 
>>> * Allowing the application to indicate that the CDM should NOT use existing content keys and rather acquire new ones
>>> * Allowing the application to indicate that the CDM should use ONLY existing content keys and not acquire new ones
>>> * Allowing the application to indicate that the CDM may use existing content keys or acquire new ones as needed (this could be construed as the default now).
>>> * Allow the application to request a “blanket key release” that indicates the CDM does not have any content keys in its possession
>>> 
>>>> I think it’s premature to assume that loadSession() is the wrong solution, especially since there is no complete proposal that addresses the main (key persistence) use case in a well-defined interoperable way.
>>> 
>>> I don’t assume it is the wrong solution. But it does not seem to be a good fit for the way my CDM persists keys and my customers use cached keys. 
>>> 
>>> It seems like you are saying that allowing the CDM to choose when to persist keys and load persisted keys in createSession would be a problem for interoperability. I believe that it is easier for applications to not always have to track what keys the CDM may already have access to. Some applications want that level of control (e.g. for key release) but some do not. I do not see this as an interop problem. Can you explain what you see as a problem?
>>> 
>>>> 
>>>> Specific comments inline.
>>>> 
>>>> On Mon, Jul 28, 2014 at 2:31 PM, Joe Steele <steele@adobe.com> wrote:
>>>> I would like to summarize this thread so far.
>>>> 
>>>> There seems to be general agreement that loadSession() is intended to solve the problem of sending key release messages when a session was previously closed without sending them. The reasoning for persisting the original session is to allow the application to store information based on the original session ID and send it with the key release message when it is eventually sent. Open question: is there a better way to accomplish this with fewer side effects?
>>>> 
>>>> That is not an accurate statement of the intent. Persistent sessions and loadSession() were designed to address "Various use cases involving loading data from storage", which includes both the offline and the key release use case and potentially others. It avoids adding one-offs for each such use case. If we were to decide that persistent sessions/licenses no longer use this design, we'd need to rethink key release. Preferably, we can solve them together.
>>>> 
>>>> Can you clarify what you mean by "side effects”?
>>> 
>>> The idea of what is in a session and how it is handled by the CDM is still too fuzzy:
>>> * We know they MAY have key release receipts, assuming the CDM supports them. 
>>> * We know they MAY have keys
>>> * They may contain other data as well (status, policies, etc.). 
>>> * When the application asks for the session to “persist” this may or may not mean anything to the CDM
>>> * If the CDM does persist the session, it may or may not contain any keys.
>>> * When remove() is called on a session, the exact data that is removed (aside from keys) is up to the CDM
>>> 
>>> I was not opposed to the definition of a session being fuzzy when it was an ephemeral construct like a TLS session. I think that is a natural outcome of the CDM and key server being out of scope. But this fuzziness means that expectations around CDM behavior when someone calls loadSession() or remove() are unlikely to be met across all or even most CDMs.
>>> 
>>>>  
>>>> 
>>>> There does not seem to be agreement that loadSession() is well suited to solving the problem of loading persistent keys. The proposed alternative is to load persistent keys on createSession() as the CDM sees fit and provide an option to createSession() that forbids the use of persistent keys. 
>>>> 
>>>> Does the proposed alternative assume that we continue to use the "persistent" sessionType in createSession()? How are these licenses/keys later removed?
>>> 
>>> If we keep the loadSession method, then having “persistent” continues to make sense as a signal to the CDM that the application may be retaining information about this session. If not — I would eliminate it as being unclear. 
>>> 
>>> I would expect that CDMs will naturally remove keys/licenses if they expire. We could provide an explicit mechanism for clearing data (e.g. MediaKeys.clearData). I am not sure applications would use such a mechanism, but we could provide it.
>>> 
>>>> 
>>>> There are still unanswered questions around how to load sessions. "As the CDM sees fit" is antithetical to interoperability, and such underspecification may gloss over problems like initData ambiguity.
>>> 
>>> The details of key acquisition are deliberately left to be defined between the CDM and license server. This is one of the strengths of the standard and I thought was a great compromise. I am not sure what you are referring to when you say “initData ambiguity”. Would you mind clarifying? 
>>> 
>>> 
>>>> 
>>>> Joe
>>>> 
>>>> 
>>>> 
>>>> On Mon, Jul 28, 2014 at 1:57 PM, Joe Steele <steele@adobe.com> wrote:
>>>> 
>>>> On Jul 28, 2014, at 1:36 PM, Mark Watson <watsonm@netflix.com> wrote:
>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> On Mon, Jul 28, 2014 at 1:34 PM, Joe Steele <steele@adobe.com> wrote:
>>>>> 
>>>>> On Jul 28, 2014, at 9:44 AM, Mark Watson <watsonm@netflix.com> wrote:
>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On Mon, Jul 28, 2014 at 9:26 AM, Joe Steele <steele@adobe.com> wrote:
>>>>>> On Jul 24, 2014, at 3:39 PM, David Dorwin <ddorwin@google.com> wrote:
>>>>>> 
>>>>>>> On Thu, Jul 24, 2014 at 3:13 PM, David Dorwin <ddorwin@google.com> wrote:
>>>>>>> 
>>>>>>> On Thu, Jul 24, 2014 at 1:03 PM, Jerry Smith (WINDOWS) <jdsmith@microsoft.com> wrote:
>>>>>>> Using loadSession to confirm key removal seems like it would be intended to support temporary sessions, not persisted ones.  Is not loadSession limited to reloading stored session data?
>>>>>>> 
>>>>>>> It's important not to confuse "session" and "session data" with "license" or "key(s)." Since the confirmation process can fail, a key removal-based solution requires the ability to persist the confirmation ("session data"). Thus, it must be possible to later load it (i.e. after the device has powered off). Whether the license allows the key information to be persisted is orthogonal to session persistence.
>>>>>>> I don’t see an advantage in using loadSession for the persisted license case.  It requires the app to keep a record of all previously stored sessionIDs and presents a risk of sessions being orphaned and never subsequently reused or removed.  Is there a use case that suggests that stored persisted licenses should not be automatically reused?  And if there is, might it not be equally fulfilled with an attribute on createSession that disallows using a persisted license?
>>>>>>> 
>>>>>>> I think the biggest problem with reuse has been requests for persistence and reuse to be invisible to the application. I think your proposals help address that. I'm still concerned about lookup based on initData - see all the discussions about identifying duplicate initData to avoid creating duplicate sessions (in the "temporary" case) - and the ability to uniquely identify a session in the event that there might be two sessions for the same initData (the current spec prevents multiple sessions with the same ID).
>>>>>> 
>>>>>> [steele] The de-duplication of initData is a concern, but loadSession puts the burden for this onto the application which has less information about the relationship between different chunks of initData than the CDM does. 
>>>> 
>>>> Yes, the application is responsible for this, but it seems reasonable. The application is also responsible for storing cookies, etc.
>>>> 
>>>> How common is it for applications to want to persist licenses/content decryption keys?
>>>>>> 
>>>>>> 
>>>>>>> Also, it's nice to have a single solution for loading persisted sessions. I suppose the key release model could store the initData with the sessionId and use the former to retrieve the confirmation. However, you could imagine a scenario where the same movie is being played and you get the wrong session. Perhaps that can be solved by never loading the same session twice at the same time, but it's another example of the issues with using initData as an identifier.
>>>>>> 
>>>>>> [steele] I don’t believe the concept of a “persisted session” is useful. The only argument for it seems to be the "failure to release key(s)” use case. Let me restate the question I asked at the beginning of this thread — Is there any use case in which applications would NOT want these messages to be sent? I have not seen anyone say “yes" to this question yet. If the answer is "no", then the question of which key release is tied to which session is moot.
>>>> 
>>>> That may be the only argument for the key release case, but there are also arguments related to persisting keys - storing a key or license is essentially persisting part of the session. Regardless of whether the key is loaded with something like loadSession(), "persisted session" still seems useful. As previously discussed, EME is designed around sessions, so any operations should be related to sessions.
>>>>>> 
>>>>>> It's not moot if the client application needs to associate the key release message with some kind of application-specific session identifier.
>>>>> 
>>>>> [steele] Great! This is exactly the type of detail I was looking for. So the application may need to provide a session-specific identifier which is persistent like the key release message. Let’s add that information to the use case wiki as well.  So the real problem here is — the application needs to track some information parallel to the keys being acquired, which it can send along when the keys are released. Is that about right? Are there more requirements on this transaction?
>>>>> 
>>>>> That's right. I can't think of additional requirements. The sessionId was a mechanism to achieve this.
>>>> 
>>>> To further clarify, the information does not need to be in a CDM message, but the application (or server) wants to be able to associate the message from the CDM with this information. I've tried to clarify this in your new wiki text.
>>>>  
>>>> 
>>>> [steele] Ok. I added this to the wiki. Now we have to decide if this is the best way to accomplish this use case. 
>>>> 
>>>> There is a handshake involved, so remove() must only come after the receipt has been ack'd. I fixed the wiki to reflect this.
>>>> 
>>>> It would be better in my opinion to have a mechanism without as many side effects. What is the nature of the information being added? Could the application simply accumulate all of the outstanding “tokens” and add them all to the next outgoing key release message? This would require the server to have a way to match receipts to “tokens” but presumably that is completely under the application provider's control.
>>>> 
>>>> Good question, but you still need a way to *later* instantiate a MediaKeySession object at which to fire the "message" event. This is the Failed Handshake case in the key release section of the wiki.
>>>> 
>>>> Note that with loadSession(), there is a single model for loading persisted data into a MediaKeySession object and there is no need to figure out a new solution for each unique use case of persisted data from or related to a session.
>>>>  
>>>>>>  
>>>>>> Any unsent key releases should be sent as soon as feasible. This also makes the question of which sessions are persistent moot. The sessions that are persistent are those that contain unsent key release messages. The CDM can figure out which those are without the help of the application.
>>>> 
>>>> 
>>>> In general, the application should choose to persist data. We wouldn't want the client (UA/CDM) to silently store a cookie that the application didn't request.
>>>> 
>>>> In the key release case specifically, the CDM must always persist data for such sessions in order to support Failed Handshake. The application would choose to remove the persisted data if the handshake is successful.
>>>>>> 
>>>>>>> Another issue is that the same title may not always use the exact same initData, even in the same file. If we rely on using initData to look up sessions, it may not always work. This also presents problems if you want to use initData from, for example, the audio stream to find a license that was created for the initData from the video stream.
>>>>>> 
>>>>>> [steele] This is yet another reason to have sessions be an ephemeral construct so there is never any need to look them up.
>>>>>> 
>>>> 
>>>> If the session is "ephemeral", it should not change the persisted state of the client (unless some sort of "write" method is called).
>>>>  
>>> 
>> 
>> 
> 
>
Received on Wednesday, 20 August 2014 22:46:43 UTC