Re: [SUMMARY] Do we need loadSession? from Mark Watson on 2014-08-19 (public-html-media@w3.org from August 2014)

From: Mark Watson <watsonm@netflix.com>
Date: Tue, 19 Aug 2014 16:58:24 -0700
To: Joe Steele <steele@adobe.com>
Cc: David Dorwin <ddorwin@google.com>, "<public-html-media@w3.org>" <public-html-media@w3.org>, "Jerry Smith (WINDOWS)" <jdsmith@microsoft.com>
Message-ID: <CAEnTvdBPos9ss1YtrnFmrnR=ga7VC56M33K5Yg9kb6TrteJjcA@mail.gmail.com>
On Tue, Aug 19, 2014 at 3:26 PM, Joe Steele <steele@adobe.com> wrote:

> Inline —
>
> Joe
>
> On Aug 18, 2014, at 6:50 PM, David Dorwin <ddorwin@google.com> wrote:
>
> Thanks, Joe. This is helpful.
>
> In all of these cases, it seems the application is involved in making this
> decision. In #1, the user (via the app) "pins" the content. In #2 (not
> combined with #3), the app may proactively acquire licenses on the user's
> behalf. In #3, the app could proactively request a license containing all
> the keys.
>
>
> For #1 you are correct, this is typically something driven by the
> application, since only the user knows when an offline period is imminent.
> The exception is when apps do this all the time.
>
> For #2 and #3 the application is typically not involved in the decision at
> all. The content publisher knows the characteristics of the content and the
> expected server load, and makes the decision about how to issue keys for
> this content before the application is ever launched.
>
>
> From an EME POV, #3 is probably most naturally accomplished by requesting
> separate licenses for each title, but I can see how that might be
> counterproductive to the goal of reducing license server load. Creating a
> session with initData containing key IDs (as instructed by the page/server)
> for each title might reduce server load (in terms of raw number of
> requests). A similar effect could be achieved on the license server, but
> that seems like the wrong place, especially if the service wants to
> optimize across clients.
>
> If it is reasonable for the application to request (or at least be aware
> that it is requesting) persisted licenses, the questions are:
>
>    1. How should the application make the request?
>
> I think a signal in the createSession() call that tells the CDM that the
> application will allow (or not allow) persistent licenses would work well.
> That would allow for the license server to provide persistent licenses and
> the CDM to make the call as to whether to persist them or not based on
> license server policy and input from the application.
>
>
>    2. How should the application later retrieve (and eventually remove)
>    such licenses?
>
> I think the natural way for applications to use licenses (not sure what
> you mean by retrieve) is for the application to provide the metadata for
> the content it is trying to play and the CDM to make the determination of
> whether it has the licenses on hand or needs to request new ones. It would
> be good to also allow the application to constrain the CDM via
> createSession params to only using licenses on hand or only requesting new
> licenses.
>
> I think the natural way for applications to remove licenses in use during
> a session is to make a removeLicenses() call. This should have the effect
> of removing any licenses (persistent or otherwise) used for that content
> stream. In order to handle the case where the licenses were not properly
> removed (e.g. when a session was ended abruptly), the application can
> simply call createSession with a signal to the CDM not to request new
> licenses. Then call removeLicenses() on the new session - which should
> remove any licenses that were still valid.
>

I think I have seen multiple incompatible alternatives described as
"natural" in this debate. I'm not sure it's a very strong argument, as
what's probably meant is "natural consequence of a bunch of unstated
assumptions / expectations / objectives / implementation constraints". Or
at least, what's "natural" to one person may be "unnatural" to another.

On this question, like pretty much anything in this API, we are plotting a
course between two extremes: on the one hand we could have had an entirely
context-free, semantics-free, CDM-specific message passing API, rather like
what was done in the Hbb specs and similar. You'd need a bunch of
CDM-specific application code to do anything and no two CDMs would work the
same way. On the other hand, we could completely define CDM functionality
and behaviour, rigorously define what is in a license and the functions it
drives and only leave out the actual technology used to secure the protocol
and encode the messages.

I'd suggest that the only reason for not following the latter approach is a
desire to retain support for this effort from multiple UA / DRM vendors.
That is, to accommodate existing DRMs without significant modification.

The specific question at hand is the extent to which we explicitly model
the handling of persisted information - both licenses themselves and the
evidence of key removal: should that aspect of CDM functionality be opaque
to the application, happening automatically as a result of CDM interaction
with its server peer, or should it be explicitly visible to and controlled
by the application (of course with the CDM policing things)?

[Note that we may have a disconnect as to what we mean by "application",
since I was completely thrown by a reference to things happening 'before
the application is even launched'. I don't see how that makes sense at all.
I consider the application to be launched when you visit netflix.com -
before this there is no CDM loaded at all.]

It seems to me that the application should be able to track or easily
determine what content it has / had keys for and what it does not. For
example, the application may want to pre-fetch keys for content items it
identifies as likely to be played, but that is not necessary if the keys
are already available. It would be nice to know this without async
requests, but if requests are required then they should return an answer
quickly in both cases. That suggests we should have a deterministic way to
discover whether or not a key exchange is necessary for a given piece of
content. But both models provide that: loadSession() will fail if the
session is not in fact still available and a new session created with
createSession() will either immediately indicate usable key ids, fire a
keymessage or fire a keyschange.
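
As a rough illustration of that point, here is a toy model in TypeScript.
Everything in it (ToyCdm, persist, the Outcome type, the synchronous calls)
is made up for the example and is not the EME API; the real calls are
asynchronous and event-driven. It only shows that, under either model, the
application gets a definite answer about whether a key exchange is needed:

```typescript
// Toy model only: these names are hypothetical and are not part of EME.
type Outcome = "keys-usable" | "key-exchange-needed";

class ToyCdm {
  private bySessionId = new Map<string, string[]>(); // sessionId -> key ids
  private byInitData = new Map<string, string[]>(); // initData -> key ids

  // Pretend an earlier key exchange completed and its keys were persisted.
  persist(sessionId: string, initData: string, keyIds: string[]): void {
    this.bySessionId.set(sessionId, keyIds);
    this.byInitData.set(initData, keyIds);
  }

  // loadSession model: fails if the session is no longer available.
  loadSession(sessionId: string): string[] {
    const keyIds = this.bySessionId.get(sessionId);
    if (keyIds === undefined) {
      throw new Error("session not available");
    }
    return keyIds;
  }

  // createSession model: returns the usable key ids, or an empty list as a
  // stand-in for the CDM firing a keymessage to start a key exchange.
  createSession(initData: string): string[] {
    const keyIds = this.byInitData.get(initData);
    return keyIds === undefined ? [] : keyIds;
  }
}

// Ask "do I need a key exchange?" in the loadSession model.
function checkViaLoadSession(cdm: ToyCdm, sessionId: string): Outcome {
  try {
    cdm.loadSession(sessionId);
    return "keys-usable";
  } catch (e) {
    return "key-exchange-needed";
  }
}

// Ask the same question in the createSession model.
function checkViaCreateSession(cdm: ToyCdm, initData: string): Outcome {
  return cdm.createSession(initData).length > 0
    ? "keys-usable"
    : "key-exchange-needed";
}
```

The difference between the two is what the application has to remember:
a sessionId in the first case, the initData in the second.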

So, here is an off-the-wall idea for how to encapsulate the different
existing DRM approaches behind a common API. This idea may not work, but
then it may inspire others: Suppose we remove loadSession() and sessionID
and instead optionally allow the MediaKeySession object to support
structured clone. These objects could then be stored in IndexedDB for
persistence. In implementations that support explicit management of
sessions by sessionId, the data backing the stored MediaKeySession would
basically be the sessionId and when you retrieved the object from IndexedDB
the UA would interact with the CDM in the same way that loadSession() does.
On an implementation which did not support explicit management of sessions
on the CDM interface, the stored MediaKeySession would be backed by the
initData and retrieving the object from IndexedDB would amount to
interacting with the CDM in the same way that createSession() does.
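
A rough sketch of that idea in TypeScript, for illustration only. The names
here (StoredSession, CdmBridge, toStored, roundTrip, restore) are all
hypothetical and appear in neither EME nor IndexedDB, and a JSON round trip
stands in for structured clone. The point is just that the page stores and
retrieves an opaque object, and the UA picks the CDM interaction based on
what backs it:

```typescript
// Hypothetical sketch: none of these names are part of EME or IndexedDB.
type StoredSession =
  | { kind: "sessionId"; sessionId: string } // UA manages sessions by id
  | { kind: "initData"; initData: string }; // UA has no explicit session ids

// Stand-in for the UA-to-CDM interface; each call returns a log line.
interface CdmBridge {
  loadSession(sessionId: string): string;
  createSession(initData: string): string;
}

// What the UA might keep behind a MediaKeySession when the page stores it.
function toStored(
  supportsSessionIds: boolean,
  sessionId: string,
  initData: string
): StoredSession {
  if (supportsSessionIds) {
    return { kind: "sessionId", sessionId: sessionId };
  }
  return { kind: "initData", initData: initData };
}

// A JSON round trip stands in for structured clone into and out of storage.
function roundTrip(stored: StoredSession): StoredSession {
  return JSON.parse(JSON.stringify(stored)) as StoredSession;
}

// What the UA might do when the page retrieves the object again.
function restore(stored: StoredSession, cdm: CdmBridge): string {
  if (stored.kind === "sessionId") {
    return cdm.loadSession(stored.sessionId);
  }
  return cdm.createSession(stored.initData);
}

// A CDM stub that only reports which path was taken.
const loggingCdm: CdmBridge = {
  loadSession: (id) => "loadSession(" + id + ")",
  createSession: (data) => "createSession(" + data + ")",
};
```

With this stub, restoring an object stored by the first kind of
implementation goes down the loadSession() path and one stored by the second
kind goes down the createSession() path, without the page knowing which.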

I'm short of a proposal as to how the second kind of implementation would
handle secure proof of key release.

Comments ?

...Mark






>
> If the above was implemented - I believe there would be no need for
> loadSession() nor for any concept of “persistent sessions”. Applications
> could track their outstanding licenses by retaining the initData used to
> request them along with any other data desired.
>
>
> EME currently has sessionType and loadSession(sessionId), which have
> already been used to implement #1 and appear to address Mark's key release
> model. #2 and #3 might be more naturally implemented using some type of key
> repository, but I don't believe that is consistent with the EME design and
> the other application models it supports.
>
> David
>
> On Mon, Aug 18, 2014 at 3:00 PM, Joe Steele <steele@adobe.com> wrote:
>
>> In the last telco I was asked to provide some use cases for persistent
>> keys. My earlier comments were entangled with other issues, so I am
>> restating here rather than linking to them.
>>
>> Here are the most common use cases for persistent licenses that we run
>> into:
>>
>> 1) Offline playback
>> Customer wants to download a movie and acquire licenses for playback
>> offline, typically while traveling. In this case a license is issued for
>> that user on that particular device with a limited lifetime. Possibly with
>> a limit on the number of active licenses as well.
>>
>> 2) Accelerated playback of live streams
>> Customer wants to be able to play back a live stream (e.g. a sporting
>> event) without having to acquire a license right then. In this case the user
>> periodically authenticates to acquire a limited lifetime license (e.g. 24
>> hours) which allows them to play back content anytime they come back to that
>> page until the license expires.
>>
>> 3) Reduced license server load
>> Content publisher wants to lower their cost of deployment for license
>> servers. They can do this by issuing a bundle of licenses with limited
>> lifetimes for content which the user _might_ play. This results in fewer
>> trips to the license server, since when a title is selected the user may
>> already have the required license available.
>>
>> I will point out that very often 2 and 3 are related. For example during
>> popular sporting events, a large number of people will try to get a license
>> to view that event moments before it starts. Not only can that make
>> acquiring the license take longer for each individual customer, the
>> “license storm” of simultaneous requests can take down the server.
>>
>> Joe
>>
>> On Aug 4, 2014, at 10:37 PM, Joe Steele <steele@adobe.com> wrote:
>>
>> I would like feedback from the rest of the group on whether they are or
>> will be loading persistent keys using loadSession the way the spec
>> describes.
>>
>> more replies inline —
>>
>> Joe
>>
>> On Jul 31, 2014, at 1:10 PM, David Dorwin <ddorwin@google.com> wrote:
>>
>> The current design is based on many discussions and consideration of a
>> variety of use cases that involve persisting data related to sessions. It
>> should provide a single solution for all such use cases. We were aware that
>> the license policy determines whether keys may be persisted and that DRM
>> systems have traditionally relied on this alone. Such systems are supported
>> by the current normative algorithms.
>>
>>
>> The group can work on alternative interoperable solutions that address
>> these use cases, but they need to meet the same requirements.
>>
>>
>> Let’s spell out those requirements. Given that a new one just came out
>> (i.e. that applications need to be able to associate local information to
>> send along with the key release receipts), I think there is a good chance
>> there is still learning to be done here.
>>
>> As examples of persistent key requirements not being met —
>> * Allowing the application to indicate that the CDM should NOT use
>> existing content keys and rather acquire new ones
>> * Allowing the application to indicate that the CDM should use ONLY
>> existing content keys and not acquire new ones
>> * Allowing the application to indicate that the CDM may use existing
>> content keys or acquire new ones as needed (this could be construed as the
>> default now).
>> * Allowing the application to request a “blanket key release” that indicates
>> the CDM does not have any content keys in its possession
>>
>> I think it’s premature to assume that loadSession() is the wrong
>> solution, especially since there is no complete proposal that addresses the
>> main (key persistence) use case in a well-defined interoperable way.
>>
>>
>> I don’t assume it is the wrong solution. But it does not seem to be a
>> good fit for the way my CDM persists keys and my customers use cached keys.
>>
>> It seems like you are saying that allowing the CDM to choose when to
>> persist keys and load persisted keys in createSession would be a problem
>> for interoperability. I believe that it is easier for applications to not
>> always have to track what keys the CDM may already have access to. Some
>> applications want that level of control (e.g. for key release) but some do
>> not. I do not see this as an interop problem. Can you explain what you see
>> as a problem?
>>
>>
>> Specific comments inline.
>>
>> On Mon, Jul 28, 2014 at 2:31 PM, Joe Steele <steele@adobe.com> wrote:
>>
>>> I would like to summarize this thread so far.
>>>
>>> There seems to be general agreement that loadSession() is intended to
>>> solve the problem of sending key release messages when a session was
>>> previously closed without sending them. The reasoning for persisting the
>>> original session is to allow the application to store information based on
>>> the original session ID and send it with the key release message when it is
>>> eventually sent. Open question — *Is there a better way to accomplish
>>> this with fewer side effects?*
>>>
>>
>> That is not an accurate statement of the intent. Persistent sessions and
>> loadSession() were designed to address "Various use cases involving
>> loading data from storage"
>> <https://www.w3.org/Bugs/Public/show_bug.cgi?id=23955>, which includes
>> both the offline and the key release use case and potentially others. It
>> avoids adding one-offs for each such use case. If we were to decide that
>> persistent sessions/licenses no longer use this design, we'd need to
>> rethink key release. Preferably, we can solve them together.
>>
>>
>> Can you clarify what you mean by “side effects”?
>>
>>
>> The idea of what is in a session and how it is handled by the CDM is
>> still too fuzzy:
>> * We know they MAY have key release receipts, assuming the CDM supports
>> them.
>> * We know they MAY have keys
>> * They may contain other data as well (status, policies, etc.).
>> * When the application asks for the session to “persist” this may or may
>> not mean anything to the CDM
>> * If CDM does persist the session, it may or may not contain any keys.
>> * When remove() is called on a session, the exact data that is removed
>> (aside from keys) is up to the CDM
>>
>> I was not opposed to the definition of a session being fuzzy when it was
>> an ephemeral construct like a TLS session. I think that is a natural
>> outcome of the CDM and key server being out of scope. But this fuzziness
>> means that expectations around CDM behavior when someone calls
>> loadSession() or remove() are unlikely to be met across all or even most
>> CDMs.
>>
>>
>>
>>>
>>> There does not seem to be agreement that loadSession() is well suited to
>>> solving the problem of loading persistent keys. The proposed alternative is
>>> to load persistent keys on createSession() as the CDM sees fit and provide
>>> an option to createSession() that forbids the use of persistent keys.
>>>
>>
>> Does the proposed alternative assume that we continue to use the
>> "persistent" sessionType in createSession()? How are these licenses/keys
>> later removed?
>>
>>
>> If we keep the loadSession method, then having “persistent” continues to
>> make sense as a signal to the CDM that the application may be retaining
>> information about this session. If not — I would eliminate it as being
>> unclear.
>>
>> I would expect that CDMs will naturally remove keys/licenses if they
>> expire. We could provide an explicit mechanism for clearing data (e.g.
>> MediaKeys.clearData). I am not sure applications would use such a
>> mechanism, but we could provide it.
>>
>>
>> There are still unanswered questions around how to load sessions. "As the
>> CDM sees fit" is antithetical to interoperability, and such
>> underspecification may gloss over problems like initData ambiguity.
>>
>>
>> The details of key acquisition are deliberately left to be defined
>> between the CDM and license server. This is one of the strengths of the
>> standard and I thought was a great compromise. I am not sure what you are
>> referring to when you say “initData ambiguity”. Would you mind clarifying?
>>
>>
>>
>>> Joe
>>>
>>
>>
>>
>> On Mon, Jul 28, 2014 at 1:57 PM, Joe Steele <steele@adobe.com> wrote:
>>
>>>
>>> On Jul 28, 2014, at 1:36 PM, Mark Watson <watsonm@netflix.com> wrote:
>>>
>>>
>>>
>>>
>>> On Mon, Jul 28, 2014 at 1:34 PM, Joe Steele <steele@adobe.com> wrote:
>>>
>>>>
>>>> On Jul 28, 2014, at 9:44 AM, Mark Watson <watsonm@netflix.com> wrote:
>>>>
>>>>
>>>>
>>>>
>>>> On Mon, Jul 28, 2014 at 9:26 AM, Joe Steele <steele@adobe.com> wrote:
>>>>
>>>>> On Jul 24, 2014, at 3:39 PM, David Dorwin <ddorwin@google.com> wrote:
>>>>>
>>>>> On Thu, Jul 24, 2014 at 3:13 PM, David Dorwin <ddorwin@google.com>
>>>>>  wrote:
>>>>>
>>>>>>
>>>>>> On Thu, Jul 24, 2014 at 1:03 PM, Jerry Smith (WINDOWS) <
>>>>>> jdsmith@microsoft.com> wrote:
>>>>>>
>>>>>>> Using loadSession to confirm key removal seems like it would be
>>>>>>> intended to support temporary sessions, not persisted ones.  Is not
>>>>>>> loadSession limited to reloading stored session data?
>>>>>>>
>>>>>> It's important not to confuse "session" and "session data" with
>>>>>> "license" or "key(s)." Since the confirmation process can fail, a key
>>>>>> removal-based solution requires the ability to persist the confirmation
>>>>>> ("session data"). Thus, it must be possible to later load it (i.e. after
>>>>>> the device has powered off). Whether the license allows the key information
>>>>>> to be persisted is orthogonal to session persistence.
>>>>>>
>>>>>>> I don’t see advantage for using loadSession for the persisted
>>>>>>> license case.  It requires the app to keep a record of all previously
>>>>>>> stored sessionIDs and presents a risk of sessions being orphaned and never
>>>>>>> subsequently reused or removed.  Is there a use case that suggests that
>>>>>>> stored persisted licenses should not be automatically reused?  And if there
>>>>>>> is, might it not be equally fulfilled with an attribute on createSession
>>>>>>> that disallows using a persisted license?
>>>>>>>
>>>>>> I think the biggest problem with reuse has been requests for
>>>>>> persistence and reuse to be invisible to the application. I think your
>>>>>> proposals help address that. I'm still concerned about lookup based on
>>>>>> initData - see all the discussions about identifying duplicate initData to
>>>>>> avoid creating duplicate sessions (in the "temporary" case) - and the
>>>>>> ability to uniquely identify a session in the event that there might be two
>>>>>> sessions for the same initData (the current spec prevents multiple sessions
>>>>>> with the same ID).
>>>>>>
>>>>>
>>>>> [steele] The de-duplication of initData is a concern, but loadSession
>>>>> puts the burden for this onto the application which has less information
>>>>> about the relationship between different chunks of initData than the CDM
>>>>> does.
>>>>>
>>>> Yes, the application is responsible for this, but it seems reasonable.
>> The application is also responsible for storing cookies, etc.
>>
>> How common is it for applications to want to persist licenses/*content
>> decryption* keys?
>>
>>>
>>>>>
>>>>> Also, it's nice to have a single solution for loading persisted
>>>>> sessions. I suppose the key release model could store the initData with the
>>>>> sessionId and use the former to retrieve the confirmation. However, you
>>>>> could imagine a scenario where the same movie is being played and you get
>>>>> the wrong session. Perhaps that can be solved by never loading the same
>>>>> session twice at the same time, but it's another example of the issues with
>>>>> using initData as an identifier.
>>>>>
>>>>>
>>>>> [steele] I don’t believe the concept of a “persisted session” is
>>>>> useful. The only argument for it seems to be the “failure to release
>>>>> key(s)” use case. Let me restate the question I asked at the beginning of
>>>>> this thread — *Is there any use case in which applications would NOT
>>>>> want these messages to be sent?* I have not seen anyone say “yes” to
>>>>> this question yet. If the answer is "no", then the question of which key
>>>>> release is tied to which session is moot.
>>>>>
>>>> That may be the only argument for the key release case, but there are
>> also arguments related to persisting keys - storing a key or license is
>> essentially persisting part of the session. Regardless of whether the key
>> is loaded with something like loadSession(), "persisted session" still
>> seems useful. As previously discussed, EME is designed around sessions, so
>> any operations should be related to sessions.
>>
>>>
>>>> It's not moot if the client application needs to associate the key
>>>> release message with some kind of application-specific session identifier.
>>>>
>>>>
>>>> [steele] Great! This is exactly the type of detail I was looking for.
>>>> So the application may need to provide a session-specific identifier which
>>>> is persistent like the key release message. Let’s add that information to
>>>> the use case wiki as well.  So the real problem here is — the application
>>>> needs to track some information parallel to the keys being acquired, which
>>>> it can send along when the keys are released. Is that about right? Are
>>>> there more requirements on this transaction?
>>>>
>>>
>>> That's right. I can't think of additional requirements. The sessionId
>>> was a mechanism to achieve this.
>>>
>>> To further clarify, the information does not need to be in a CDM
>> message, but the application (or server) wants to be able to associate the
>> message from the CDM with this information. I've tried to clarify this in
>> your new wiki text.
>>
>>
>>>
>>> [steele] Ok. I added this to the wiki. Now we have to decide if this is
>>> the best way to accomplish this use case.
>>>
>>
>> There is a handshake involved, so remove() must only come after the
>> receipt has been ack'd. I fixed the wiki to reflect this.
>>
>>>
>>> It would be better in my opinion to have a mechanism without as many
>>> side effects. What is the nature of the information being added? Could the
>>> application simply accumulate all of the outstanding “tokens” and add them
>>> all to the next outgoing key release message? This would require the server
>>> to have a way to match receipts to “tokens” but presumably that is
>>> completely under the application provider's control.
>>>
>>
>> Good question, but you still need a way to *later* instantiate a
>> MediaKeySession object at which to fire the "message" event. This is the Failed
>> Handshake
>> <https://www.w3.org/wiki/HTML/Media_Task_Force/EME_Use_Cases#Failed_Handshake>
>> case in the key release section of the wiki.
>>
>> Note that with loadSession(), there is a single model for loading
>> persisted data into a MediaKeySession object and there is no need to figure
>> out a new solution for each unique use case of persisted data from or
>> related to a session.
>>
>>
>>>
>>>>
>>>>> Any unsent key releases should be sent as soon as feasible. This also
>>>>> makes the question of which sessions are persistent moot. The sessions that
>>>>> are persistent are those that contain unsent key release messages. The CDM
>>>>> can figure out which those are without the help of the application.
>>>>>
>>>>
>> In general, the application should choose to persist data. We wouldn't
>> want the client (UA/CDM) to silently store a cookie that the application
>> didn't request.
>>
>> In the key release case specifically, the CDM must always persist data
>> for such sessions in order to support Failed Handshake. The application
>> would choose to remove the persisted data if the handshake is successful.
>>
>>>
>>>>> Another issue is that the same title may not always use the exact same
>>>>> initData, even in the same file. If we rely on using initData to look up
>>>>> sessions, it may not always work. This also presents problems if you want
>>>>> to use initData from, for example, the audio stream to find a license that
>>>>> was created for the initData from the video stream.
>>>>>
>>>>>
>>>>> [steele] This is yet another reason to have sessions be an ephemeral
>>>>> construct so there is never any need to look them up.
>>>>>
>>>>> If the session is "ephemeral", it should not change the persisted
>> state of the client (unless some sort of "write" method is called).
>>
>>
>>
>>
>>
>
>
Received on Tuesday, 19 August 2014 23:58:53 UTC