[Bug 21869] Need clarity on stored keys for CDMs from bugzilla@jessica.w3.org on 2013-05-23 (public-html-bugzilla@w3.org from May 2013)

From: <bugzilla@jessica.w3.org>
Date: Thu, 23 May 2013 18:17:15 +0000
To: public-html-bugzilla@w3.org
Message-ID: <bug-21869-2486-GrNEzPY8g7@http.www.w3.org/Bugs/Public/>
https://www.w3.org/Bugs/Public/show_bug.cgi?id=21869

--- Comment #4 from Joe Steele <steele@adobe.com> ---
(In reply to comment #3)
> (In reply to comment #2)
> > Retaining some keys allow for better performance and usability. Every key
> > acquisition has a cost, both to the user (representing either user time
> > spent or delay to first playback)
> 
> Isn't this best solved by letting the video start with a few seconds of
> unencrypted content even though the keys are declared up front so that
> playback can start during the key acquisition.

[steele] This would be a good solution for the initial startup delay. However,
it does not comply with the long-term business agreements some content
publishers have for how their content is protected for distribution. In
practice I have found most publishers cannot use this technique currently. Also,
this would not address the reacquisition problem after playback has started and
you are past the initial unencrypted portion. The same holds true for live
streams, where the content server has no idea that this is the first time the
user is viewing the stream.

> 
> > and to the server operator (key retrieval
> > on the back end).
> 
> Considering how network chatty Web apps are in general fetching images,
> doing XHR, etc., it seems weird to me to try to optimize away key
> reacquisition. One would expect the messages to be small in terms of the
> number of bytes, the crypto operations no heavier than a TLS handshake and
> key lookups not more advanced than database lookups that many sites do all
> the time.
> 
> Is there something intuition-defying about the cost of talking to a license
> server that would justify the avoidance of key reacquisition? I had expected
> key reacquisition would happen even more often than it really has to in
> order to implement heartbeat.

[steele] Remember there are two costs here: the cost on the client side and the
cost on the license server side.

On the client side, the cost can be very high for platforms which use
obfuscated software to do the cryptographic operations: on the order of
seconds, which is immediately visible to the end user as lag. This is a bad
user experience. It also costs power/battery life, as the client potentially
has to do a lot of crypto operations, which hits mobile devices especially
hard.

On the license server side, the cost of a TLS handshake is a good estimate,
although that may be low for some protocols. That cost seems small, but
consider the "license storm" problem. A sporting event is about to start. 100M
people fire up their browsers and proceed to the website to watch the event.

If no keys are cached, all 100M send off a key request within a small window
(say 15 minutes). To avoid server overload, the key server operator has a
couple of choices. They can provision a large number of servers all the time
(lots of expense) or use elastic scaling to ramp up as the load increases. This
is a balancing act for them, as even with a large number of servers they may
guess wrong on the peak traffic and experience cascading failures. Elastic
scaling may not be fast enough to match the peak either, so users still
experience failures.

Now consider the case where keys can be cached. The application can cache a
"subscription key" days ahead of time, which is used to open the keys for the
actual event. The event streams can be encrypted with unique keys which are
decryptable with the "subscription key" and embedded in the content stream. In
this case some percentage of the 100M (potentially a large percentage) already
have the keys they need. There are far fewer license requests at the time of
the event and the storm is much smaller.
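
As a rough back-of-the-envelope illustration of the two cases above, the
average request rate can be estimated as follows. The 100M viewers and the
15 minute window come from the example; the 80% cached fraction is purely an
assumed number for illustration, and the function name is hypothetical. Real
traffic would be burstier than this average.

```python
def average_request_rate(viewers, window_seconds, cached_fraction=0.0):
    """Average license requests per second over the window.

    cached_fraction is the share of viewers who already hold the keys
    they need and therefore send no request at event time.
    """
    if not 0.0 <= cached_fraction <= 1.0:
        raise ValueError("cached_fraction must be between 0 and 1")
    return viewers * (1.0 - cached_fraction) / window_seconds


VIEWERS = 100_000_000        # from the example above
WINDOW_SECONDS = 15 * 60     # "say 15min"
ASSUMED_CACHED = 0.8         # assumption, not a measured figure

no_cache = average_request_rate(VIEWERS, WINDOW_SECONDS)
with_cache = average_request_rate(VIEWERS, WINDOW_SECONDS, ASSUMED_CACHED)

print(f"no cached keys:  about {no_cache:,.0f} requests/s")
print(f"80% cached keys: about {with_cache:,.0f} requests/s")
```

Under these assumptions the average drops from roughly 111,000 requests per
second to roughly 22,000.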

> 
> > Some keys may be required again and again - for example if a group of
> > devices is associated with the same account, a common key can be used to
> > request content licenses for those devices. 
> 
> This seems to imply that DRM-level domains would be involved. Since the
> design of EME handles login using regular session cookies, it seems to me
> that the it makes no sense to for EME CDMs to have the concept of domains,
> since domains can be implemented entirely on the server side if desired.

[steele] EME does not need to have an explicit concept of domains. I was using
domains as an example of an intermediate key. In DRM systems which support a
hierarchy of keys and the keys are acquired in stages (for example from
different servers) it is useful to be able to retain the intermediate keys
between sessions to avoid the costs I outlined above where possible.
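
To make the staged acquisition concrete, here is a toy sketch. It does no real
cryptography and does not model any actual DRM system or the EME API; the
class, the function and the key strings are all hypothetical. It only counts
how many times the intermediate key has to be fetched when it is or is not
retained between sessions.

```python
class KeyClient:
    """Toy client that counts simulated round trips (no real crypto)."""

    def __init__(self, store=None):
        # `store` stands in for storage that survives across sessions.
        self.store = {} if store is None else store
        self.intermediate_requests = 0
        self.content_requests = 0

    def play(self, account, content_id):
        # Stage 1: fetch the intermediate key only if it is not retained.
        if account not in self.store:
            self.intermediate_requests += 1
            self.store[account] = f"intermediate:{account}"
        # Stage 2: the content key is always requested in this sketch.
        self.content_requests += 1
        return self.store[account], f"content:{content_id}"


def total_intermediate_requests(sessions, persistent):
    """Run one client per session, optionally sharing a persistent store."""
    shared = {} if persistent else None
    total = 0
    for account, content_id in sessions:
        client = KeyClient(store=shared)
        client.play(account, content_id)
        total += client.intermediate_requests
    return total


sessions = [("alice", "movie-1"), ("alice", "movie-2"), ("alice", "movie-3")]
without_retention = total_intermediate_requests(sessions, persistent=False)  # 3
with_retention = total_intermediate_requests(sessions, persistent=True)      # 1
```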

> 
> > Some keys may only be required for a particular piece of content, but you
> > don't want to have to pay the cost of acquisition again just because you
> > have put your machine to sleep briefly. 
> 
> Why is this unwanted? Web apps like Gmail ping a server all the time. OTOH,
> Netflix's player times out when paused even if the computer is awake.

[steele] As I mentioned above - this can introduce a delay of seconds or more
on the client end. 

> 
> > And in some cases it may not be convenient to reacquire the key, for example
> > if the key can only be acquired in a private environment but the content is
> > available to key holders in public environments.
> 
> I have trouble seeing what sort of movie streaming service would work like
> this.

[steele] Imagine a developer for a movie streaming service trying to debug
their player. They will need to test under various network conditions (say at
your local Starbucks). However they do not want to expose their pre-production
key servers to the open Internet. As a developer of these players - I run into
this issue on a daily basis. Or to give a different example, what about a
corporate video server streaming confidential content (e.g. a company meeting).
The key acquisition phase might have to be completed within the corporate
firewall but the playback could continue outside the firewall since the content
itself is protected.

Having said that -- that is not the main problem I am hoping to address. It is
the performance and usability issues I raised above. 

> 
> > Also there may be metadata about the keys themselves which needs to be
> > retained, for example how many times this piece of content has been played,
> > when the first playback started, etc. This can be maintained on the server,
> > but there can be a cost benefit to users and content providers to have this
> > local.
> 
> Seems like the client side could be simpler by handling this by the keys
> expiring often and the connection to the license server being chatty with
> re-requesting keys all the time as a form of heartbeat. If the server side
> doesn't like the complexity, maybe the server side should relax tracking
> requirements. 

[steele] See my above comments about why this could take a lot of client and
server side time. 

> 
> Considering privacy, it would be the best that the CDM didn't store anything
> persisently and, therefore, didn't create a new class of cookie-like data. I
> think adding a class of cookie-like data in order to optimize round trips to
> the key server is a bad tradeoff.
> 
> I would prefer EME banning CDMs from writing anything to persistent storage
> as a result of talking with a key server of a content service (in order to
> avoid the creation of a new class of cookie-like data). This formulation
> would still allow downloading an IBX from the DRM vendor (as opposed to a
> key server of a content site) and storing it as part of CDM setup.

[steele] If the main concern here is the adding of a separate class of
cookie-like data (as opposed to storing any data), I would say this is not a
firm requirement. However, a separate class has a clear benefit for the user,
because it makes it less likely they will shoot themselves in the foot by
clearing this data inadvertently.

I don't see the privacy concern here if this data is handled like web
application data is today, segregated by domain and subject to CORS
restrictions. Please articulate why you think this has privacy implications. 

I disagree on the trade-off. If the performance of players using EME is
necessarily less than that of existing plugin-based or app-based solutions,
this will be a roadblock to adoption and folks will continue to use the old
solutions. 

I think adding restrictions here (beyond those that normal web applications are
subject to) would only result in a worse user experience and no additional
privacy.

Received on Thursday, 23 May 2013 18:17:17 UTC