[Bug 20965] EME results in a loss of control over security and privacy. from bugzilla@jessica.w3.org on 2013-02-25 (public-html-bugzilla@w3.org from February 2013)

From: <bugzilla@jessica.w3.org>
Date: Mon, 25 Feb 2013 10:54:24 +0000
To: public-html-bugzilla@w3.org
Message-ID: <bug-20965-2486-mdDF1Oe1Ly@http.www.w3.org/Bugs/Public/>
https://www.w3.org/Bugs/Public/show_bug.cgi?id=20965

--- Comment #22 from Henri Sivonen <hsivonen@iki.fi> ---
(In reply to comment #14)
> Might it have been for revoking keys?

Surely the revocation of content keys will be based on communicating expiry
time together with the keys so that the CDM will consider them revoked when
they expire and the revocation of CDM keys will happen on the server-side so
that the server will refuse to send content keys over to a CDM whose keys
appear on a CRL.
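
As a rough illustration of that split (a sketch only; the names ContentKey,
cdm_may_use and server_issue_key are hypothetical and do not come from EME or
from any real key system), the CDM would only need to compare the current time
with the expiry time, and the CRL check would live entirely on the server:

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass(frozen=True)
class ContentKey:
    key_id: str
    expires_at: float  # Unix time after which the CDM treats the key as revoked


def cdm_may_use(key: ContentKey, now: float) -> bool:
    """CDM side: an expired content key is considered revoked."""
    return now < key.expires_at


def server_issue_key(cdm_key_id: str, crl: Set[str], key_id: str,
                     now: float, lifetime: float) -> Optional[ContentKey]:
    """Server side: refuse to send content keys to a CDM whose key is on the CRL."""
    if cdm_key_id in crl:
        return None
    return ContentKey(key_id=key_id, expires_at=now + lifetime)
```

In this sketch nothing on the client has to receive a revocation message: a
content key simply stops being usable once its expiry time has passed.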

> Some of the uses cases do appear to require the CDM to have privileged
> storage not accessible or modifiable by the UA or user.

What use cases would not be addressed if the storage was browser-mediated and
the CDM encrypted the data that it asks the browser to store in the context of
a given origin? (That is, the browser would store encrypted blobs keyed by
origin and could detect that some data has been stored for a given origin and
the browser could delete that data.)
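
A minimal sketch of such browser-mediated storage, assuming the CDM hands the
browser already-encrypted blobs (the class and method names are hypothetical
and are not part of EME):

```python
from typing import Dict, Optional


class OriginBlobStore:
    """Browser-side store holding opaque, CDM-encrypted blobs per origin."""

    def __init__(self) -> None:
        self._blobs: Dict[str, Dict[str, bytes]] = {}

    def put(self, origin: str, name: str, blob: bytes) -> None:
        # The browser never interprets the blob; it only files it by origin.
        self._blobs.setdefault(origin, {})[name] = blob

    def get(self, origin: str, name: str) -> Optional[bytes]:
        return self._blobs.get(origin, {}).get(name)

    def has_data(self, origin: str) -> bool:
        # Lets the browser show the user that an origin has stored CDM data.
        return bool(self._blobs.get(origin))

    def clear_origin(self, origin: str) -> None:
        # Lets the browser wipe CDM data together with cookies and cache.
        self._blobs.pop(origin, None)
```

The browser cannot read the blobs, but it can still see that they exist for an
origin and can delete them.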

(In reply to comment #15)
> I see four separate privacy/tracking issues with identifiers:
> 1) The initial message (or part of it) for a dummy file may effectively form
> an identifier that any site* could use for tracking over time
> 2) The initial message (or part of it) for a dummy file may effectively form
> an identifier that any site* could use for tracking across sites, if those
> sites collaborate
> 3) An identifier available to the server side of the keysystem may be used
> for tracking over time by a single site
> 4) An identifier available to the server side of the keysystem may be used
> for tracking across sites, if those sites collaborate
> 
> * including sites which do not support the server side of any keysystem
> 
> Tracking across sites (2, 4) can be addressed if the identifier is
> origin-specific i.e. if netflix.com sees a different identifier to hulu.com.

Yes. EME should say this.
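
One conceivable way to make an identifier origin-specific (an assumption made
for illustration, not something EME or any particular CDM specifies) is to
derive it from a device secret and the origin with a keyed hash. The function
name and the device_secret parameter below are hypothetical:

```python
import hashlib
import hmac


def origin_identifier(device_secret: bytes, origin: str) -> str:
    """Derive a per-origin identifier; different origins get unrelated values."""
    return hmac.new(device_secret, origin.encode("utf-8"),
                    hashlib.sha256).hexdigest()
```

Two collaborating sites cannot match up the values they each see without
knowing device_secret, and replacing device_secret changes every derived
identifier at once.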

> Tracking by arbitrary sites (1, 2) can be addressed if the initial message
> is not consistent. For example if it is encrypted with a keysystem public
> key, and contains information which changes every time a message is
> generated (salt, nonce, timestamp) etc.

Yes, the case where the tracking server doesn't implement enough of the key
system to locate a public key advertised by the CDM or to try verifying a
signature generated with a private key is easy to address by salting.

> That leaves (3), where the considerations are very different depending on
> whether the UA can cause the identifier to be reset. If it can, then the
> situation is hardly different from cookies today.

Right.

> I also don't think we should prescribe what UAs should do (W3C
> specifications don't generally mandate specific privacy dialogs etc.).

The most relevant recent precedent is:
www.w3.org/TR/geolocation-API/#privacy_for_uas

Furthermore, the experience that informed the design of the geolocation API
says that the points in the flow where user authorization may need to be
checked have to be points where the API flow is asynchronous. It seems that
the relevant points in the EME flow are already asynchronous, but it would
still be good to point out explicitly which points are the ones where user
authorization may need to be checked.

> Regarding persistently stored information, there is one use-case in the
> specification: secure proof of key release.

I think key release still needs more clarification in EME, but that's another
bug.

> This requires the CDM to
> persistently store session identifiers - but not the licenses or keys - for
> MediaKeySessions that previously existed, until receipt of the key release
> information by the server is acknowledged.

Isn't it enough for the CDM to generate a signed message at the time of
destroying the keys and for the browser to be responsible for storing this
message and re-transmitting it until it has been acknowledged? Even if the
browser manages the storage, making a server that intentionally defers the
acknowledgment of key release messages would allow for cookie-like tracking
functionality.
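
The browser-managed alternative asked about above can be sketched as follows.
This is only an illustration under the assumption that the CDM produces an
opaque signed message; KeyReleaseOutbox and its methods are hypothetical names:

```python
from typing import Callable, Dict, Tuple


class KeyReleaseOutbox:
    """Browser-side queue of signed key release messages awaiting acknowledgment."""

    def __init__(self) -> None:
        self._pending: Dict[Tuple[str, str], bytes] = {}

    def store(self, origin: str, session_id: str, signed_message: bytes) -> None:
        # Called once, when the CDM destroys the keys and signs the release.
        self._pending[(origin, session_id)] = signed_message

    def flush(self, send: Callable[[str, bytes], bool]) -> int:
        """Re-send each pending message; drop the ones the server acknowledges.

        Returns the number of messages still pending afterwards.
        """
        for key, message in list(self._pending.items()):
            origin, _session_id = key
            if send(origin, message):
                del self._pending[key]
        return len(self._pending)

    def clear_origin(self, origin: str) -> None:
        # User-initiated wipe, analogous to clearing cookies for the origin.
        for key in [k for k in self._pending if k[0] == origin]:
            del self._pending[key]
```

A server that never acknowledges keeps its message in the queue, which is the
cookie-like behavior described above, so clear_origin has to be reachable from
the same UI that clears cookies.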

(In reply to comment #17)
> (In reply to comment #13)
> > (In reply to comment #10)
> > I'm pretty sure you'll find that browser vendors treat the issue of
> > "globally unique persistent identifier exposed to all sites" as an issue for
> > all modes of operation, not just "private" mode issue.
> 
> Agreed. However that is not required by EME.

I think EME should address privacy concerns that would arise from CDM designs
that can be realistically expected considering existing DRM systems even if the
design decisions that result in privacy concerns are not required by EME.

> My point was about the
> persistence of unique identifiers, not how global they are. I am *not*
> arguing for the existence of a globally unique persistent identifier exposed
> to all sites, nor is it required for CDMs (at least not the one I am most
> familiar with)

Can you, please, elaborate on that? The Adobe Access 4 Overview document linked
to from comment 5 says:
"The Flash Player or Adobe AIR runtime client acquires a unique digital
certificate (called a machine certificate) from an Adobe-hosted server.

This process of assigning a unique certificate is called individualization.
Individualization uniquely identifies both the computer and the Flash Player
or Adobe AIR runtime used to playback content.

The individualization process allows the downloaded licenses to be bound to a
specific computer on which the client is installed. Every computer is given a
unique machine credential (machine private key and machine certificate). If a
specific client were to become compromised, it can be revoked and barred from
acquiring licenses for new content."

> > What's your use case of persistent storage of CDM-related information? I
> > thought it wasn't worthwhile to propose more complex requirements without
> > knowing the use cases that the requirements were supposed to address.
> 
> In cases where a license can have a longer lifetime than a single session,
> it is useful (and sometimes necessary) to not require the user to reacquire
> the license the next time they want to play. 
> 
> Here are some of the benefits:
> * Allows the license provider to lower their cost (less network transactions
> required) which can result in lower costs for the user.

This seems like a wrong optimization. The network transactions for
re-contacting the license server are tiny compared to the network transactions
involved in the transfer of the media itself and even in the transfer of the
HTML, CSS and JavaScript around the media.

> * Allows the user to request a license in a secure environment and then
> continue to play back content when they are in an insecure environment
> without having to reacquire the license over the insecure network. 

We have https for secure transactions over an insecure network.

> * Reduces the number of times the user needs to authenticate.

Can you elaborate on this? In a Netflix-like case, you need to login to resume
an interrupted movie anyway. On a site similar to thedailyshow.com, there is no
user-facing authentication in the first place.

> > In any case, persistent storage of licenses gives a person with access to
> > the computing device information about what sites have been accessed.
> 
> This is dependent on how the information is secured on disk. The browser
> cache seems like a more likely target for snooping though, since the
> location you downloaded the movie from is probably much more informative. If
> I have local access to the computing device I can gather information on the
> user in any number of ways. 
> 
> Or is your point that the user can get access to the list when the DRM
> vendor might not want them to?

My point is that if the CDM manages its own storage, there can be snoopable
data left there after a browser function to wipe browser-managed storage, such
as the HTTP cache, has been used. To remedy this, persistent storage for the
CDM, if needed at all, should be browser-mediated.

(In reply to comment #19)
> My understanding was that EME was a UA interface to the non-UA-CDM and
> that the CDM had privileges above and beyond the UA, and thus the UA
> has little opportunity to protect the user. 

If there is authorization by the user before the CDM gets to send a message to
the site or the browser vendor has had the opportunity to ensure that the
messages generated by the CDM are not privacy sensitive, there are a couple of
plausible designs that would give the browser the opportunity to protect the
user:

Software-only case (plausible for SD; decoded frame data exposed to the
browser):
The browser sandboxes the CDM into a separate process that can only perform
memory allocation, computation or talk with the browser process. Encrypted
media, EME messages and seek times go into the CDM. EME messages, pixels and
audio samples come out of the CDM.

Hardware CDM case (plausible for HD; decoded frame data are not exposed to the
browser):
The decryption and decompression function is performed by a discrete hardware
component. The browser talks to this hardware component through an open-source
driver. The browser vendor examines the hardware to be convinced that the
hardware component cannot do IO except writing pixels to the GPU or talking
through the interface that the driver exposes. Encrypted media, EME messages
and seek times go through the driver into the hardware component. EME messages,
references to frame data in GPU memory and audio samples come from the hardware
component through the driver. The hardware component outputs pixels onto
surfaces in the GPU memory that are marked as readback-disabled. The GPU
hardware ensures that the surfaces marked as readback-disabled cannot be read
back into software even though the software can designate where the GPU uses
the surfaces.

Of course, it's easy to come up with designs that don't make it possible for
the browser to protect the user in cases where the user doesn't trust the CDM
(e.g. having CDMs run the way NPAPI plug-ins run).

> The relationship between
> the UA and the CDM needs to be clarified.

I agree.

> Does EME even support the UA identifying the EME in a secure way
> that the privileged CDM can not spoof?  If not then the UA has
> absolutely not control.

EME does not specify the API between the UA and the CDM. It would be possible
to specify that API in such a way that the UA could authenticate the CDM with
the same level of confidence that content providers can authenticate the CDM.
As the first communication between the UA and the CDM, the UA could randomly
generate a nonce, hand it to the CDM and ask the CDM to encrypt it using the
CDM's private key and hand the result back. The UA could then decrypt the
result using the CDM's public key and compare it with the original nonce.
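
The exchange could look roughly like the sketch below. It uses textbook RSA
with a tiny, well-known example key purely so that it runs with the standard
library alone; the key values, cdm_respond and ua_authenticate are assumptions
for illustration, and a real design would need a proper padded signature
scheme with realistic key sizes:

```python
import secrets
from typing import Callable, Optional

# Toy RSA key (classic textbook example; far too small to be secure).
N, E = 3233, 17  # public key, known to the UA
D = 2753         # private exponent, held only by the genuine CDM


def cdm_respond(nonce: int, private_exponent: int = D) -> int:
    """CDM side: transform the nonce with the private key."""
    return pow(nonce, private_exponent, N)


def ua_authenticate(respond: Callable[[int], int],
                    nonce: Optional[int] = None) -> bool:
    """UA side: send a fresh nonce, then check the reply with the public key."""
    if nonce is None:
        nonce = secrets.randbelow(N - 2) + 2  # random value in [2, N - 1]
    response = respond(nonce)
    return pow(response, E, N) == nonce
```

For example, ua_authenticate(cdm_respond) succeeds, while a responder that
lacks the private exponent, such as lambda n: cdm_respond(n, D + 1), fails the
check for the nonce 65.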

Received on Monday, 25 February 2013 10:54:30 UTC