Re: Individualization from Joe Steele on 2014-10-28 (public-html-media@w3.org from October 2014)

From: Joe Steele <steele@adobe.com>
Date: Tue, 28 Oct 2014 03:45:34 +0000
To: David Dorwin <ddorwin@google.com>
CC: Henri Sivonen <hsivonen@hsivonen.fi>, "<public-html-media@w3.org>" <public-html-media@w3.org>
Message-ID: <5CD754B9-9674-473C-8A2F-32FE200BC150@adobe.com>
On Oct 24, 2014, at 6:03 PM, David Dorwin <ddorwin@google.com> wrote:
> 
> On Fri, Oct 24, 2014 at 5:05 PM, Joe Steele <steele@adobe.com> wrote:
> 
> On Oct 23, 2014, at 1:22 PM, David Dorwin <ddorwin@google.com> wrote:
>  
>> I am questioning how much it really improves privacy. As far as I can tell, this model just moves the privacy issue around. It might help if you can describe the use case in more detail and how it addresses the concerns I've described in [1] and [2].
> 
> In [1] you raise these concerns — 
> 
> * "There is a fundamental question of whether the user agent should turn over responsibility for platform initialization to an application." I understand this is your opinion, but you offer no reason why we should be concerned. It is not self-evident.
> 
> The application and platform/user agent are clearly defined and separated in the web platform. Applications are very limited in what they can do, much less so than the user agent. The web platform also generally restricts applications to their own origin. I believe exposing user agent- or platform-wide initialization to an application is inconsistent with this. (There are similar concerns about platform-wide identifiers.)
> * If identifiers are provided, then the privacy properties are lost and you might as well use per-client initialization. 
> What privacy property do you believe is being lost here?
> To me, per-origin identifier means an identifier that is specific to and only known to the origin. If a central server provides the identifier, it is known to at least two origins/entities.
> If the central server receives some type of information unique to the client, the central server can correlate visits to multiple origins, which is something user agents generally try to avoid.
> 
> Consider these options:
> 1. With a per-client identifier implementation, sites A and B could collude to track the user.
> 2. With a truly per-origin identifier implementation, it is not possible for any sites to collude.
> 3. With a per-origin identifier implementation that relies on a central server, the central server can track the user across sites A, B, C, etc.
> 
> #3 might actually be more concerning from a privacy perspective than #1.
> 
> * If the request (and identifiers) are forwarded to a central server, what was the point of going through the application? 
> The point of going through the application is to ensure that the keys one application uses are not shared with other applications.
> See above. I think this just trades one problem for another - potentially worse - one.

I think you need to define what you mean by “central server”. I might be happy with a normative restriction here, but I am unclear on where you could put this restriction. All of the potential privacy leaks you are concerned about could exist with any widely used backend service. It is the application that needs a restriction not to use a “central server”. And I don’t think that is within the purview of the spec (I could be wrong). 
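
For concreteness, the three options quoted above can be modeled with a minimal sketch. This is only an illustration, not anything taken from the EME spec or from a real CDM: the function names, the device secret, the client token, and the CentralServer class are all hypothetical, and HMAC-SHA256 is just one assumed way to derive an origin-specific value.

```python
import hashlib
import hmac


def per_client_id(device_secret: bytes) -> str:
    """Option 1 (hypothetical): one identifier shared with every origin."""
    return hashlib.sha256(device_secret).hexdigest()


def per_origin_id(device_secret: bytes, origin: str) -> str:
    """Option 2 (hypothetical): identifier derived locally, one per origin."""
    return hmac.new(device_secret, origin.encode("utf-8"), hashlib.sha256).hexdigest()


class CentralServer:
    """Option 3 (hypothetical): a server that issues per-origin identifiers.

    It must receive something unique to the client in order to issue them,
    so it can record which origins each client asked about.
    """

    def __init__(self, server_key: bytes) -> None:
        self._server_key = server_key
        self.origins_seen = {}  # client token -> set of origins requested

    def issue(self, client_token: str, origin: str) -> str:
        self.origins_seen.setdefault(client_token, set()).add(origin)
        message = f"{client_token}|{origin}".encode("utf-8")
        return hmac.new(self._server_key, message, hashlib.sha256).hexdigest()


def demo() -> dict:
    secret = b"hypothetical-device-secret"
    site_a, site_b = "https://a.example", "https://b.example"

    # Option 1: both sites receive the same value, so they can compare notes.
    sites_can_collude = per_client_id(secret) == per_client_id(secret)

    # Option 2: each site receives a different value derived on the client.
    local_ids_differ = per_origin_id(secret, site_a) != per_origin_id(secret, site_b)

    # Option 3: the sites receive different values, but the issuing server
    # has seen both requests from the same client token.
    server = CentralServer(b"hypothetical-server-key")
    token = per_client_id(secret)
    issued_ids_differ = server.issue(token, site_a) != server.issue(token, site_b)
    server_view = sorted(server.origins_seen[token])

    return {
        "sites_can_collude": sites_can_collude,
        "local_ids_differ": local_ids_differ,
        "issued_ids_differ": issued_ids_differ,
        "server_view": server_view,
    }


if __name__ == "__main__":
    for key, value in demo().items():
        print(key, value)
```

In this model the per-client value is identical at every site, the locally derived values differ per origin, and the server-issued values also differ per origin while the issuing server's own log links the origins requested by one client. Whether a given backend counts as such a "central server" is exactly the definition in question.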

> 
> * Also, the central server now has a record of all origins visited, which is a privacy concern.
> Since the application provider has a direct or indirect relationship with the company running the central server, no additional information is revealed, other than that the application is actually being used. I think you may be conflating this with the possibility that device identifiers are being shared. I think the text is fairly clear there, although I filed a bug to clarify a bit more. I think where you are heading with this is that individualization servers should be required to comply with PII regulations, which I believe is already the case (I know it is in our case). But I am not sure what you can say that is useful in the spec about that. I don’t believe you can have DRM without an exchange of PII. That is the nature of DRM. What you can do is regulate how that PII is exchanged (this is somewhat within the scope of the spec) and how it is handled by the recipients (completely outside the scope of this spec). 
> 
> You are correct, sites A, B, C, etc. could also provide this information to a central server without any changes to EME. The prospect of sharing such identifiers via the web platform has been raised as an area of concern with EME.

The prospect of sharing identifiers via the web platform is a general concern, I agree. However, this proposal does not make that any worse. 

A non-origin-specific CDM identifier could be correlated with the user across multiple sites. As with any web application, such correlation is also possible in multiple ways not involving EME. An origin-specific CDM identifier like that described in the privacy recommendation will only be shared with the creation server and the consuming server (they may be the same entity) unless the application explicitly decides otherwise. If those servers are the same or operated by the same company, it sounds like you have no concern. But if they are different and potentially operated by different people, you do have a concern? I don’t think that is something you can restrict in the spec. Not making the change proposed in bug 27124 will in no way enforce that restriction. It will only make the application developer's job more difficult. 

It sounds like you are not imposing a restriction on the CDM — you are imposing a restriction on the services architecture that the application providers can use. 

> 
> However, it is in scope to evaluate potential security or privacy risks when defining new APIs between the user agent and application. This is what we are currently doing.
> 
> 
> In [2] the concern raised seems to be "In all cases, implementations should avoid sending per-origin information to centralized servers since this could create a central record of all origins visited by a user or device."
> 
> My response is the same here as above. 
> 
> The privacy considerations provide guidance to user agent implementers to help them evaluate the risks and ideally provide information on how to mitigate them. This is especially true of risks that are not avoided or mitigated in the normative text.
> 
>> 
>> [1] https://www.w3.org/Bugs/Public/show_bug.cgi?id=27124#c5
>> [2] https://dvcs.w3.org/hg/html-media/raw-file/default/encrypted-media/encrypted-media.html#privacy-individualization
>>> 
>>> --
>>> Henri Sivonen
>>> hsivonen@hsivonen.fi
>>> https://hsivonen.fi/
>>> 
>> 
>> 
> 
>
Received on Tuesday, 28 October 2014 03:46:07 UTC