Re: Network Information API from Nicholas Doty on 2014-01-20 (public-device-apis@w3.org from January 2014)

From: Nicholas Doty <npdoty@w3.org>
Date: Mon, 20 Jan 2014 15:12:17 -0800
To: "SULLIVAN, BRYAN L" <bs3131@att.com>
Cc: Josh Soref <jsoref@blackberry.com>, "Frederick.Hirsch@nokia.com" <Frederick.Hirsch@nokia.com>, DAP <public-device-apis@w3.org>
Message-Id: <1E3D3B72-58A9-4BF0-AFEB-07966A5A26E2@w3.org>
On January 14, 2014, at 10:11 PM, SULLIVAN, BRYAN L <bs3131@att.com> wrote:

>> On Jan 14, 2014, at 5:51 PM, "Nicholas Doty" <npdoty@w3.org> wrote:
>> 
>> On January 12, 2014, at 11:00 PM, SULLIVAN, BRYAN L <bs3131@att.com> wrote:
>> 
>>>> (Nick wrote)
>>>>> I think there's a privacy concern in using the pattern of fired events, too. If we expect background access to these events (because your podcast web app needs to know whether it should stop downloading into localStorage or not), simultaneously firing an event across frames/tabs/windows allows for potentially unexpected correlation across different browsing contexts.
>>>> 
>>>> <bryan> Background (meaning any browser/window/tab not in the foreground) access to the events is desired. Many always-on app use cases will depend upon background operation, and these are many of the same (e.g. feed readers, email, SocNet) that would benefit from network-event-driven sync. But I don't know what you mean/imply by "simultaneously firing an event across frames/tabs/windows allows for potentially unexpected correlation across different browsing contexts". Can you explain this further, and associate it with some real/prevalent privacy attack? Such info would be good to capture on the wiki, if it ends up influencing the design of the API.
>>> 
>>> (Nick wrote)
>>>> I believe the concern is that the user may not expect that, for example, an iframe embedded in multiple different windows, can determine that it's the same user in those different browsing/application contexts. If I'm logged in to my social media accounts in one browser window and simultaneously have a private browsing window open which I'm using to research a medical issue, I would be unpleasantly surprised if my social media account is associated with my private browsing because my network adapter changed.
>>> 
>>> <bryan> Still not getting the issue. Can you explain further how a network adapter change (I guess you mean that there was a connection established on a different interface), even if fired as an event to distinct windows, can cause a correlation issue between those windows? How would the iframes determine that it's the same user, and correlate that info with info about their parent windows?
>> 
>> Apologies if I'm not being clear, and I'd welcome anyone else on the list who might understand and translate my comments more effectively.
>> 
>> How would the iframes determine that it's the same user:
>> 1. in one window I'm logged in to my social network and browse to a page which has an iframe to my social network to which I send my login cookie, e.g. user="NicholasDoty".
>> 2. in a second, private browsing window, I'm not logged in to my social network and don't send those cookies but browse to a page about medical information that embeds an iframe to that same social network; the iframe assigns a new cookie, user="unknown1234".
>> 3. JavaScript in both iframes subscribes to events for the Network Information API.
>> 4. my device connects to a WiFi network, simultaneously triggering an event in all windows.
>> 5. JavaScript in both social network iframes records the timestamp of the event and initiates requests to the server including that timestamp.
>> 6. the server infers that because user="NicholasDoty" and user="unknown1234" (perhaps repeatedly) have NetworkInformation events with the same timestamp, they are likely the same user.
>> 
>> I recognize there are other means that servers may use to attempt such inferences today; I think it's still worthwhile to mitigate threats that make such correlation potentially much easier.
>> 
> <Bryan> I would think that the statistical probability, in popular social networks, would be pretty good that network change events happen at the same time for different users. I would further wonder about socnets (or search engines etc) that attempted such correlation. Nonetheless, there is a *much* simpler way to perform such privacy attacks... The servers can just look at the source IP address (in most cases it will be the same). With such a simple solution, why would sites go to such obtuse correlation methods as using the NetInfo API? Should the possibility of such a complex correlation approach (regardless of whether most or any site would ever use it) preclude the much wider (and legitimate) value of this API?

Apologies if I wasn't clear before; I have tried to explicitly note that it may be possible to use other means (like source IP address) to do such correlation. However, some users may use technology to prevent IP address correlation (onion routing, say) or may be in an environment with a shared IP address. I think it's worthwhile for us to consider ways to avoid making this kind of correlation easier even if it's often possible in practice today. 
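
To make the timestamp inference from the quoted steps concrete, here is a minimal sketch of what the server side of step 6 might look like. It is only an illustration: the function name `correlate`, the `min_shared` threshold, the report format, and the extra identifier `someone_else` are all assumptions, not taken from any real site or from the Network Information API draft. Requiring more than one shared timestamp reflects the "perhaps repeatedly" caveat, since a single coincidence between unrelated users is plausible.

```python
from collections import defaultdict
from itertools import combinations


def correlate(reports, min_shared=2):
    """Return pairs of cookie IDs that reported at least `min_shared`
    network-change events with identical timestamps.

    `reports` is an iterable of (cookie_id, timestamp_ms) tuples, a
    hypothetical format for what the iframes might send to the server.
    """
    # Group cookie IDs by the exact event timestamp they reported.
    ids_by_timestamp = defaultdict(set)
    for cookie_id, timestamp_ms in reports:
        ids_by_timestamp[timestamp_ms].add(cookie_id)

    # Count how many timestamps each pair of IDs has in common.
    shared = defaultdict(int)
    for ids in ids_by_timestamp.values():
        for pair in combinations(sorted(ids), 2):
            shared[pair] += 1

    return {pair for pair, count in shared.items() if count >= min_shared}


# Hypothetical reports: two contexts on the same device see both events,
# while an unrelated user happens to share only one timestamp.
reports = [
    ("NicholasDoty", 1000), ("unknown1234", 1000), ("someone_else", 1000),
    ("NicholasDoty", 5000), ("unknown1234", 5000),
]
linked = correlate(reports)
```

In this toy data only the pair ("NicholasDoty", "unknown1234") shares two timestamps, so it is the only pair the sketch links; the unrelated user drops out once a second event is required.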

I'm specifically not suggesting that because of some privacy concerns we should ignore the value of this API or stop work on it. I believe there are two common mitigations of this kind of privacy concern: 1) not firing events for background windows (which works for some APIs, but perhaps not this one if the background loading use case is expected to be particularly important); 2) allowing fuzzing of the event firing by the UA (which seems to be particularly amenable here, as simultaneous event firing or real-time updates of network connectivity are likely not essential).
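
As a rough illustration of the second mitigation, the sketch below simulates a UA that delays delivery of one network-change event by an independent random amount for each browsing context. Everything here is assumed for the sake of the example: the function name `fuzzed_delivery_times`, the 30-second default window, and the context labels are hypothetical, and a real UA would also need to avoid exposing the original event time to script.

```python
import random


def fuzzed_delivery_times(event_time_ms, context_ids, max_jitter_ms=30_000, rng=None):
    """Return a delivery time per browsing context for a single network
    change, each delayed by an independent random jitter in
    [0, max_jitter_ms] milliseconds.
    """
    rng = rng or random.Random()
    return {
        context_id: event_time_ms + rng.randint(0, max_jitter_ms)
        for context_id in context_ids
    }


# One WiFi connection at t = 1,000,000 ms, observed by two contexts.
times = fuzzed_delivery_times(
    1_000_000,
    ["logged-in-window", "private-window"],
    rng=random.Random(0),  # seeded only to make the example repeatable
)
```

With independent delays, the two contexts will usually observe different times, so exact-timestamp matching no longer links them directly. A server could still try matching within a time window, so the jitter would have to be large relative to how often such events occur across users.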

Thanks,
Nick
Received on Monday, 20 January 2014 23:12:29 UTC