local IP address (was Re: Request for feedback: Media Capture and Streams Last Call)

From: Nick Doty <npdoty@w3.org>
Date: Sat, 15 Aug 2015 19:48:17 -0700
Cc: Mike O'Neill <michael.oneill@baycloud.com>, "public-privacy (W3C mailing list)" <public-privacy@w3.org>, Jan-Ivar Bruaroey <jib@mozilla.com>
Message-Id: <DB088BB5-6516-4F06-9E50-BE66BDA803AD@w3.org>
To: Eric Rescorla <ekr@rtfm.com>
It occurred to me that as I was in transit during this discussion before, I couldn't reply to all questions at the time. Trying to fill in points I missed in June below. —npd

> On Jun 30, 2015, at 2:54 PM, Eric Rescorla <ekr@rtfm.com> wrote:
> 
> On Tue, Jun 30, 2015 at 2:49 PM, Nick Doty <npdoty@w3.org> wrote:
> That's true, many fingerprinting mechanisms are difficult to provide transparency for. I don't take that as an argument that all future features should be equally so, such that no improvement in any feature can improve the overall situation. Canvas has seen some blocking for that reason; wouldn't it be better to avoid that with WebRTC if it's possible to do so?
> 
> I think that depends on the impact on the deployability of WebRTC, no?

I'm not sure I understand this question. Maybe the question is whether removing this kind of fingerprinting (pre-permission-prompt access to a list of device identifiers or local IP address) would prevent the deployability of WebRTC altogether. I haven't yet learned why having drive-by access to those pieces of information is important to the functionality of WebRTC, as it seems that in many typical cases, those pieces of information are functionally useful prior to initiating a stream of some kind which typically would require user permission.

>>> The webRTC standard is very troublesome from a privacy standpoint in other
>>> ways. Not only all your local IP addresses  (inside the NAT) visible,
>> 
>> I'd really like to see a good analysis for why you think addresses behind
>> the NAT are
>> especially sensitive:
>> 
>> 1. They are drawn from very small set of addresses (comparatively).
>> 2. They change every time your public IP address changes.
>> 
>> VPNs are different, of course, but then it's not clear how good a job VPNs
>> do
>> of preserving privacy.
> 
> I think Wendy has been starting a list for us of some of the concerns around local IP address discoverability. On the NAT question, off the top of my head I'm aware of the problems of: 1) easier browser fingerprinting;
> 
> I keep hearing people say that it makes browser fingerprinting easier, but the points
> I raised above seem to call this argument into question, so I'd appreciate a real
> analysis.

I'm not sure what would qualify as a "real" analysis here. As I understand it, we're seeing multiple instances of drive-by WebRTC access to the local IP address that are clearly being used for tracking rather than for real-time communications functionality. Do we think the parties doing that are deploying that code for no reason? Or instead that they see an advantage in discriminating between users who otherwise look mostly identical behind a NAT? (It might be that the New York Times advertiser is using it in order to do better geo-targeting of ads to those who are behind a work VPN, rather than to improve their fingerprinting/targeting of users behind a local NAT; we'd have to ask them, I guess.)

The number of local IP addresses is indeed smaller. But the local IP address can be a discriminator that's orthogonal to other characteristics. For example, if 50 students in a classroom are all using their administrator-maintained Chromebooks to access online educational resources, and also their personal emails/social networking sites, the local IP address could allow re-identification of a student, even after all their local storage is cleared at the end of the class by the administrator. The school's external IP address may not change very often; in the meantime, the local IP address will be a useful way to fingerprint devices that actually have similar hardware/software characteristics.

That is, the number of bits is not necessarily very high (though it can be significant), but the bits of entropy are likely to be present even in scenarios where otherwise browser fingerprinting could be relatively difficult, like in small institutional settings.
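To make the classroom example concrete, here is a rough sketch of the arithmetic. Everything in it is an assumption for illustration: the addresses, the placeholder user-agent string, and the premise that each of the 50 devices holds a distinct local address. It compares the entropy of a fingerprint with and without the local IP address.

```python
import math
from collections import Counter

def entropy_bits(fingerprints):
    """Shannon entropy, in bits, of the observed fingerprint distribution."""
    counts = Counter(fingerprints)
    total = sum(counts.values())
    return sum((c / total) * math.log2(total / c) for c in counts.values())

# Hypothetical classroom: 50 identically configured Chromebooks behind one NAT.
PUBLIC_IP = "203.0.113.7"           # assumed (documentation address range)
USER_AGENT = "identical-chromebook"  # placeholder, not a real UA string

# Assumes each device has its own local address on the classroom network.
devices = [(PUBLIC_IP, USER_AGENT, f"192.168.1.{100 + i}") for i in range(50)]

without_local_ip = [(pub, ua) for pub, ua, _local in devices]
with_local_ip = devices

print(entropy_bits(without_local_ip))  # all 50 devices share one fingerprint
print(entropy_bits(with_local_ip))     # every device is distinguishable
```

Under these assumptions the first value is zero and the second is log2(50), a little under 6 bits: not many bits in absolute terms, but enough to separate every device in a population that was otherwise indistinguishable.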

> 2) facilitating attacks on local network infrastructure.
> 
> Well, again, given that the IPs are drawn from a pretty small pool, it's not clear
> that it's actually that hard to discern which of these pools they are from.

I'm also no expert in this kind of attack, so I'm not sure how much of a difference it makes, when trying to attack a router, if you can more easily discover the local IP address. I do know that some published attacks did use things like Java plugins to discover the local IP address. For example, http://samy.pl/natpin used java.net.Socket on Firefox/Opera to ease that NAT pinning attack, and I think the same would be useful for the http://samy.pl/vzwfios/ router XSS attack. Giving a script access to that information could make it harder to detect that kind of attack or build countermeasures; for example, without it, maybe my browser could see a series of outgoing connections to every common local router IP address and stop the suspicious script.
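As a sketch of the kind of countermeasure imagined above, the following flags a script that contacts several distinct well-known gateway addresses. The address list, the threshold, and the function name are all hypothetical; I'm not aware of a browser that implements this exact heuristic.

```python
# Assumed list of common default gateway addresses; illustrative, not exhaustive.
COMMON_ROUTER_ADDRESSES = {
    "192.168.0.1",
    "192.168.1.1",
    "192.168.1.254",
    "192.168.2.1",
    "10.0.0.1",
    "10.0.0.138",
    "172.16.0.1",
}

def looks_like_router_scan(destinations, threshold=3):
    """Return True if a script contacted at least `threshold` distinct
    addresses from the common-gateway list (hypothetical heuristic)."""
    probed = set(destinations) & COMMON_ROUTER_ADDRESSES
    return len(probed) >= threshold

# A script with no knowledge of the local network has to guess widely.
blind_script = ["192.168.0.1", "192.168.1.1", "10.0.0.1", "172.16.0.1"]

# A script that already learned the local address (say 192.168.1.23)
# can infer the likely gateway and contact only that one.
informed_script = ["192.168.1.1"]

print(looks_like_router_scan(blind_script))     # True
print(looks_like_router_scan(informed_script))  # False
```

The informed script stays under the threshold, which is the concern: once the local address is readily available, the noisy scanning that a heuristic like this depends on never has to happen.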



Received on Sunday, 16 August 2015 02:48:29 UTC