Re: Towards a getUserMedia/enumerateDevices fingerprinting solution

> On Feb 10, 2019, at 7:47 AM, Harald Alvestrand <harald@alvestrand.no> wrote:
> 
> On 07.02.2019 19:05, youenn fablet wrote:
>> As shown
>> by https://www.chromestatus.com/metrics/feature/timeline/popularity/1119, enumerateDevices
>> is probably used for fingerprinting purposes.
> 
> However, I'm not sure the data actually supports this.
> I looked at the same data through another lens, and that showed the
> usage to be almost flat over the last 3 months (somehow the 1-year graph
> failed to show).
> 
> It's possible that the jumps in the top graph indicate when the counter
> was rolled out, not when the feature started to be used.
> 
> The second graph shows a usage pattern that is falling, not rising -
> again, it does not correlate with the graph above.
> 
> It would be great to have some verification that the usage of
> enumerateDevices is indeed unrelated to the page potentially wanting to
> use those devices.

From my reading of https://www.chromestatus.com, enumerateDevices is used on 1.8% of pages while getUserMedia is used on less than 0.01% of pages.
We also have internal evidence that web sites that never call getUserMedia do call enumerateDevices.
My suspicion is that they are doing so to fingerprint users.

> 
>> A thread started on GitHub
>> (https://github.com/w3c/mediacapture-main/issues/559) to tackle this issue.
>> 
>> The editors are seeking feedback on the following assumptions.
>> 
>> 1. It is assumed that:
>> - Leaking any fingerprinting information silently through
>> enumerateDevices is an issue.
> 
> I am not certain there's any consensus here. The WG, when designing this
> iteration of enumerateDevices, has formerly decided that leaking:
> 
> a) whether or not camera and/or microphone is present
> b) the number of audio and video devices present
> c) whether these devices are the same as when the page was previously
> granted access to them (aka "device ID is stable")
> 
> is an acceptable amount of leakage.

Thanks Harald for pointing that out.
Hearing how real web sites use WebRTC is indeed extremely important.
The assumptions I described seem to match what regular web sites using WebRTC are doing; even so, it is essential to gather as much input as possible.

I would mention a few points related to the above description:
1. getUserMedia/enumerateDevices can be implemented in a way that we (Safari) believe is web compatible and that respects user privacy by not having enumerateDevices/getUserMedia leak any information silently.
2. enumerateDevices as defined by the current specification leaks more information than described above.
3. the amount of information leakage does not seem to relate to the requirements described in https://tools.ietf.org/html/draft-ietf-rtcweb-use-cases-and-requirements-16#section-4.2.
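
To make items a) and b) of the quoted list concrete, here is a rough sketch of what a page could derive from the result of enumerateDevices. The names DeviceInfoLike and summarizeDevices are made up for illustration; only the kind, deviceId and label fields mirror MediaDeviceInfo.

```typescript
// Minimal structural type mirroring the MediaDeviceInfo fields used here.
// (Hypothetical name, defined locally so the sketch runs outside a browser.)
interface DeviceInfoLike {
  kind: "audioinput" | "audiooutput" | "videoinput";
  deviceId: string;
  label: string;
}

// Hypothetical helper: derive device presence (a) and device counts (b)
// from a list of device entries.
function summarizeDevices(devices: DeviceInfoLike[]) {
  const counts = { audioinput: 0, audiooutput: 0, videoinput: 0 };
  for (const device of devices) {
    counts[device.kind] += 1;
  }
  return {
    hasMicrophone: counts.audioinput > 0,
    hasCamera: counts.videoinput > 0,
    counts,
  };
}

// In a browser this could be fed from the real API, for example:
//   navigator.mediaDevices.enumerateDevices().then(summarizeDevices)
```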

Unfortunately, we have not found a way to both implement point 1 and be spec compliant.
We are seeking ways to update the specification so that we can implement point 1 and be spec compliant.

More specifically, regarding Harald's three points a) to c) above:
a) can be discovered through getUserMedia; enumerateDevices is not needed for that purpose.
b) does not seem to be needed in any regular WebRTC application, at least before getUserMedia is granted.
c) device ID stability is indeed important, as Stefan mentions. Applications might only need stable device IDs for devices that were previously used to capture. It is unclear to me why they would need to know about new devices, for instance.
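
As an illustration of a), here is a minimal sketch of learning about device presence from the getUserMedia call itself, by looking at the name of the rejection. The helper names (presenceFromRejection, requestCapture) and the injected getUserMedia parameter are assumptions made so that the sketch runs outside a browser. Treating every error other than NotFoundError as inconclusive is also a simplification of mine, not something the specification mandates.

```typescript
type CapturePresence = "present" | "absent" | "unknown";

// Hypothetical helper: interpret the name of a getUserMedia rejection.
function presenceFromRejection(errorName: string): CapturePresence {
  // NotFoundError: no device could satisfy the request.
  if (errorName === "NotFoundError") return "absent";
  // Anything else (for example NotAllowedError) is treated as inconclusive here.
  return "unknown";
}

// Shape of the function we inject; navigator.mediaDevices.getUserMedia fits it.
type GetUserMediaLike<S> = (
  constraints: { audio?: boolean; video?: boolean }
) => Promise<S>;

// Hypothetical helper: request capture and report what was learned.
// On success the stream is returned so the page can use it directly.
async function requestCapture<S>(
  getUserMedia: GetUserMediaLike<S>,
  constraints: { audio?: boolean; video?: boolean }
): Promise<{ presence: CapturePresence; stream?: S }> {
  try {
    const stream = await getUserMedia(constraints);
    return { presence: "present", stream };
  } catch (e) {
    const name =
      typeof e === "object" && e !== null && "name" in e
        ? String((e as { name: unknown }).name)
        : "";
    return { presence: presenceFromRejection(name) };
  }
}
```

In a page this might be called as requestCapture(c => navigator.mediaDevices.getUserMedia(c), { video: true }), with no prior call to enumerateDevices.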

> 
>> - Leaking any fingerprinting information silently through getUserMedia
>> is an issue.
>> - Leaking some fingerprinting information after a getUserMedia prompt is
>> ok, even if user denied access.
>> - Leaking all capture device information is ok if a web page is granted
>> capture access.
>> 
>> 2. enumerateDevices is widely used to implement capture device pickers.
>> Capture device pickers are usually showing previews of the capture device.
>> It is assumed that:
>> - In regular flows, enumerateDevices is called after getUserMedia access
>> is granted.
>> 
>> 3. enumerateDevices may be used to check for microphone/camera existence
>> If no microphone or camera is exposed in enumerateDevices, web sites
>> might not call getUserMedia at all.
>> Web sites can discover that capture devices are missing by checking for
>> NotFoundError in case of getUserMedia promise rejection.
>> It is assumed that:
>> - Getting capture device presence through getUserMedia is good enough so
>> that enumerateDevices might not be required to always accurately provide
>> this information.
>> 
>> Any feedback most welcome,
>> Y
> 
> 

Received on Monday, 11 February 2019 06:52:37 UTC