Re: Follow up issues from WebRTC / PING TPAC meeting from Jan-Ivar Bruaroey on 2019-10-05 (public-privacy@w3.org from October to December 2019)

From: Jan-Ivar Bruaroey <jib@mozilla.com>
Date: Sat, 5 Oct 2019 00:27:22 -0400
To: Pete Snyder <psnyder@brave.com>, public-privacy <public-privacy@w3.org>
Cc: public-webrtc@w3.org
Message-ID: <67b4022f-daa4-cc38-bd31-56c17d51f3ce@mozilla.com>

(Forgot to cross-post, sorry for the repeat! Please reply to this one.)

Thanks Peter for the summary and for bringing the discussion here.

I'd like to recap my case against double-keying deviceIds (when storage 
isn't double-keyed) from the original thread, for everyone's benefit. I'd also like 
to respond to Jeffrey Yasskin's comments from that conversation, so I've 
lifted them here (with his permission) with my responses:

On 9/25/19 7:41 PM, Jan-Ivar Bruaroey wrote:
>
> How is that going to work?
>
> Put enumerateDevices() aside for a moment.
>
> Site J (e.g. Jitsi) uses the following to remember a user's chosen 
> camera between visits:
>
>    const stream = await navigator.mediaDevices.getUserMedia({video: {deviceId: localStorage.chosenCamera}});
>    localStorage.chosenCamera = stream.getVideoTracks()[0].getSettings().deviceId;
>
> Sites A and B both iframe J. A user's browser double-keys generated 
> deviceIds but not localStorage:
>
>  1. User selects their secondary USB camera on site A::J. A::J
>     remembers it on revisits.
>  2. User selects their secondary USB camera on site B::J. B::J
>     remembers it on revisits.
>  3. User does 1, then 2, then visits A::J again. The camera is forgotten.
>
> Step 2 changed J::localStorage.chosenCamera from deviceId(a::j) to 
> deviceId(b::j). getUserMedia() in A::J fails to recognize deviceId(b::j).
>
> Fubar.
>
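
The three steps above can be replayed outside a browser with a small
simulation. This is only a sketch under stated assumptions: deviceId(),
openCamera() and simulate() are hypothetical helpers, not browser APIs, and
the readable id strings stand in for the opaque ids a real browser would
generate. It models double-keyed deviceIds with single-keyed storage for J.

```javascript
// Hypothetical simulation; none of these helpers are browser APIs.

// Double-keyed id: depends on top-level site, embedded origin, and device.
// Real ids are opaque; readable strings are used here for illustration.
function deviceId(topLevel, origin, device) {
  return `${topLevel}::${origin}::${device}`;
}

function simulate() {
  // J's storage is single-keyed: one bucket shared by A::J and B::J.
  const storageJ = {};
  const devices = ["default-cam", "usb-cam"];

  // Mimics J's two lines: prefer the remembered id, fall back to the
  // default camera, then remember the id of whatever was actually used.
  function openCamera(topLevel, userChoice) {
    const remembered = storageJ.chosenCamera;
    const match = devices.find(
      (d) => deviceId(topLevel, "J", d) === remembered
    );
    const used = userChoice ?? match ?? "default-cam";
    storageJ.chosenCamera = deviceId(topLevel, "J", used);
    return used;
  }

  return [
    openCamera("A", "usb-cam"), // step 1: user picks USB camera on A::J
    openCamera("A"),            // revisit A::J: remembered
    openCamera("B", "usb-cam"), // step 2: user picks USB camera on B::J
    openCamera("A"),            // step 3: back on A::J, choice is lost
  ];
}

console.log(simulate());
// → [ 'usb-cam', 'usb-cam', 'usb-cam', 'default-cam' ]
```

The last entry is the failure: the id stored by B::J means nothing to A::J,
so the lookup falls back to the default camera.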

On 9/26/19 5:14 PM, Jeffrey Yasskin wrote (lifted excerpts):
> Pete and I have already established that we disagree on this, but to 
> restate the other position:
> 1. The theoretical privacy harm from an unpartitioned media device ID 
> is that it can cause the user to be correlated across top-level origins.
> 2. In browsers that do not ship partitioned storage, the user is 
> already trivially correlated across top-level origins using that storage.
> 3. Thus, the media device ID cannot *cause* the user to be correlated.
> 4. So it causes no privacy harm that needs to be solved. That'll only 
> change when storage gets partitioned.
This.
> To confuse a website about a user's ID, all of that site's (tainted) 
> storage has to be cleared at once, since otherwise it can just store 
> the identifier in the bit of storage that's not cleared and 
> re-establish the ID from there. I believe the WG's intent (WG folks 
> correct me if I'm wrong) is to say to rotate device IDs at that epoch 
> boundary. I think the spec doesn't say this well, and we could help by 
> suggesting more precise language and, longer term, by defining 
> appropriate terms for this in PING or TAG publications.
This would be great! An ideal implementation would clear an id when no 
references to it are stored.
> I'd like to hear from the WG members if anything breaks semantically 
> if the device IDs are per-origin counters instead of effectively big 
> random numbers.
Nothing would break semantically, but I think Youenn is right that any 
counter becomes an id over time as devices are added/removed. Plus, if 
the "per-origin" part is poorly implemented, the counter might correlate 
across origins (users change tabs often, devices rarely). Users with 
many devices would stand out. That sounds worse than what we have (in 
the spec) now.
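
To illustrate the "counter becomes an id over time" point, here is a minimal
sketch. It assumes a hypothetical allocator (CounterIds, not how any browser
is known to work) that hands out the next integer to each device an origin
has not seen before, and never reuses a value.

```javascript
// Hypothetical per-origin counter allocator, for illustration only.
class CounterIds {
  constructor() {
    this.next = 1;
    this.ids = new Map(); // device name -> counter value
  }

  // Returns the ids an origin would see for the currently attached devices,
  // assigning a fresh counter value to any device not seen before.
  enumerate(attached) {
    for (const d of attached) {
      if (!this.ids.has(d)) this.ids.set(d, this.next++);
    }
    return attached.map((d) => this.ids.get(d));
  }
}

// A user whose hardware never changed.
const fresh = new CounterIds();
const freshIds = fresh.enumerate(["laptop-cam", "usb-cam"]);

// A user with the same hardware today, but who once had another webcam.
const veteran = new CounterIds();
veteran.enumerate(["laptop-cam", "old-webcam"]);
const veteranIds = veteran.enumerate(["laptop-cam", "usb-cam"]);

console.log(freshIds, veteranIds);
// → [ 1, 2 ] [ 1, 3 ]
```

Both users have identical devices attached, yet the gap in the second user's
counters reveals their device history, which is the extra entropy in question.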

A simpler way to reduce entropy might be to mandate a much smaller 
deviceId length. Current implementations seem a bit out of control here 
(ours included).

> https://w3c.github.io/mediacapture-main/#dom-mediadeviceinfo-deviceid mentions 
> unguessability, but I didn't see a reason why that would be needed.
I suspect we just meant nondeterministic across origins.

> I did see concerns about implementation complexity in 
> https://github.com/w3c/mediacapture-main/issues/607, which are hard to 
> dismiss without a clear privacy harm.

On the upside, I think we all agree deviceIds MUST be double-keyed when 
storage is, or the API doesn't work.

I have some more ideas, but wanted to post this first.

.: Jan-Ivar :.
Received on Saturday, 5 October 2019 04:27:48 UTC