Re: REPOST: getUserMedia Privacy Review

Reviewing the getUserMedia spec more recently, I noticed that the sourceIDs look likely to be long-lived, high-entropy unique identifiers for a particular device, though specific to each Web application. These IDs could plausibly be used as unique identifiers for the user.

http://dev.w3.org/2011/webrtc/editor/getusermedia.html#idl-def-MediaSourceStates

From a read of the spec and a conversation with one of the editors, I don't understand yet under what circumstances these IDs can be enumerated, and how large the attack surface for user re-identification really is.
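
To make the concern concrete, here is a minimal sketch of the behavior I am assuming, not something taken from the spec: a user agent that hands each origin its own random, persistent ID per device. The class name SourceIdStore, its methods, the example origins, and the storage model are all hypothetical.

```javascript
// Hypothetical model of a user agent that issues per-origin source IDs.
// Nothing here is taken from the getUserMedia spec; it only illustrates
// why a long-lived, origin-scoped ID can act as a user identifier.
class SourceIdStore {
  constructor() {
    this.ids = new Map(); // key: [origin, device] as JSON, value: issued ID
  }

  // Return the ID this origin sees for a device, creating it on first use.
  sourceIdFor(origin, device) {
    const key = JSON.stringify([origin, device]);
    if (!this.ids.has(key)) {
      this.ids.set(key, SourceIdStore.randomId());
    }
    return this.ids.get(key);
  }

  // Forget every ID issued to one origin (e.g. when the user clears site data).
  clearOrigin(origin) {
    for (const key of [...this.ids.keys()]) {
      if (JSON.parse(key)[0] === origin) this.ids.delete(key);
    }
  }

  // 32 hex characters. Math.random is NOT cryptographically strong; a real
  // user agent would use a CSPRNG. It is enough for this illustration.
  static randomId() {
    let id = "";
    for (let i = 0; i < 32; i++) {
      id += Math.floor(Math.random() * 16).toString(16);
    }
    return id;
  }
}

const ua = new SourceIdStore();
const firstVisit = ua.sourceIdFor("https://a.example", "front-camera");
const laterVisit = ua.sourceIdFor("https://a.example", "front-camera");
const otherSite = ua.sourceIdFor("https://b.example", "front-camera");

// Same origin, same device: the ID is stable across visits.
console.log(firstVisit === laterVisit);
// Different origin: an independent random ID, so the two sites cannot
// join their records on it (barring an astronomically unlikely collision).
console.log(firstVisit === otherSite);
```

Under this model, the re-identification surface within one origin depends on two things the sketch leaves open: whether the application can read the ID before the user grants any permission, and whether something like clearOrigin is tied to a control the user actually exercises.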

Dan, Cullen, Harald -- care to expand on that?

Thanks!



On 2013-06-20, at 15:09 -0700, Frank.Dawson@nokia.com wrote:

> Hei PINGers.
> Repost of my privacy review comments from a few weeks ago.
> Frank/
> _____________________________________________
> From: Dawson Frank (Nokia-CIC/Dallas) 
> Sent: 30 May, 2013 16:56
> To: ext Christine Runnegar (runnegar@isoc.org); ext Nicholas Doty; <tjwhalen@gmail.com>
> Cc: Hirsch Frederick (Nokia-CIC/Boston); Dawson Frank (Nokia-CIC/Dallas)
> Subject: getUserMedia Privacy Review
>  
>  
> Hei Christine, Nick and Tara.
> These comments may be late, but I am living in a backlogged state these days. I hope you already received feedback from Frederick Hirsch.
> I am not a WebRTC expert, by any means, but was interested in the navigator.getUserMedia API. I tried searching the internet for articles or blogs that discuss getUserMedia from an architecture point of view. I wanted to be able to draw a DFD (data flow diagram) of the system architecture between UA, Proxy Server and Origin server. I could not find such information. Please point it out to me if it exists.
> Since I did not get to know the complete system architecture (IE, how the streaming media works), I thought to use an alternative method of privacy assessment. Instead of first capturing the data flow and then using a data catalog to complete a threat analysis, I thought I would try a Privacy Data Life Cycle (PDLC) based assessment. This could be scoped by a few user stories. Then, by looking at the stages of the PDLC (IE, Collect, Process, Store, Transfer, Maintain = My rendering of these), one can think about various principles at each stage and what safeguards should be considered.
> Reference: I hope I was reading the correct specification (http://www.w3.org/TR/2012/WD-mediacapture-streams-20120628/)
> User stories (to set context for primary purposes of web service and privacy expectation of the user):
> As a web service, I want to capture and transfer an image from the browser,
> As a web service, I want to capture and stream an audio clip from the browser,
> As a web service, I want to capture and stream a camcorder/video clip from the browser,
> As a web service, I want to secure the upload, sharing, or streaming of media from the browser,
> As a user, I want to capture and share an image from my browser,
> As a user, I want to share an audio stream from my browser,
> As a user, I want to share a video stream from my browser,
> As a user, I want to control the default of whether my microphone and camera are muted or recording when accessing a web service,
> As a user, I want to be able to mute the sharing of an audio or video stream at any time,
> As a user, I want to be able to strip any media of personal information (EG, contact information, location information, sensor characteristics, etc.) prior to sharing or streaming,
> As a user, I want to be able to secure the transfer of shared or streamed media from my browser to the web app,
> As a user, I want to know when my media sensors are recording (EG, visual and/or shutter cue for camera, visual cue for microphone).
> (NOTE: I realize that there is a separate specification on use cases for multimedia streaming, but this was my simplified thinking)
> Collection (Assumed to be the capture/record method of API)
> Section 2.2.1 states that when a LocalMediaStream object is created it must be associated with a 36-character GUID label. Why is this required to be a GUID rather than a locally unique identifier (LUID)? Why does it need to be an identifier and not a name string?
> Section 3.1.1 states that user agents are encouraged to use the default or system camera or microphone. Given the current state of tablet and smart phone technology, it is ambiguous which camera and microphone this refers to on a device with multiple sensors. The text should be changed to state that the back-facing camera and lowest-quality microphone should be used when there are multiple sensors in a sensor class, as the intention is to be less intrusive. If the web application is a video conferencing application, it should instruct the browser to use the appropriate camera and microphone before invoking the capture/recording method.
> Section 3.1.1 states that once the user grants permission for capture/recording, user agents “are encouraged to” present some cue that the sensor is recording, but the text implies a cue for a camcorder (IE, “On-Air”), which evokes the image of a red recording light outside a studio. The conformance text should be changed to MUST rather than “are encouraged to”.
> For the capture/recording cue, there is no guidance on accessibility. Should there not be a recommendation to consider multi-factor cues for capture/record to better address different accessibility cases?
> It is unclear from the text what the default state of the camera and microphone is. It makes sense to me to have explicit conformance text stating that, on invocation of the API method to capture/record, the default state is MUTED until user permission is granted.
> There does not appear to be any conformance requirement or guidance telling web application developers and user agent vendors that a MUTE BUTTON paradigm needs to be supported to allow the user to MUTE the microphone and camera at any time during the capture/record state. This could be provided by the web app and the UA.
> There does not appear to be any conformance requirement or guidance telling web application developers and user agent vendors that a KILL BUTTON paradigm needs to be supported to allow the user to stop the capture/record state immediately, at any time. This could be provided by the web app and the UA. If such a KILL BUTTON on capture/record or share/stream is invoked by the user, then the user agent and web app MUST purge any residual media stream already received.
> There does not appear to be any consideration for filtering out, or turning off the tagging of, sensor-specific characteristics in the sensor media stream (EG, contact info, geo-position info, sensor manufacturer, etc.). There could be parameters on getUserMedia to request that these not be included in the media file or stream put into the LocalMediaStream object.
> It is unclear whether “permission” for capture/recording is required each time the function is invoked or each time the user visits the domain containing the web app with getUserMedia, or even whether granting permission to capture on one site means other sites get piggy-backed permission. Nor is it clear how long such permission persists. Consent/permission for capture/recording should follow accepted industry best practices for online consent.
> At the end of a capture/recording state, there should be guidance to user agent vendors that the sensor should be MUTED or turned off.
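
On the MUTE BUTTON and KILL BUTTON items above, a web app could offer both with very little code. The sketch below assumes the track-based API of later Media Capture and Streams drafts (MediaStream.getTracks(), MediaStreamTrack.enabled, MediaStreamTrack.stop()), which may differ from the 2012 draft reviewed here. The function names setMuted and stopCapture are mine, for illustration only.

```javascript
// Sketch only: assumes a stream object exposing getTracks(), and tracks
// exposing an `enabled` flag and a stop() method, as in later drafts.

// MUTE: keep the tracks alive but have them render silence / black frames.
function setMuted(stream, muted) {
  for (const track of stream.getTracks()) {
    track.enabled = !muted;
  }
}

// KILL: end every track so the underlying sensors can be released.
function stopCapture(stream) {
  for (const track of stream.getTracks()) {
    track.stop();
  }
}
```

Neither function purges media the web app or server has already received, and neither replaces a control in the UA itself, so both of those points from the review still stand.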
> Process (Not sure this PDLC stage applies)
> Storage (Assumed to apply to assigning an object URL to a LocalMediaStream object, EG: video.src = window.URL.createObjectURL(stream))
> It is unclear, because of my low level of technical understanding of the system architecture, how a user would know where any LocalMediaStream objects get stored as they come from the sensor and get shared/streamed. For file-based sharing/streaming, this would be known to the user, as they had previously captured/recorded the media stream into a data store.
> Thinking of a mobile device, is there a possibility that local “residue” from a getUserMedia invocation would persist within a local storage object that the user agent has access to? I am thinking of transparency to allow a user to wipe these local storage objects in cases such as retiring a device or transferring a device to another owner.
> Transfer (Assumed to apply to the function invocation on the stream, EG function(stream))
> Since my system architecture understanding for how this API functions is low, it is unclear to me what provisions are to be made for securing the transfer of the audio or video stream from the UA to the origin server containing the web app that is receiving the media transfer/stream. A “Security and Privacy Considerations” section should include guidance to UA and web app on how to provide for secure transfer in use cases when that is required by a web app or user.
> There is little to no guidance on controls that would be provided in the UA and web app to permit the user to MUTE or immediately stop capture/recording and subsequent sharing, as stated above. This would apply both to capture/recording and to sharing/streaming.
> Maintenance (Assume there is no practicality for making provision for how a user would make formal requests of the data controller)
> With this capability, it seems a user will have little to no right of data access, rectification, or erasure, nor a right to object to data processing.
> Additional findings
> Why is there no “Security and Privacy Considerations” section in this specification where guidance would be given to both user agent vendors implementing this standard and web app developers using the API?
> As it relates to mobile devices, there are now multiple cameras and microphones on smart phones. Yes, multiple microphones, too! It might be my poor comprehension of the specification, but I was not sure how a web application can select a particular sensor in the audio or video class. Maybe the parameters are missing? Maybe the capability is beyond this specification. In that case, which one is used? As a user, I may not want to have a web service select my front camera but only the back camera. Or if I am more of a voyeur, maybe I do want to select the front-facing camera :-) Likewise on the microphone, how would I select a stereo microphone or, in the future, the “7.1” compatible set of microphones?
> On a similar level, how can I adjust audio or camera to obfuscate or otherwise mask my identity? There should be some consideration for selecting quality levels of media on capture/recording.
> Frank/
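
On the question of selecting a particular camera and a quality level: later drafts of the spec express this through constraints passed to getUserMedia. The sketch below builds such a constraints object. It assumes the facingMode values "user" and "environment" and the { exact: ... } and { max: ... } constraint forms from those later drafts, none of which are in the 2012 draft cited above. The helper name cameraConstraints and its "front"/"back" labels are hypothetical.

```javascript
// Hypothetical helper: build a getUserMedia constraints object that asks for
// one specific camera and, optionally, caps the capture width.
const FACING_MODES = new Map([
  ["front", "user"],        // camera facing the user
  ["back", "environment"],  // camera facing away from the user
]);

function cameraConstraints(facing, maxWidth) {
  if (!FACING_MODES.has(facing)) {
    throw new RangeError(`unknown camera: ${facing}`);
  }
  // `exact` makes the request fail rather than silently fall back
  // to a different camera.
  const video = { facingMode: { exact: FACING_MODES.get(facing) } };
  if (maxWidth !== undefined) {
    video.width = { max: maxWidth }; // lower resolution, less detail captured
  }
  return { audio: false, video };
}

// In a browser that supports these constraints, usage might look like:
//   navigator.mediaDevices.getUserMedia(cameraConstraints("back", 320))
```

This only covers what the web app requests. It does not answer the review's point that the user, not the web service, should be able to make or override the choice.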

Received on Sunday, 3 November 2013 18:55:31 UTC