Re: Proposed comments and qs on media capture streams from Nick Doty on 2015-07-07 (public-privacy@w3.org from July to September 2015)

From: Nick Doty <npdoty@w3.org>
Date: Mon, 6 Jul 2015 21:04:19 -0700
To: "public-privacy (W3C mailing list)" <public-privacy@w3.org>
Cc: Christine Runnegar <runnegar@isoc.org>, Joseph Lorenzo Hall <joe@cdt.org>, gnorcie@cdt.org, Mike O'Neill <michael.oneill@baycloud.com>, Katie Haritos-Shea GMAIL <ryladog@gmail.com>
Message-Id: <3387F1D6-0814-47CE-ADBB-7F4734E33B6E@w3.org>

Building on the work that Christine did in summarizing the teleconference discussions and trying to consolidate that with my comments, Greg and Joe's comments, Mike's comments, Katie's related comments and subsequent email discussion, below is a proposed set of comments to send to the Media Capture Task Force.

Any mistakes and omissions are surely my own; let us know if anything is missing or mis-stated before we send these comments along to the appropriate WG mailing lists in the next couple of days.

Thanks,
Nick

## Comments/Questions on Media Capture Streams – Privacy and Security Considerations

Our input is intended to help the Media Capture Task Force produce a more privacy-protecting API.

This feedback is based on discussions within the Privacy Interest Group and various email threads, consolidated as much as possible on similar topics. Also, some comments were provided in October 2014 on Media Stream recording, which may also be relevant:
https://lists.w3.org/Archives/Public/public-privacy/2014OctDec/0004.html

The Media Capture Task Force might be interested in documents (in-progress, and themselves needing review and feedback) to aid in identifying and mitigating privacy issues in Web specifications, which have been used as part of this review, including:
* https://w3ctag.github.io/security-questionnaire/
* https://w3c.github.io/fingerprinting-guidance/

Comments below have been combined into categories.

### Consent/Permissions

Do permissions carry forward across sessions? Could there be a built-in sunset period for permissions? European participants in particular have raised this concern as it may be related to legal compliance in the EU.
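To make the sunset idea concrete, here is a minimal sketch of how a user agent might keep an expiry alongside a persisted grant. The `PermissionGrant` shape, the `isGrantValid` helper and the 90-day period are all hypothetical illustrations; none of them comes from the Media Capture specification.

```typescript
// Hypothetical record a user agent might keep for a persisted grant.
interface PermissionGrant {
  grantedTo: string;   // origin that received the permission
  grantedAtMs: number; // when the user granted it (epoch milliseconds)
  maxAgeMs: number;    // sunset period chosen by the user agent (assumption)
}

// A grant is honoured only while it is younger than its sunset period.
// Timestamps earlier than the grant itself are rejected as well.
function isGrantValid(grant: PermissionGrant, nowMs: number): boolean {
  return nowMs >= grant.grantedAtMs && nowMs - grant.grantedAtMs < grant.maxAgeMs;
}

const DAY_MS = 24 * 60 * 60 * 1000;
const exampleGrant: PermissionGrant = {
  grantedTo: "https://callmyfriends.com",
  grantedAtMs: 0,
  maxAgeMs: 90 * DAY_MS, // illustrative value only
};
```

Once the period has elapsed, the user agent would simply prompt again, as if no permission had been persisted.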

It would be nice if there were a simple, user-friendly way to revoke consent for a stream (especially audio/webcam streams). As it currently stands, once consent is granted there doesn't seem to be a simple way to revoke it.

> "when the page is secure"

"secure" is a word that often gets defined in different ways. Would it be more precise to refer to "privileged contexts"?
http://www.w3.org/TR/powerful-features/#settings-privileged

Not persisting permissions in such settings is a good baseline requirement. Section 10.6 states that persistent permissions must be served over HTTPS and have no mixed content. It would be nice to see the definition of mixed content expanded to include the various issues mentioned in Bonneau's recent paper[1]. For example, if a site elects to use pinning, it should be considered to have mixed content if it loads non-pinned content.

[1] http://www.jbonneau.com/doc/KB15-NDSS-hsts_pinning_survey.pdf

[Note: This last point is perhaps also relevant to http://www.w3.org/TR/mixed-content/]

You've heard from the TAG already about whether use of the API ever makes sense in unprivileged contexts. That is, when the user is asked for permission to access their camera, do they understand that they're granting this permission to all network attackers as well as the site they think they're talking to? I suspect this PING email thread is not going to change your minds about that already-discussed topic. However, it would be worthwhile to note this security threat in the security considerations section and to note for user agent implementers the difficulty it creates for the permission prompt.

Best Practice 2 is in a section entitled "Implementation Suggestions", but contains a normative MUST statement. If this is an interoperability requirement and MUST is defined as in RFC 2119, then I think "suggestions" (and indeed, "best practice") is probably incorrect terminology.

Permissions for getUserMedia seem to be specific to entry script origin. Is this what users will expect? For example, if I grant and persist permission to callmyfriends.com to use their service and later I browse to example.com which has an embedded iframe of callmyfriends.com, will users be shocked to see their camera turn on and a picture of themselves on the screen? Permission breadth may be a flexible option for the user agent ("Optionally, e.g., based on a previously-established user preference, for security reasons"), but it might be useful for the spec to establish some expectations here. Top-level origin/embedded origin pairs, for example, might be a useful model, as in some implementations of Geolocation.
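As a rough illustration of the top-level/embedded origin pair model, the sketch below keys persisted grants on both origins. The `PairedPermissionStore` class is hypothetical and is not drawn from any existing implementation or from the specification.

```typescript
// Hypothetical permission store keyed on (top-level origin, embedded origin).
class PairedPermissionStore {
  private granted = new Set<string>();

  // JSON-encode the pair so the two origins cannot run together ambiguously.
  private static key(topLevelOrigin: string, embeddedOrigin: string): string {
    return JSON.stringify([topLevelOrigin, embeddedOrigin]);
  }

  grant(topLevelOrigin: string, embeddedOrigin: string): void {
    this.granted.add(PairedPermissionStore.key(topLevelOrigin, embeddedOrigin));
  }

  isGranted(topLevelOrigin: string, embeddedOrigin: string): boolean {
    return this.granted.has(PairedPermissionStore.key(topLevelOrigin, embeddedOrigin));
  }
}

const pairedStore = new PairedPermissionStore();
// The user visits callmyfriends.com directly and persists permission there.
pairedStore.grant("https://callmyfriends.com", "https://callmyfriends.com");
```

Under this model the persisted grant does not carry over when callmyfriends.com is embedded in example.com; the user would be prompted again for that pair.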

### Device enumeration

Why is there no requirement for user permission before a platform detects how many devices of each class are connected/available? Does the specification provide a mechanism to allow a user agent to deny access if an application is not in use?

Can we specify the order in which devices should be listed? If this will vary, it will make it that much easier to fingerprint the user agent, based not only on what kinds of devices they have attached, but what order the software happened to list those devices. (For example, see our experience with font listing.)
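One possible way to remove ordering as a signal is sketched below: sort the list by kind and then by the opaque identifier, so that the platform's native enumeration order never reaches the page. The `ListedDevice` shape and `canonicalOrder` function are assumptions for illustration, and the choice of sort key is ours rather than anything the spec proposes.

```typescript
// Hypothetical minimal view of a listed device.
interface ListedDevice {
  kind: "audioinput" | "audiooutput" | "videoinput";
  deviceId: string;
}

// Plain code-unit comparison; locale-aware comparison is avoided on purpose,
// because the locale would itself become a source of variation.
function compareStrings(a: string, b: string): number {
  return a < b ? -1 : a > b ? 1 : 0;
}

// Return a copy sorted by kind, then by deviceId, regardless of input order.
function canonicalOrder(devices: ListedDevice[]): ListedDevice[] {
  return devices
    .slice()
    .sort((a, b) => compareStrings(a.kind, b.kind) || compareStrings(a.deviceId, b.deviceId));
}
```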

Does this need to be enumerable? Fingerprintability of plugins, for example, can be dramatically reduced by changing it to a query model. Does the user have a camera attached? Does the user have a microphone attached? If so, the site can then ask the user for permission and when they do so, they can get deviceIds, kinds and grouping of devices, labels, etc. Related: what purpose does the deviceId serve prior to granting of permission (as opposed to just knowing the kinds/capabilities)? Does the site need to know that I have a microphone and a grouped microphone/webcam and a separate webcam *before* asking me for permission to access my camera? If the enumeration and identifiers are only present after asking for permission, then no additional permission prompt is needed and leakage of information can be reduced.
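The query model described above could look roughly like the following sketch. `QueryModelDevices`, `hasKind` and `grantPermission` are hypothetical names; this is not the API in the current draft, only an illustration of answering yes/no questions before permission and enumerating afterwards.

```typescript
type QueryableKind = "audioinput" | "videoinput";

interface AttachedDevice {
  kind: QueryableKind;
  deviceId: string;
  label: string;
}

// Hypothetical user-agent-side view of the attached devices.
class QueryModelDevices {
  private devices: AttachedDevice[];
  private permissionGranted = false;

  constructor(devices: AttachedDevice[]) {
    this.devices = devices;
  }

  // Before permission: one bit per kind, nothing about counts, ids or labels.
  hasKind(kind: QueryableKind): boolean {
    return this.devices.some((d) => d.kind === kind);
  }

  // Stands in for the user accepting the permission prompt.
  grantPermission(): void {
    this.permissionGranted = true;
  }

  // Full details are returned only after the user has granted permission.
  enumerate(): AttachedDevice[] {
    return this.permissionGranted ? this.devices.slice() : [];
  }
}
```

A site would call `hasKind("videoinput")` to decide whether to offer video at all, and would only see identifiers, labels and groupings once the prompt has been accepted.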

Imagine that you were writing a browser that wanted to reduce fingerprinting and was willing to limit functionality but didn't want to drop functionality altogether. Is there any compliant way for that browser to indicate prior to the permission prompt, "yes, video/audio are supported" without enumerating the configuration of devices? It seems like user agents are given some flexibility on how they select constraints for the constrainability pattern; can we provide similar flexibility as to how they indicate capabilities rather than device enumeration? As it is now, it appears that user agents that want to block access to a list of attached webcam devices have to completely block use of WebRTC, even when there's a permission grant: an unfortunate and unnecessary loss of functionality for those who are concerned about this source of fingerprintability.

Finally, can we mark the fingerprintability of the device enumeration section?
http://w3c.github.io/fingerprinting-guidance/#mark-fingerprinting

### Identifiers

> All enumerable devices have an identifier that MUST be unique to the application and persistent across browsing sessions.

Is there a reason why the specification does not go the extra step of recommending that platforms not use persistent identifiers? What are the use cases for the use of persistent identifiers? Could identifiers change between sessions rather than simply being treated like other persistent storage (e.g. cookies)?

To say that such an identifier MUST persist across browsing sessions is a guarantee that the requirement won't be satisfied. Many users, for example, configure their browsers to delete all cookies on closing the browser. How about:
> "Identifiers MAY be persisted across browsing sessions. Persistent identifiers let the application save, identify the availability of, and directly request specific sources."

Any site that assumes that identifiers will persist is setting itself up for failure (for example, when the user clears cookies); the spec should not encourage that false assurance.
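From the site's side, treating persistence as optional might look like the sketch below: try the saved identifier, and fall back gracefully when it is no longer offered. `KnownDevice` and `chooseDevice` are hypothetical names used only for illustration.

```typescript
// Hypothetical minimal view of a device as seen by the application.
interface KnownDevice {
  deviceId: string;
  kind: string;
}

// Prefer the device the application saved earlier; if that identifier is no
// longer offered (for example, the user cleared site data), fall back to the
// first available device of the requested kind, or undefined if there is none.
function chooseDevice(
  savedId: string | null,
  kind: string,
  available: KnownDevice[]
): KnownDevice | undefined {
  const candidates = available.filter((d) => d.kind === kind);
  const saved = candidates.filter((d) => d.deviceId === savedId)[0];
  return saved !== undefined ? saved : candidates[0];
}
```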

What protections are in place to ensure against leakage of identifiers? "unique to the application" does not seem to be fully or clearly defined. Does that mean "unique among all deviceIds available to a particular origin"? Or does it mean "different from the deviceId presented for the same device to all other origins"? On the mailing list, a proposal has been mentioned to double-key these identifiers (on the origin of the top-level document and the embedded iframe, presumably) but that has not yet been detailed.
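Since the double-keying proposal has not been detailed, the following is only a guess at its shape: the user agent hands out a separate opaque identifier for each (top-level origin, embedded origin, device) triple and forgets them all when site data is cleared. The `DoubleKeyedDeviceIds` class is hypothetical, and `Math.random` is a stand-in; a real implementation would presumably need a cryptographically strong source.

```typescript
// Hypothetical table of opaque device identifiers, double-keyed by origin.
class DoubleKeyedDeviceIds {
  private ids = new Map<string, string>();

  // Same triple gives the same id; any other triple gets an unrelated id.
  idFor(topLevelOrigin: string, embeddedOrigin: string, internalDevice: string): string {
    const key = JSON.stringify([topLevelOrigin, embeddedOrigin, internalDevice]);
    let id = this.ids.get(key);
    if (id === undefined) {
      id = DoubleKeyedDeviceIds.newOpaqueId();
      this.ids.set(key, id);
    }
    return id;
  }

  // Clearing site data (as with cookies) forgets every identifier.
  clear(): void {
    this.ids.clear();
  }

  // 32 random hex digits. Math.random is NOT cryptographically strong and is
  // used here only to keep the sketch self-contained.
  private static newOpaqueId(): string {
    let id = "";
    while (id.length < 32) {
      id += Math.floor(Math.random() * 16).toString(16);
    }
    return id;
  }
}
```

With this shape, "unique to the application" would mean the second reading above: the identifier presented for a device differs from the one presented for the same device to every other origin pair.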

Per our teleconference conversation about this some months ago, it's not entirely clear why a GUID is necessary rather than, say, ["1","2","3"]. In any case, specifying exactly what the scope of an identifier is and how it differs in different contexts is important.

There are some concerns about access to a persistent deviceId identifier, prior to user-granted permission for accessing camera or microphone, because of the duplication of cookie-like functionality. See: http://www.w3.org/mid/010d01d0b195$4ef90350$eceb09f0$@baycloud.com

### Indicators

Would it be possible to include mechanisms in the specification to display indicators beyond merely that the device is in use? (e.g. that a permission is persistent) Or, how is it expected that this will be handled?

Are indicators provided when access to the camera/microphone is granted in insecure or unprivileged contexts? Can we give guidance regarding warning users that access to their camera may be provided to all network attackers?

### Events

> When a new media input or output device is made available, the user agent MUST queue a task that fires a simple event named devicechange at the MediaDevices object.

This event appears to be fired even for web pages that have not requested any permissions from the user. Is that intended?

Particularly if this event will be fired before any permission is granted, it is important that it not be fired simultaneously in all browsing contexts. Sites can use simultaneous firing to correlate browsing activity in different tabs, different windows (including private windows), different browsers, in a way that may be unexpected to the user and undermine other protections they're attempting to implement. Some specs have resolved this problem by noting that the event should only be fired for the front-most or active browsing context.
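A minimal sketch of the "active context only" approach follows. `BrowsingContextLike` and `fireDeviceChange` are hypothetical and model the user agent's dispatch decision rather than any real DOM interface.

```typescript
// Hypothetical stand-in for a browsing context (tab, window, iframe).
interface BrowsingContextLike {
  id: string;
  isActive: boolean;  // front-most / focused context
  received: string[]; // events delivered to this context
}

// Deliver devicechange only to the active context, so that background tabs,
// private windows and other contexts cannot correlate on its timing.
function fireDeviceChange(contexts: BrowsingContextLike[]): void {
  for (const ctx of contexts) {
    if (ctx.isActive) {
      ctx.received.push("devicechange");
    }
  }
}
```

A background context would then learn about the changed device list only when it next becomes active or asks for the list, not at the same instant as every other open page.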

(This may not be a problem for the other events, which are specific to a media stream already accessed by a script. Muting could be an event that would be fired for all open audio streams at once, which might reveal some information, but that seems like a much lesser concern since the user would have already granted specific permission to those sites to access media from the user.)

### Privacy considerations

It's a fine model to have the security and privacy considerations section be a summary of normative requirements noted elsewhere, rather than adding them after the fact, as it were.

However, not all of the comments in this section seem to correspond to normative requirements. For example,

> In the case of a case-by-case authorization, it is important that the user be able to say "no" in a way that prevents the UI from blocking user interaction until permission is given - either by offering a way to say a "persistent NO" or by not using a modal permissions dialog.

There are no apparent normative requirements regarding the modality of the permissions dialog. If this is important and intended to be an interoperability requirement, it should be specified as such in the getUserMedia method description.

The "note" section includes a description of a very serious attack. Is there anything that can be done about this beyond a note to website implementers, who may never read this section of the specification? Is it the case that any site that requests getUserMedia permission that subsequently suffers any sort of XSS vulnerability or URL parameter failure as you note will silently give live access to the user's video/audio to an attacker? As a site developer, am I liable if I use getUserMedia in one part of my site, users persist the permission and then somewhere else on my site I have a bug that allows for XSS or a URL parameter failure?

One way to help developers avoid this catastrophic scenario would be to allow sites to indicate whether they were confident about opting in to persisted permissions. A casual developer who calls getUserMedia() wouldn't have to worry about that failure due to some bug on their site. A serious developer who is confident about their security situation and whose functionality would benefit greatly from persistence could call getUserMedia(allowPersist=true).
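The opt-in could be combined with the existing conditions for persistence along the lines of this sketch. The `allowPersist` flag is the hypothetical parameter suggested above, and `PersistDecisionInput` and `shouldPersistGrant` are likewise illustrative names, not part of the specification.

```typescript
// Hypothetical inputs to the user agent's decision to persist a grant.
interface PersistDecisionInput {
  siteAllowPersist: boolean;    // the site passed the hypothetical allowPersist flag
  userChosePersist: boolean;    // the user asked for the decision to be remembered
  privilegedContext: boolean;   // served over HTTPS with no mixed content
}

// Persist only when the site opted in, the user asked for it, and the context
// is privileged. A casual call without the flag never produces a stored grant.
function shouldPersistGrant(input: PersistDecisionInput): boolean {
  return input.siteAllowPersist && input.userChosePersist && input.privilegedContext;
}
```

A site that never sets the flag gets a fresh prompt on every use, so a later XSS bug elsewhere on that site cannot silently reuse a stored permission.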

### Local IP address

Some discussions on the mailing list have touched on broader concerns about WebRTC and access to a user's local IP address. We understand that to be an ongoing discussion. The Privacy Interest Group is maintaining a wiki page on that topic in order to respond to that separate request: http://www.w3.org/wiki/Privacy/IPAddresses

Received on Tuesday, 7 July 2015 04:04:29 UTC