Re: Comments/Questions on Media Capture Streams – Privacy and Security Considerations from Harald Alvestrand on 2015-07-14 (public-privacy@w3.org from July to September 2015)

From: Harald Alvestrand <harald@alvestrand.no>
Date: Tue, 14 Jul 2015 14:13:01 +0200
To: Nick Doty <npdoty@w3.org>, public-media-capture@w3.org
CC: "public-privacy (W3C mailing list)" <public-privacy@w3.org>
Message-ID: <55A4FCCD.1040105@alvestrand.no>

Thank you for your comments!

This is obviously material that needs input from the group on how we
handle it, but here are some questions that I, as process manager, have
on these comments:

- The specification as it stands represents the results of long debates.
Parts of these debates are documented in the IETF security documents
for RTCWEB. Can we assume that these documents have been read and
understood for further commenting?

- We have understood the style of specification in the W3C to be that
user interface issues (such as what indicators to display, and how
permission is requested) are strictly outside of the remit of the
specification. We can require that permission be granted, and that an
indicator be shown, but its exact form is an implementation matter. Is
that a common understanding we can assume here too?

- The fingerprinting guidance document has the status (according to
itself) of "unofficial draft", and does not link to any working group
or mailing list. What can we expect about a declaration of consensus
on this specification in the future? Is it on someone's roadmap to
declare consensus on it?

Thanks in advance for enlightenment on these topics!

  Harald, chair hat on


Den 14. juli 2015 04:28, skrev Nick Doty:
> Hi Media Capture Task Force,
> 
> The Privacy Interest Group has been discussing the Media Capture 
> and Streams Last Call from a privacy perspective. Some discussion 
> has already taken place on public-privacy and other lists, but 
> we've tried to consolidate feedback here. Please include the 
> public-privacy list in followups.
> 
> Hope this helps, Nick Doty, for W3C Privacy Interest Group (PING)
> 
> 
> ## Comments/Questions on Media Capture Streams – Privacy and 
> Security Considerations
> 
> Our input is intended to help the Media Capture Task Force produce 
> a more privacy-protecting API.
> 
> This feedback is based on discussions within the Privacy Interest 
> Group and various email threads, consolidated as much as possible 
> on similar topics. Also, some comments were provided in October 
> 2014 on Media Stream recording, which may also be relevant: 
> https://lists.w3.org/Archives/Public/public-privacy/2014OctDec/0004.html
> 
> The Media Capture Task Force might be interested in documents
> (in-progress, and themselves needing review and feedback) to aid in
> identifying and mitigating privacy issues in Web specifications,
> which have been used as part of this review, including:
> * https://w3ctag.github.io/security-questionnaire/
> * https://w3c.github.io/fingerprinting-guidance/
> 
> Comments below have been combined into categories.
> 
> ### Consent/Permissions
> 
> Do permissions carry forward across sessions? Could there be a 
> built-in sunset period for permissions? European participants in 
> particular have raised this concern as it may be related to legal 
> compliance in the EU.
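
A sunset period of the kind asked about here could be as small as a
timestamp check in the user agent. The following is only a sketch under
assumptions: the 30-day value and the function name are hypothetical and
come from neither the specification nor any browser.

```python
from datetime import datetime, timedelta, timezone

# Assumed value, for illustration only; no sunset period is specified anywhere.
SUNSET = timedelta(days=30)


def permission_is_current(granted_at: datetime, now: datetime,
                          sunset: timedelta = SUNSET) -> bool:
    """Return True while a persisted grant is younger than the sunset period."""
    return now - granted_at < sunset


# A grant persisted on 1 July 2015 (UTC).
granted = datetime(2015, 7, 1, tzinfo=timezone.utc)
```

Once the check fails, the user agent would simply prompt again as if no
permission had been persisted.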
> 
> It would be nice if there were a simple, user-friendly way to revoke
> consent for a stream (especially audio/webcam streams). As it
> currently stands, once consent is granted there doesn't seem to be
> a simple way to revoke it.
> 
>> "when the page is secure"
> 
> "secure" is a word that often gets defined in different ways.
> Would it be more precise to refer to "privileged contexts"? 
> http://www.w3.org/TR/powerful-features/#settings-privileged
> 
> Not persisting permissions in such settings is a good base-line 
> requirement. Section 10.6 states that persistent permissions must 
> be served over HTTPS and have no mixed content. It would be
> nice to see the definition of mixed content expanded to include
> the various issues mentioned in Bonneau's recent paper[1]. For
> example, if a site elects to use pinning, it should be considered
> to have mixed content if it loads non-pinned content.
> 
> [1] http://www.jbonneau.com/doc/KB15-NDSS-hsts_pinning_survey.pdf
> 
> [Note: This last point is perhaps also relevant to 
> http://www.w3.org/TR/mixed-content/]
> 
> You've heard from the TAG already about whether use of the API ever
> makes sense in unprivileged contexts. That is, when the user is
> asked for permission to access their camera, do they understand 
> that they're granting this permission to all network attackers as 
> well as the site they think they're talking to? I suspect this
> PING email thread is not going to change your minds about that
> already discussed topic. However, it would be worthwhile to note
> this security threat in the security considerations section and to
> note for user agent implementers the difficulty for this
> permission prompt.
> 
> Best Practice 2 is in a section entitled "Implementation 
> Suggestions", but contains a normative MUST statement. If this is 
> an interoperability requirement and MUST is defined as in 2119, 
> then I think "suggestions" (and indeed, "best practice") is 
> probably incorrect terminology.
> 
> Permissions for getUserMedia seem to be specific to entry script
> origin. Is this what users will expect? For example, if I grant
> and persist permission to callmyfriends.com to use their service
> and later I browse to example.com, which has an embedded iframe of
> callmyfriends.com, will users be shocked to see their camera turn
> on and a picture of themselves on the screen? Permission breadth
> may be a flexible option for the user agent ("Optionally, e.g.,
> based on a previously-established user preference, for security
> reasons"), but it might be useful for the spec to establish some
> expectations here. Top-level origin/embedded origin pairs, for
> example, might be a useful model, as in some implementations of
> Geolocation.
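
To make the pair model concrete, here is a minimal sketch of a permission
store keyed on (top-level origin, embedded origin). The class and method
names are hypothetical and do not describe how any user agent stores
grants.

```python
from dataclasses import dataclass, field


@dataclass
class PermissionStore:
    """Hypothetical store keyed on (top-level origin, embedded origin) pairs."""
    _grants: set = field(default_factory=set)

    def grant(self, top_level: str, embedded: str) -> None:
        # Persist the grant for this exact pair only.
        self._grants.add((top_level, embedded))

    def is_granted(self, top_level: str, embedded: str) -> bool:
        return (top_level, embedded) in self._grants


store = PermissionStore()
# The user visits callmyfriends.com directly and persists camera permission.
store.grant("https://callmyfriends.com", "https://callmyfriends.com")
```

With this keying, the iframe of callmyfriends.com embedded in example.com
is a different pair, so the persisted grant does not apply and the user
would be prompted again.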
> 
> ### Device enumeration
> 
> Why is there no requirement for user permission before a platform 
> detects how many devices of each class are connected/available? 
> Does the specification provide a mechanism to allow a user agent
> to deny access if an application is not in use?
> 
> Can we specify the order in which devices should be listed? If
> this will vary, it will make it that much easier to fingerprint the
> user agent, based not only on what kinds of devices they have
> attached, but what order the software happened to list those
> devices. (For example, see our experience with font listing.)
> 
> Does this need to be enumerable? Fingerprintability of plugins, for
> example, can be dramatically reduced by changing it to a query 
> model. Does the user have a camera attached? Does the user have a 
> microphone attached? If so, the site can then ask the user for 
> permission and when they do so, they can get deviceIds, kinds and 
> grouping of devices, labels, etc. Related: what purpose does the 
> deviceId serve prior to granting of permission (as opposed to just 
> knowing the kinds/capabilities)? Does the site need to know that I 
> have a microphone and a grouped microphone/webcam and a separate 
> webcam *before* asking me for permission to access my camera? If 
> the enumeration and identifiers are only present after asking for 
> permission, then no additional permission prompt is needed and 
> leakage of information can be reduced.
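
The query model described above could look roughly like the following.
This is a sketch under assumptions, not the API in the specification:
`MediaDevicesSketch`, `has_kind` and `grant_permission` are hypothetical
names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    device_id: str
    kind: str   # e.g. "audioinput" or "videoinput"
    label: str


class MediaDevicesSketch:
    """Hypothetical query-model facade over the attached devices."""

    def __init__(self, devices):
        self._devices = list(devices)
        self._permitted = False

    def has_kind(self, kind: str) -> bool:
        # Before permission: reveals one bit per device class, nothing more.
        return any(d.kind == kind for d in self._devices)

    def grant_permission(self) -> None:
        # Stands in for the user accepting the permission prompt.
        self._permitted = True

    def enumerate(self):
        # Identifiers, labels and grouping are exposed only after permission.
        if not self._permitted:
            raise PermissionError("enumeration requires user permission")
        return list(self._devices)


media = MediaDevicesSketch([
    Device("a1", "audioinput", "Built-in microphone"),
    Device("v1", "videoinput", "USB webcam"),
])
```

A site would call `has_kind("videoinput")` to decide whether to offer
video at all, and would only see the device list after the prompt has
been accepted.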
> 
> Imagine that you were writing a browser that wanted to reduce 
> fingerprinting and was willing to limit functionality but didn't 
> want to drop functionality altogether. Is there any compliant way 
> for that browser to indicate prior to the permission prompt, "yes, 
> video/audio are supported" without enumerating the configuration
> of devices? It seems like user agents are given some flexibility
> on how they select constraints for the constrainability pattern,
> can we provide similar flexibility as to how they indicate
> capabilities rather than device enumeration? As it is now, it
> appears that user agents that want to block access to a list of
> attached webcam devices have to completely block use of WebRTC,
> even when there's a permission grant; an unfortunate and
> unnecessary loss of functionality for those who are concerned about
> this source of fingerprintability.
> 
> Finally, can we mark the fingerprintability of the device 
> enumeration section? 
> http://w3c.github.io/fingerprinting-guidance/#mark-fingerprinting
> 
> ### Identifiers
> 
>> All enumerable devices have an identifier that MUST be unique to 
>> the application and persistent across browsing sessions.
> 
> Is there a reason why the specification does not go the extra step 
> of recommending that platforms not use persistent identifiers?
> What are the use cases for the use of persistent identifiers?
> Could identifiers change between sessions rather than simply
> treating identifiers as other persistent storages (e.g. cookies)?
> 
> To say that such an identifier MUST persist across browsing 
> sessions is a guarantee that the requirement won't be satisfied. 
> Many users, for example, configure their browsers to delete all 
> cookies on closing the browser. How about:
>> "Identifiers MAY be persisted across browsing sessions. 
>> Persistent identifiers let the application save, identify the 
>> availability of, and directly request specific sources."
> 
> Any site that assumes that identifiers will persist will set 
> themselves up for failure (for example, when the user clears 
> cookies); the spec should not encourage that false assurance.
> 
> What protections are in place to ensure against leakage of 
> identifiers? "unique to the application" does not seem to be fully 
> or clearly defined. Does that mean "unique among all deviceIds 
> available to a particular origin"? Or does it mean "different from 
> the deviceId presented for the same device to all other origins"? 
> On the mailing list, a proposal has been mentioned to double-key 
> these identifiers (on the origin of the top-level document and the 
> embedded iframe, presumably) but that has not yet been detailed.
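
Since the double-keying proposal has not been detailed, the following is
only one possible reading of it: derive the exposed deviceId with an HMAC
over the origin pair and an internal hardware identifier. The per-profile
secret, the function name and the choice of SHA-256 are all assumptions.

```python
import hashlib
import hmac


def scoped_device_id(secret: bytes, hardware_id: str,
                     top_level_origin: str, embedded_origin: str) -> str:
    """Derive a deviceId that differs per (top-level, embedded) origin pair."""
    # NUL cannot occur in an origin, so the joined message is unambiguous.
    message = "\x00".join([top_level_origin, embedded_origin, hardware_id])
    return hmac.new(secret, message.encode("utf-8"), hashlib.sha256).hexdigest()


# Assumed: a random secret stored in the browser profile and discarded
# whenever the user clears cookies, which also resets every deviceId.
profile_secret = b"example-secret-not-for-real-use"
direct = scoped_device_id(profile_secret, "cam-0",
                          "https://callmyfriends.com", "https://callmyfriends.com")
embedded = scoped_device_id(profile_secret, "cam-0",
                            "https://example.com", "https://callmyfriends.com")
```

Under this reading the same camera yields a stable identifier within one
origin pair and unrelated identifiers across pairs.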
> 
> Per our teleconference conversation about this some months ago, 
> it's not entirely clear why a GUID is necessary rather than, say, 
> ["1","2","3"]. In any case, specifying exactly what the scope of
> an identifier is and how it differs in different contexts is 
> important.
> 
> There are some concerns about access to a persistent deviceId 
> identifier, prior to user-granted permission for accessing camera 
> or microphone, because of the duplication of cookie-like 
> functionality. See: 
> http://www.w3.org/mid/010d01d0b195$4ef90350$eceb09f0$@baycloud.com
> 
> ### Indicators
> 
> Would it be possible to include mechanisms in the specification to
> display indicators beyond merely that the device is in use? (e.g. 
> that a permission is persistent) Or, how is it expected that this 
> will be handled?
> 
> Are indicators provided when access to the camera/microphone is
> granted in insecure or unprivileged contexts? Can we
> give guidance regarding warning users that access to their camera
> may be provided to all network attackers?
> 
> ### Events
> 
>> When a new media input or output device is made available, the 
>> user agent MUST queue a task that fires a simple event named 
>> devicechange at the MediaDevices object.
> 
> This event appears to be fired even for web pages that have not 
> requested any permissions from the user. Is that intended?
> 
> Particularly if this event will be fired before any permission is 
> granted, it is important that it not be fired simultaneously in all
> browsing contexts. Sites can use simultaneous firing to correlate
> browsing activity in different tabs, different windows (including
> private windows), different browsers, in a way that may be
> unexpected to the user and undermine other protections they're 
> attempting to implement. Some specs have resolved this problem by 
> noting that the event should only be fired for the front-most or 
> active browsing context.
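
The mitigation mentioned here, firing only at the active browsing
context, can be sketched as follows. `BrowsingContext` and
`dispatch_devicechange` are hypothetical stand-ins for user agent
internals, not anything defined by the specification.

```python
from dataclasses import dataclass, field


@dataclass
class BrowsingContext:
    name: str
    active: bool = False
    events: list = field(default_factory=list)


def dispatch_devicechange(contexts) -> None:
    """Deliver devicechange only to the active (front-most) context."""
    for ctx in contexts:
        if ctx.active:
            ctx.events.append("devicechange")


tabs = [
    BrowsingContext("tab-a", active=True),
    BrowsingContext("tab-b"),
    BrowsingContext("private-window"),
]
# A webcam is plugged in: only the front-most tab observes the event.
dispatch_devicechange(tabs)
```

Because the background tab and the private window receive nothing, a
site cannot use the event's timing to link them to the active tab.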
> 
> (This may not be a problem for the other events, which are
> specific to a media stream already accessed by a script. Muting
> could be an event that would be fired for all open audio streams at
> once, which might reveal some information, but that seems like a
> much lesser concern since the user would have already granted
> specific permission to those sites to access media from the user.)
> 
> ### Privacy considerations
> 
> It's a fine model to have the security and privacy considerations 
> section be a summary of normative requirements noted elsewhere, 
> rather than adding them after the fact, as it were.
> 
> However, not all of the comments in this section seem to
> correspond to normative requirements. For example,
> 
>> In the case of a case-by-case authorization, it is important
>> that the user be able to say "no" in a way that prevents the UI
>> from blocking user interaction until permission is given - either
>> by offering a way to say a "persistent NO" or by not using a
>> modal permissions dialog.
> 
> There are no apparent normative requirements regarding the
> modality of the permissions dialog. If this is important and
> intended to be an interoperability requirement, it should be
> specified as such in the getUserMedia method description.
> 
> The "note" section includes a description of a very serious
> attack. Is there anything that can be done about this beyond a note
> to website implementers, who may never read this section of the 
> specification? Is it the case that any site that requests 
> getUserMedia permission that subsequently suffers any sort of XSS 
> vulnerability or URL parameter failure as you note will silently 
> give live access to the user's video/audio to an attacker? As a 
> site developer, am I liable if I use getUserMedia in one part of
> my site, users persist the permission and then somewhere else on
> my site I have a bug that allows for XSS or a URL parameter
> failure?
> 
> One way to help developers avoid this catastrophic scenario would 
> be to allow sites to indicate whether they were confident about 
> opting in to persisted permissions. A casual developer who calls 
> getUserMedia() wouldn't have to worry about that failure due to 
> some bug on their site. A serious developer who is confident about 
> their security situation and whose functionality would benefit 
> greatly from persistence could call 
> getUserMedia(allowPersist=true).
> 
> ### Local IP address
> 
> Some discussions on the mailing list have touched on broader 
> concerns about WebRTC and access to a user's local IP address. We 
> understand that to be an ongoing discussion. The Privacy Interest 
> Group is maintaining a wiki page on that topic in order to respond 
> to that separate request: 
> http://www.w3.org/wiki/Privacy/IPAddresses
Received on Tuesday, 14 July 2015 12:13:36 UTC