Notes on security for browser-based screen/application sharing from Eric Rescorla on 2013-03-11 (public-webrtc@w3.org from March 2013)

From: Eric Rescorla <ekr@rtfm.com>
Date: Mon, 11 Mar 2013 13:53:41 -0700
To: rtcweb@ietf.org, public-webrtc@w3.org
Message-ID: <CABcZeBPs=znh-BUCRoVkPC1UuQt-xxf-COD+SGE59ASBzRZbJQ@mail.gmail.com>

1. INTRODUCTION
WebRTC [0][1] already contains facilities for JavaScript applications
to acquire access to the user's camera and microphone and either to
directly access the media or to send it elsewhere over a voice/video call.
This obviously presents security issues [2][3] and the consensus
approach is that any access to camera or microphone must only occur
with user consent. Current versions of Chrome obtain this consent once
and persist it indefinitely for a given site. Firefox obtains consent
for every request but will likely eventually add a persistent consent
feature.

One of the major applications of WebRTC-style technology is
videoconferencing and most videoconferencing applications offer either
"screen sharing" or "application sharing" or both. Unfortunately,
while the security properties of camera/microphone access are fairly
obvious to the user--though the properties of persistent permissions
may not be--the security properties of screen/application sharing are
far less obvious. It has been suggested by Adam Barth among others that
permissions should be stricter for screen sharing than for
camera/microphone access. This note provides an overview of the
relevant security issues and of the potential permissions/consent
mechanisms.


2. GENERAL SECURITY PROPERTIES OF SCREEN/APPLICATION SHARING
Technically, screen/application sharing is relatively simple. In
screen sharing, the conference sees whatever is on the user's display;
if there are multiple monitors, typically only one is shared. In
application sharing, the conference gets access to all the windows in
an application.
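
The difference in scope between the two modes can be sketched with a
toy model. The Window record and both functions below are hypothetical
names invented for illustration; they do not correspond to any real
capture API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Window:
    title: str
    app: str      # owning application (hypothetical label)
    monitor: int  # index of the display the window is on


def screen_share(windows, monitor):
    """Screen sharing: everything on the one shared monitor."""
    return [w for w in windows if w.monitor == monitor]


def application_share(windows, app):
    """Application sharing: every window of the shared application."""
    return [w for w in windows if w.app == app]


windows = [
    Window("slides.pdf", "viewer", 0),
    Window("salary-review.pdf", "viewer", 1),
    Window("chat", "im", 0),
]

shared_screen = [w.title for w in screen_share(windows, 0)]
shared_app = [w.title for w in application_share(windows, "viewer")]
```

In this sketch, sharing monitor 0 exposes the chat window along with
the slides, and sharing the "viewer" application exposes both of its
documents, not only the one the user had in mind.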

Because existing conferencing products (e.g., WebEx) require some sort
of download/install experience, they end up with the permissions of a
native application. Thus, it doesn't really make sense to worry about
misuse of the sharing permissions specifically because the application
has free run of the user's machine. The security question then becomes
whether the user wishes to run the application at all, not whether he
trusts the application to see his screen specifically.

Even so, there are known security risks to this type of sharing,
mostly due to the user's misunderstanding/lack of thought about
the security properties. For instance:

* The desktop often contains icons that the user has forgotten
  about, including the names of files. These themselves can
  be confidential.

* Desktop notifications such as Growl for incoming messages,
  IMs, etc. can get shared. These can also contain confidential
  information.

* Users often think of "application sharing" as "window sharing"
  and will have other sensitive documents open at the same
  time as the document they intend to be sharing.

I have heard reports of all of these issues (the first two are also
often seen in settings when the user is projecting their screen at
conferences and the like). Fundamentally, these are examples of user
error, though possibly combined with confusing interfaces. The users
generally understand that they have given the downloaded application
wide permissions. Because users expect Web applications to be
safer, there is even more room for confusion.


3. SECURITY OF SCREEN/APPLICATION SHARING IN THE WEB ENVIRONMENT
3.1. Threat Model
Huang et al.[4] describe the Web security guarantee as:

   Users can safely visit arbitrary web sites and execute scripts
   provided by those sites.

More generally, users expect that the browser will protect them from
malicious sites and that sites are isolated from each other.  (More on
the technical mechanisms below).

Obviously, granting permission to see the desktop breaches this
guarantee to some extent, since the user is granting the site (via the
browser) some very dangerous capabilities. Generally the intent is
that the user can understand the security impact of the permission he
is granting. As should be clear from the discussion above, this is
already not entirely so. However, in the Web environment the problem
is much worse because the user likely thinks that he is assuming
*just* the screen/application sharing risks without the corresponding
"full application privileges" risks. This is unfortunately less true
than one would like.


3.2. Background: Same Origin Policy
In order to isolate sites from each other, browsers implement what's
called the "Same Origin Policy" (SOP). The basic idea is that content
(scripts, HTML, etc.) that runs on one site cannot get access to
content from another site, except under very limited conditions.  For
instance, site A can:

- Run scripts from site B.
- IFRAME HTML from site B but not look at the content or output.
- Display images and videos from site B but not examine their
  contents.

[Note: I am ignoring CORS and WebSockets for the moment.]

The basic idiom here is that site A can cause content from site B to
be *displayed* but it can't access the content itself. This allows for
the construction of some kinds of composite Web sites (i.e., mash-ups)
but still allows for site isolation.
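
A minimal sketch of this idiom, assuming the usual definition of an
origin as the (scheme, host, port) triple. The function names are
illustrative only, and the sketch ignores CORS and WebSockets just as
the note above does.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return the (scheme, host, port) triple for a URL."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    # Fill in the scheme's default port when the URL omits one.
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(scheme)
    return (scheme, host, port)


def same_origin(a, b):
    return origin(a) == origin(b)


def may_display(embedder, resource):
    # Site A may cause content from any site to be displayed
    # (script, IFRAME, image, video).
    return True


def may_read(embedder, resource):
    # Site A may only examine content from its own origin.
    return same_origin(embedder, resource)
```

With these definitions, may_display("https://a.example/",
"https://b.example/inbox") is allowed while may_read() for the same
pair is not.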

Many important Web security mechanisms depend explicitly on these
guarantees. For instance, consider a Web mail site which bases its
authentication on cookies and will therefore service any HTTP request
which contains the right cookie. Content from any other site can cause
the browser to emit the right HTTP request, but because of the
same-origin policy, it can't see the responses. This prevents random
sites from accessing your web mail.


3.3. Web-Specific Risks
At this point, the risk of combining screen sharing with the Web
environment should be obvious. SOP protection depends on denying
content from site A access to content from site B, but
because site A can cause content from site B to be displayed
on the screen, if A can see the user's screen then it can
close the loop and bypass SOP. (We assume below that either
the site is sharing the user's screen or that the browser is
the application being shared.)
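
The loop can be illustrated with a toy model. ToyBrowser and its
methods are hypothetical and exist only for this sketch: the direct
read path enforces SOP, but the capture path returns whatever is
displayed, regardless of origin.

```python
from urllib.parse import urlsplit


def origin(url):
    parts = urlsplit(url)
    return (parts.scheme, parts.hostname, parts.port)


class ToyBrowser:
    def __init__(self):
        self.visible = {}  # URL -> content currently rendered on screen

    def display(self, url, content):
        # Any site can cause content from another site to be displayed.
        self.visible[url] = content

    def read_content(self, requester_url, target_url):
        # Direct access is subject to the same-origin check.
        if origin(requester_url) != origin(target_url):
            raise PermissionError("blocked by same-origin policy")
        return self.visible[target_url]

    def capture_screen(self):
        # Screen capture sees everything visible, with no origin check.
        return list(self.visible.values())


browser = ToyBrowser()
browser.display("https://mail.example/inbox", "Subject: offer letter")

try:
    browser.read_content("https://a.example/", "https://mail.example/inbox")
    blocked = False
except PermissionError:
    blocked = True

captured = browser.capture_screen()
```

Here the direct read by a.example is blocked, yet the same content
shows up in the captured screen.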


3.3.1. SOP Violations for Visible Content
The most obvious attack vector is that the site can see any
content that the user can see. All it needs to do is to open a window
with the URL of the relevant content and put it in view of the screen
sharing system. Note that the content only needs to be briefly visible
(long enough to be captured by the sharing code). Potential attacks
include:

- Capture the user's webmail (and potentially individual messages).
- Capture the user's "sitekey" anti-phishing picture. [6]
- View any confidential documents that the user has access to
  on the Internet or on their own computer.

In general, any resource which can be opened in a browser window
(if the browser is being shared) or in an external application
(if screen sharing is in use) can be accessed in this fashion.


3.3.2. SOP Violations for "Hidden" Content
The SOP issue extends beyond visible content. For instance, many sites
use secret tokens in HTML content to prevent Cross-Site Request
Forgery (CSRF) attacks. The idea is that the token is available to
same-origin JS which then embeds it in any XHR requests it
performs. The site checks for presence of the token, thus preventing
content from other sites from performing operations which might have
side effects (even if they cannot see the response). While embedded in
the HTML, these tokens are hidden to avoid annoying the user.
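
A rough server-side sketch of this token scheme, assuming a dict-like
per-user session. The function and field names are made up for
illustration and do not come from any particular framework.

```python
import hmac
import secrets


def issue_token(session):
    """Create a random per-session token and remember it server-side."""
    token = secrets.token_urlsafe(32)
    session["csrf_token"] = token
    return token


def render_form(token):
    # The token sits in the HTML source but is not shown to the user.
    return f'<input type="hidden" name="csrf_token" value="{token}">'


def is_request_allowed(session, submitted_token):
    """Accept a state-changing request only if it carries the token."""
    expected = session.get("csrf_token")
    if expected is None or submitted_token is None:
        return False
    # Constant-time comparison of the two values.
    return hmac.compare_digest(expected.encode(), submitted_token.encode())


session = {}
token = issue_token(session)
form_html = render_form(token)
```

The protection rests entirely on other origins being unable to read
the page source that contains the token.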

Similar techniques to those described above can be used to bypass this
type of CSRF protection. Instead of loading the content directly in a
window, the attacker loads the HTML in a source view window (using the
view-source: scheme, which both Chrome and Firefox support). Since the
HTML source contains the CSRF token, the attacker can simply read it
off, and use it to mount CSRF attacks.


4. CONSENT/PERMISSIONS OPTIONS
A number of possible permissions models have been proposed.

1. The same permissions model as audio/video, namely a consent
   dialog with (optional?) persistent consent. [the
   natural default.]

2. A similar permissions dialog to audio/video but with *only*
   one-time consent. [proposed by Cullen Jennings]

3. A "sysapps" [5] API in which the user had to go through an
   app store type install experience to enable sharing for
   each specific site. [proposed by Adam Barth and others]

There have also been proposals for hybrid designs, such as a
sysapps-style API that also requires an in-chrome permissions dialog
for every sharing instance. Another possible design would be a
"preferred site" design in which certain sites could directly ask for
permission without an application install but other sites would need
to do an app store install experience. (This is like the Firefox
Social API.)
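
To make the differences concrete, the three options and the first
hybrid can be sketched as a decision function. This is only an
illustration under simplifying assumptions: the Model and State names
are invented here, and "allow", "prompt", and "deny" stand for what
the browser would do when a site asks to start sharing.

```python
from dataclasses import dataclass, field
from enum import Enum


class Model(Enum):
    PERSISTENT = "persistent"  # option 1: dialog, consent may be stored
    ONE_TIME = "one-time"      # option 2: dialog on every request
    INSTALLED = "installed"    # option 3: app store style install
    HYBRID = "hybrid"          # install plus dialog on every request


@dataclass
class State:
    remembered: set = field(default_factory=set)  # sites with stored consent
    installed: set = field(default_factory=set)   # sites the user installed


def decide(model, state, site):
    if model is Model.PERSISTENT:
        return "allow" if site in state.remembered else "prompt"
    if model is Model.ONE_TIME:
        return "prompt"
    if model is Model.INSTALLED:
        return "allow" if site in state.installed else "deny"
    if model is Model.HYBRID:
        return "prompt" if site in state.installed else "deny"
    raise ValueError(f"unknown model: {model}")


state = State(remembered={"https://conf.example"},
              installed={"https://conf.example"})
```

For a site the user has never dealt with, the first two models lead
to a dialog while the install-based models refuse outright until the
install step has happened.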

The argument for the less onerous permissions models is of course
reduced user friction. The argument for the more restrictive models is
that the set of permissions that is being granted to the web site is
really more similar to those of a Web application install (even though
the user does not know it) and that therefore the barrier to entry
should be more like that of an application (i.e., a curated,
authenticated app store).




ACKNOWLEDGEMENT
Much of this material came out of discussions with Adam Barth,
Cullen Jennings, and Randell Jesup but I may well have mangled it.


[0] http://dev.w3.org/2011/webrtc/editor/webrtc.html
[1] http://dev.w3.org/2011/webrtc/editor/getusermedia.html
[2] http://tools.ietf.org/html/draft-ietf-rtcweb-security
[3] http://tools.ietf.org/html/draft-ietf-rtcweb-security-arch
[4] http://w2spconf.com/2011/papers/websocket.pdf
[5] http://www.w3.org/2012/09/sysapps-wg-charter
[6] https://www.bankofamerica.com/privacy/online-mobile-banking-privacy/sitekey.go
Received on Monday, 11 March 2013 20:54:49 UTC