Fetch, MSE, and MIX from Ryan Sleevi on 2015-02-20 (public-html-media@w3.org from February 2015)

From: Ryan Sleevi <sleevi@google.com>
Date: Thu, 19 Feb 2015 21:06:17 -0800
To: whatwg@whatwg.org
Cc: "public-webappsec@w3.org" <public-webappsec@w3.org>, public-html-media@w3.org
Message-ID: <CACvaWvZ+b9+fU2Af9y=bUzOFv6kYmhVK3GYbdSweV8ZPZ9YoRA@mail.gmail.com>
Cross-posting, as this touches on the Fetch [1] spec, Media Source
Extensions [2], and Mixed Content [3]. This does cross-post WHATWG and
W3C, apologies if this is a mortal sin.

TL;DR Proposal first:
- Amend MIX in [4] to add "fetch" as an optionally-blockable-request-context
  * This means that fetch() can now return HTTP content from HTTPS
pages. The implications of this, however, are described below, if you
can handle reading it all.
- Amend MSE in [5] to introduce a new method, appendResponse(Response
response), which accepts a Response [6] class
- In MSE, define a Response Append Loop similar to the Stream Append
Loop [7], that calls the consume body algorithm [8] on the internal
response [9] of Response to yield an ArrayBuffer, then executes the
buffer append [10] algorithm on the SourceBuffer
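As a rough illustration of the third bullet, the proposed Response Append Loop could be modelled as below. This is only a sketch: appendResponse() does not exist in MSE today, and BodyLike and SourceBufferLike are hypothetical stand-ins for Response and SourceBuffer so the example runs outside a browser. In the actual proposal these steps would run inside the user agent, so script would never see the bytes of an opaque response.

```typescript
// Hypothetical stand-ins for the real Response and SourceBuffer interfaces.
interface BodyLike {
  arrayBuffer(): Promise<ArrayBuffer>;
}

interface SourceBufferLike {
  appendBuffer(data: ArrayBuffer): void;
}

// Models the proposed loop: consume the response body into an ArrayBuffer,
// then hand those bytes to the buffer append step.
async function appendResponse(
  sourceBuffer: SourceBufferLike,
  response: BodyLike,
): Promise<void> {
  const bytes = await response.arrayBuffer();
  sourceBuffer.appendBuffer(bytes);
}
```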


MUCH longer justification why:
As it stands, <audio>/<video>/<source> tags today are optionally
blockable content, as noted in [4]. Thus, an HTTPS page may set the
source to HTTP content and load the content (although typically with
user-agent indication). MSE poses itself as a spec to offer much
greater control to site authors than <audio>/<video>, as noted in its
use cases, and as a result, has seen a rapid adoption among a number
of popular video streaming sites. Most notably, the ability to do
adaptive streaming with MSE helps provide a better quality, better
performing experience for users. Finally, in some user agents, MSE is
a pre-requisite for the use of Encrypted Media Extensions [11].

However, there are limitations to using MSE that don't exist with
<video>/<audio>. The most notable of these is that in order to
implement the adaptive streaming capabilities, most sites make use of
XMLHttpRequest to request portions of media content, which can then be
supplied to the SourceBuffer. Based on the feedback that MSE provides,
the script author can then adjust the XHRs they make to use a
lower bitrate media source, to drop segments, etc. When using XHR, the
site author loses the ability to mix HTTPS pages with HTTP media, as
XHR is (rightfully so) treated as blocked content.
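To make the adaptive part concrete, here is a minimal sketch of the kind of decision a page script might make before issuing its next XHR. The Rendition shape, the URL layout, and the throughput-based rule are all assumptions for illustration; real players use more elaborate heuristics. The chosen URL would then be requested with an XMLHttpRequest whose responseType is "arraybuffer", and the result passed to SourceBuffer.appendBuffer().

```typescript
// Hypothetical description of one encoded version of the media.
interface Rendition {
  kbps: number;
  urlPrefix: string;
}

// Pick the highest-bitrate rendition that fits the measured throughput,
// falling back to the lowest-bitrate one when none fits.
function pickRendition(renditions: Rendition[], measuredKbps: number): Rendition {
  if (renditions.length === 0) {
    throw new Error("no renditions available");
  }
  const sorted = [...renditions].sort((a, b) => a.kbps - b.kbps);
  let choice = sorted[0];
  for (const r of sorted) {
    if (r.kbps <= measuredKbps) {
      choice = r;
    }
  }
  return choice;
}

// Illustrative URL layout only; segment naming is an assumption.
function segmentUrl(rendition: Rendition, index: number): string {
  return `${rendition.urlPrefix}/segment-${index}.mp4`;
}
```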

The justification for why XHR does this is that it returns the full
buffer to the page author. In practice, we saw many sites then taking
that buffer and making security decisions on it - ranging from
"clearly" bad things such as eval()ing the content to more subtle
things like adjusting UI or links. All of these undermine all of the
security guarantees that HTTPS tries to provide, and thus XHR is
blocked.

The result is that if an HTTPS site wants to use MSE with XHR, all of
the content needs to be served via HTTPS. We've already seen some
providers complain that this is prohibitively expensive in their
current networks [12], although it may be solvable in time, as
demonstrated by other video sharing sites [13].

In a choice between using MSE - which offers a better user experience
over <video>/<audio> by reducing bandwidth and improving quality - and
using HTTPS - which offers better privacy and security controls -
sites are likely to choose solutions that reduce their costs rather
than protect their users, a reasonable but unfortunate business
reality.

I'm hoping to find a way to close that gap - to allow sites to use MSE
(and potentially EME) via HTTPS documents, while still sourcing their
media content via HTTP. This may seem counter-intuitive, and a step
back from the efforts of the Chrome security team, but I think it is
actually consistent with our goals and our past comments. In
particular, this solution tries to provide a means and incentive for
sites to adopt MSE (improving user experience) AND to begin migrating
to HTTPS; first with their main document, and then, in time, all of
their media content.

This won't prevent adversaries from knowing what content the user is
actively watching, for example, but will help protect other vital
assets - such as their cookies, session identifiers, user information,
friends list, past viewing history, etc.

Allowing fetch() to return HTTP content sourced from HTTPS pages seems
like it would re-open the XHR hole, but this isn't the case. As
described in [14], all requests whose mode is CORS or
CORS-with-forced-preflight are force-failed. This only leaves the
request modes of "no-cors", "same-origin", "about" and "data". Because
the origins are different between the document (https) and the request
URL (http), the request mode will be "no-cors", and thus the returned
Response object will be set to "opaque".
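The reasoning in this paragraph can be sketched as a small decision function. This is a model of the argument, not spec text: the function names and the Outcome type are made up for illustration, and it only covers the case under discussion (an https document requesting an http URL, with "fetch" treated as optionally-blockable).

```typescript
type RequestMode =
  | "cors"
  | "cors-with-forced-preflight"
  | "no-cors"
  | "same-origin";

type Outcome = "network-error" | "opaque";

// True when a secure document requests an insecure URL.
function isMixedRequest(documentUrl: string, requestUrl: string): boolean {
  return documentUrl.startsWith("https://") && requestUrl.startsWith("http://");
}

// Outcome for a mixed (https document, http URL) request under the proposal.
function mixedFetchOutcome(mode: RequestMode): Outcome {
  switch (mode) {
    case "cors":
    case "cors-with-forced-preflight":
      // Force-failed by the mixed content check described in [14].
      return "network-error";
    case "same-origin":
      // The origins differ (https vs http), so a same-origin request fails.
      return "network-error";
    case "no-cors":
      // Allowed through, but script cannot read the response data.
      return "opaque";
  }
}
```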

The "opaque" response prevents direct access to the Response data.
Similarly, the SourceBuffer object does not allow direct access to the
data - this is only passed on to the audio/video decoders, same as the
existing <audio>/<video>/<source> tags today. I realize this may
prevent access to the full capabilities of MSE; indeed, some use cases
require access to the content in order to do adaptive streaming.
However, there still seem to be a number of use cases where it can work, or
where existing solutions that do require direct access to content may
be adjusted, slightly, so that they don't.

In discussing this, internally and with other vendors, the primary
security implication of this is that of privacy leakage. However, this
problem exists regardless of fetch(), due to the fact that script can
always inject any of the optionally-blockable content tags into the
page and leak information. That is, I can always disclose content by
using a <video> or <img> tag directly, and I can always smuggle back a
few bits of information at a time (for example, using the width/height
of the image to smuggle back 4-8 bytes at a time, or, even more
primitively, using onload/onerror to smuggle a bit at a time back).
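The width/height channel amounts to simple bit packing. The sketch below assumes 16 bits per dimension, which gives the 4-byte end of the range mentioned above; the function names are hypothetical, and the point is only to show that this channel already exists for <img> without any change to fetch().

```typescript
type FourBytes = [number, number, number, number];

// Pack four bytes into two 16-bit values, standing in for the width and
// height of an image chosen by a cooperating server.
function packDimensions(bytes: FourBytes): { width: number; height: number } {
  return {
    width: ((bytes[0] & 0xff) << 8) | (bytes[1] & 0xff),
    height: ((bytes[2] & 0xff) << 8) | (bytes[3] & 0xff),
  };
}

// Recover the four bytes from the observed image dimensions.
function unpackDimensions(width: number, height: number): FourBytes {
  return [
    (width >> 8) & 0xff,
    width & 0xff,
    (height >> 8) & 0xff,
    height & 0xff,
  ];
}
```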

Further, I'm not proposing that there be any special UI handling for
these mixed-content fetch()'s - that is, they behave as the user agent
already does when encountering passive mixed content (e.g. some form
of UI warning/degradation). So performing these fetch()'s will NOT
yield positive security indicators. Of course, as proposals like [15]
mature, it may be far more desirable for sites to have HTTPS with mixed
content compared to HTTP, thus making this proposal even more
attractive than the HTTP counterpart.

Overall, the hope is to provide incentives for media sharing sites to
begin migrating to HTTPS, allowing them to keep the existing features
they have over HTTP (in this case, MSE), and potentially allowing for
a migration path that stages the deprecation of more powerful,
privacy-sensitive features like EME [16] over HTTP, while not taking
any steps backwards in terms of
privacy or security for fetch() or HTTPS pages.

This is not meant to be a long-term solution for optionally-blockable
content. I absolutely think we should be working to wean sites off
this and move them away. However, in the trade-off between having
major sites using HTTP or having to prolong optionally-blockable
content for some additional, defined period of time, I absolutely
believe the latter is in the greater interest of web security, and
consistent with the findings of the W3C's TAG [17].

So, beyond telling me I wrote way too much, what do people think?

[1] https://fetch.spec.whatwg.org/
[2] http://w3c.github.io/media-source/
[3] https://w3c.github.io/webappsec/specs/mixedcontent/
[4] https://w3c.github.io/webappsec/specs/mixedcontent/#category-optionally-blockable
[5] http://w3c.github.io/media-source/#sourcebuffer
[6] https://fetch.spec.whatwg.org/#response-class
[7] http://w3c.github.io/media-source/#sourcebuffer-stream-append-loop
[8] https://fetch.spec.whatwg.org/#body
[9] https://fetch.spec.whatwg.org/#concept-internal-response
[10] http://w3c.github.io/media-source/#sourcebuffer-buffer-append
[11] https://w3c.github.io/encrypted-media/
[12] https://lists.w3.org/Archives/Public/www-tag/2014Oct/0105.html
[13] https://www.youtube.com/
[14] https://w3c.github.io/webappsec/specs/mixedcontent/#should-block-fetch
[15] https://lists.w3.org/Archives/Public/public-webappsec/2014Dec/0062.html
[16] https://www.w3.org/Bugs/Public/show_bug.cgi?id=26332
[17] https://w3ctag.github.io/web-https/
Received on Friday, 20 February 2015 05:06:45 UTC