Re: [whatwg] Fetch, MSE, and MIX from Aaron Colwell on 2015-02-20 (public-whatwg-archive@w3.org from February 2015)

From: Aaron Colwell <acolwell@google.com>
Date: Fri, 20 Feb 2015 17:53:25 +0000
To: Ryan Sleevi <sleevi@google.com>, whatwg@whatwg.org
Cc: "public-webappsec@w3.org" <public-webappsec@w3.org>, public-html-media@w3.org
Message-ID: <CAA0c1bCP9EUbxZK0pg6_9Ghp+oyiOvC_EL_OOHLCuv6Pt_PG+g@mail.gmail.com>
Hi Ryan,

Thanks for writing this up. I know you already know this, but I wanted to
publicly declare my support as one of the MSE editors. While I wish we
didn't need this, I can understand the concerns of content providers and I
think this is a reasonable compromise.

Aaron

On Thu Feb 19 2015 at 9:06:17 PM Ryan Sleevi <sleevi@google.com> wrote:

> Cross-posting, as this touches on the Fetch [1] spec, Media Source
> Extensions [2], and Mixed Content [3]. This does cross-post WHATWG and
> W3C, apologies if this is a mortal sin.
>
> TL;DR Proposal first:
> - Amend MIX in [4] to add "fetch" as an
> optionally-blockable-request-context
>   * This means that fetch() can now return HTTP content from HTTPS
> pages. The implications of this, however, are described below, if you
> can handle reading it all.
> - Amend MSE in [5] to introduce a new method, appendResponse(Response
> response), which accepts a Response [6] class
> - In MSE, define a Response Append Loop similar to the Stream Append
> Loop [7], that calls the consume body algorithm [8] on the internal
> response [9] of Response to yield an ArrayBuffer, then executes the
> buffer append [10] algorithm on the SourceBuffer
>
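A minimal sketch of the Response Append Loop proposed above, written as a TypeScript model rather than real browser code. `appendResponse` exists only as a proposal in this message; `OpaqueResponseModel`, `SourceBufferModel`, and all of their members are hypothetical names, and the real algorithm would presumably run asynchronously like the existing append algorithms.

```typescript
// Hypothetical stand-in for a Response whose body page script cannot read.
class OpaqueResponseModel {
  private consumed = false;

  constructor(private readonly internalBody: Uint8Array) {}

  // Models "consume body" on the internal response: usable exactly once.
  consumeInternalBody(): Uint8Array {
    if (this.consumed) {
      throw new Error("TypeError: body already used");
    }
    this.consumed = true;
    return this.internalBody.slice();
  }
}

// Hypothetical stand-in for an MSE SourceBuffer.
class SourceBufferModel {
  updating = false;
  private readonly decoderQueue: Uint8Array[] = [];

  // Models the buffer append algorithm: bytes are handed to the decoder
  // side only and are never exposed back to script.
  private bufferAppend(data: Uint8Array): void {
    this.decoderQueue.push(data);
  }

  // Models the proposed appendResponse(): consume the body of the
  // response, then run buffer append on the resulting bytes.
  appendResponse(response: OpaqueResponseModel): void {
    if (this.updating) {
      throw new Error("InvalidStateError: append already in progress");
    }
    this.updating = true;
    try {
      this.bufferAppend(response.consumeInternalBody());
    } finally {
      this.updating = false;
    }
  }

  // Script can observe how much was queued, but not the bytes themselves.
  queuedByteLength(): number {
    let total = 0;
    for (const chunk of this.decoderQueue) {
      total += chunk.byteLength;
    }
    return total;
  }
}
```

The point of the shape is that the bytes move from the response to the decoder without ever becoming a value that page script holds.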
>
> MUCH longer justification why:
> As it stands, <audio>/<video>/<source> tags today are optionally
> blockable content, as noted in [4]. Thus, an HTTPS page may set the
> source to HTTP content and load the content (although typically with
> user-agent indication). MSE poses itself as a spec to offer much
> greater control to site authors than <audio>/<video>, as noted in its
> use cases, and as a result, has seen a rapid adoption among a number
> of popular video streaming sites. Most notably, the ability to do
> adaptive streaming with MSE helps provide a better quality, better
> performing experience for users. Finally, in some user agents, MSE is
> a pre-requisite for the use of Encrypted Media Extensions [11].
>
> However, there are limitations to using MSE that don't exist with
> <video>/<audio>. The most notable of these is that in order to
> implement the adaptive streaming capabilities, most sites make use of
> XMLHttpRequest to request portions of media content, which can then be
> supplied to the SourceBuffer. Based on the feedback that MSE provides,
> the script author can then adjust the XHRs they make to use a lower
> bitrate media source, to drop segments, etc. When using XHR, the
> site author loses the ability to mix HTTPS pages with HTTP media, as
> XHR is (rightfully so) treated as blocked content.
>
> The justification for why XHR does this is that it returns the full
> buffer to the page author. In practice, we saw many sites then taking
> that buffer and making security decisions on it - whether it be
> "clearly" bad things such as eval()ing the content to more subtle
> things like adjusting UI or links. All of these undermine all of the
> security guarantees that HTTPS tries to provide, and thus XHR is
> blocked.
>
> The result is that if an HTTPS site wants to use MSE with XHR, all of
> the content needs to be served via HTTPS. We've already seen some
> providers complain that this is prohibitively expensive in their
> current networks [12], although it may be solvable in time, as
> demonstrated by other video sharing sites [13].
>
> In a choice between using MSE - which offers a better user experience
> over <video>/<audio> by reducing bandwidth and improving quality - and
> using HTTPS - which offers better privacy and security controls -
> sites are likely to choose solutions that reduce their costs rather
> than protect their users, a reasonable but unfortunate business
> reality.
>
> I'm hoping to find a way to close that gap - to allow sites to use MSE
> (and potentially EME) via HTTPS documents, while still sourcing their
> media content via HTTP. This may seem counter-intuitive, and a step
> back from the efforts of the Chrome security team, but I think it is
> actually consistent with our goals and our past comments. In
> particular, this solution tries to provide a means and incentive for
> sites to adopt MSE (improving user experience) AND to begin migrating
> to HTTPS; first with their main document, and then, in time, all of
> their media content.
>
> This won't prevent adversaries from knowing what content the user is
> actively watching, for example, but will help protect other vital
> assets - such as their cookies, session identifiers, user information,
> friends list, past viewing history, etc.
>
> Allowing fetch() to return HTTP content sourced from HTTPS pages seems
> like it would re-open the XHR hole, but this isn't the case. As
> described in [14], all requests whose mode is CORS or
> CORS-with-forced-preflight are force-failed. This only leaves the
> request modes of "no-cors", "same-origin", "about" and "data". Because
> the origins are different between the document (https) and the request
> URL (http), the request mode will be "no-cors", and thus the returned
> Response object will be set to "opaque".
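The reasoning in the paragraph above can be sketched as a small decision function. This is only an illustration under stated assumptions: `classifyMixedFetch`, `schemeOf`, and the outcome labels are hypothetical names, the "about" and "data" cases are left out, and treating a "same-origin" request to another origin as a network error is taken from general Fetch behavior rather than from this message.

```typescript
type RequestMode =
  | "same-origin"
  | "no-cors"
  | "cors"
  | "cors-with-forced-preflight";

type MixedFetchOutcome = "not-mixed" | "blocked" | "network-error" | "opaque";

// Extracts the lowercase scheme from an absolute URL string.
function schemeOf(href: string): string {
  const match = /^([a-z][a-z0-9+.-]*):/i.exec(href);
  if (match === null) {
    throw new Error("not an absolute URL: " + href);
  }
  return match[1].toLowerCase();
}

// Classifies a fetch() of requestHref issued by the document at documentHref.
function classifyMixedFetch(
  documentHref: string,
  requestHref: string,
  mode: RequestMode
): MixedFetchOutcome {
  // Only an https document requesting an http URL is mixed content here.
  if (schemeOf(documentHref) !== "https" || schemeOf(requestHref) !== "http") {
    return "not-mixed";
  }
  // Per [14], CORS-mode requests for mixed content are force-failed.
  if (mode === "cors" || mode === "cors-with-forced-preflight") {
    return "blocked";
  }
  // Different schemes mean different origins, so "same-origin" cannot succeed
  // (assumption: Fetch treats this as a network error).
  if (mode === "same-origin") {
    return "network-error";
  }
  // What remains is "no-cors", which yields an opaque Response.
  return "opaque";
}
```

For example, `classifyMixedFetch("https://video.example/watch", "http://cdn.example/seg1.mp4", "no-cors")` evaluates to `"opaque"`, while the same call with `"cors"` evaluates to `"blocked"`; both hostnames are placeholders.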
>
> The "opaque" response prevents direct access to the Response data.
> Similarly, the SourceBuffer object does not allow direct access to the
> data - this is only passed on to the audio/video decoders, same as the
> existing <audio>/<video>/<source> tags today. I realize this may
> prevent access to the full capabilities of MSE; indeed, some use cases
> require access to the content in order to do adaptive streaming.
> However, there still seem to be a number of use cases where it can work, or
> where existing solutions that do require direct access to content may
> be adjusted, slightly, so that they don't.
>
> In discussing this, internally and with other vendors, the primary
> security implication of this is that of privacy leakage. However, this
> problem exists regardless of fetch(), due to the fact that script can
> always inject any of the optionally-blockable content tags into the
> page and leak information. That is, I can always disclose content by
> using a <video> or <img> tag directly, and I can always smuggle back a
> few bits of information at a time (for example, using the width/height
> of the image to smuggle back 4-8 bytes at a time, or, even more
> primitively, using onload/onerror to smuggle a bit at a time back).
>
> Further, I'm not proposing that there be any special UI handling for
> these mixed-content fetch()'s - that is, they behave as the user agent
> already does when encountering passive mixed content (e.g. some form
> of UI warning/degradation). So performing these fetch()'s will NOT
> yield positive security indicators. Of course, as proposals like [15]
> mature, it may be far more desirable for sites to have HTTPS with mixed
> content compared to HTTP, thus making this proposal even more
> attractive than the HTTP counterpart.
>
> Overall, the hope is to provide incentives for media sharing sites to
> begin migrating to HTTPS, allowing them to keep the existing features
> they have over HTTP (in this case, MSE), and potentially allowing for
> a migration path that permits the staged deprecation of more
> powerful, privacy-sensitive features like EME [16] over HTTP, while
> not taking any steps backwards in terms of privacy or security for
> fetch() or HTTPS pages.
>
> This is not meant to be a long-term solution for optionally-blockable
> content. I absolutely think we should be working to wean sites off
> this and move them away. However, in the trade-off between having
> major sites using HTTP or having to prolong optionally-blockable
> content for some additional, defined period of time, I absolutely
> believe the latter is in the greater interest of web security, and
> consistent with the findings of the W3C's TAG [17].
>
> So, beyond telling me I wrote way too much, what do people think?
>
> [1] https://fetch.spec.whatwg.org/
> [2] http://w3c.github.io/media-source/
> [3] https://w3c.github.io/webappsec/specs/mixedcontent/
> [4] https://w3c.github.io/webappsec/specs/mixedcontent/#category-optionally-blockable
> [5] http://w3c.github.io/media-source/#sourcebuffer
> [6] https://fetch.spec.whatwg.org/#response-class
> [7] http://w3c.github.io/media-source/#sourcebuffer-stream-append-loop
> [8] https://fetch.spec.whatwg.org/#body
> [9] https://fetch.spec.whatwg.org/#concept-internal-response
> [10] http://w3c.github.io/media-source/#sourcebuffer-buffer-append
> [11] https://w3c.github.io/encrypted-media/
> [12] https://lists.w3.org/Archives/Public/www-tag/2014Oct/0105.html
> [13] https://www.youtube.com/
> [14] https://w3c.github.io/webappsec/specs/mixedcontent/#should-block-fetch
> [15] https://lists.w3.org/Archives/Public/public-webappsec/2014Dec/0062.html
> [16] https://www.w3.org/Bugs/Public/show_bug.cgi?id=26332
> [17] https://w3ctag.github.io/web-https/
>
Received on Friday, 20 February 2015 17:53:57 UTC