Re: Fetch, MSE, and MIX

I agree this sounds like a reasonable, incremental step that is
technically sound, does not introduce new incentives against improving
security, and is consistent with other web platform work and direction.

-Brad Hill

On Fri Feb 20 2015 at 9:55:39 AM Aaron Colwell <acolwell@google.com> wrote:

> Hi Ryan,
>
> Thanks for writing this up. I know you already know this, but I wanted to
> publicly declare my support as one of the MSE editors. While I wish we
> didn't need this, I can understand the concerns of content providers and I
> think this is a reasonable compromise.
>
> Aaron
>
> On Thu Feb 19 2015 at 9:06:17 PM Ryan Sleevi <sleevi@google.com> wrote:
>
>> Cross-posting, as this touches on the Fetch [1] spec, Media Source
>> Extensions [2], and Mixed Content [3]. This does cross-post WHATWG and
>> W3C, apologies if this is a mortal sin.
>>
>> TL;DR Proposal first:
>> - Amend MIX in [4] to add "fetch" as an
>> optionally-blockable-request-context
>>   * This means that fetch() can now return HTTP content from HTTPS
>> pages. The implications of this, however, are described below, if you
>> can handle reading it all.
>> - Amend MSE in [5] to introduce a new method, appendResponse(Response
>> response), which accepts a Response [6] class
>> - In MSE, define a Response Append Loop similar to the Stream Append
>> Loop [7], that calls the consume body algorithm [8] on the internal
>> response [9] of Response to yield an ArrayBuffer, then executes the
>> buffer append [10] algorithm on the SourceBuffer
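
A minimal sketch of the proposed Response Append Loop, modelled outside a browser. `ModelResponse`, `ModelSourceBuffer`, and `scriptVisibleBody` are hypothetical stand-ins rather than spec names, and `appendResponse` is only the method proposed above, not a shipped API. The idea it tries to show is that script never has to read the bytes: the user agent consumes the internal response's body and runs the same buffer append step that `appendBuffer` would.

```typescript
// Hypothetical model types; none of these names come from Fetch or MSE.
type ModelResponseType = "basic" | "cors" | "opaque";

class ModelResponse {
  readonly type: ModelResponseType;
  // Stands in for the internal response's body. In a real user agent this
  // would not be reachable from script; it is a plain field here only so
  // the sketch stays small.
  readonly internalBody: Uint8Array;

  constructor(type: ModelResponseType, internalBody: Uint8Array) {
    this.type = type;
    this.internalBody = internalBody;
  }

  // What page script would be able to read: nothing for an opaque response.
  scriptVisibleBody(): Uint8Array | null {
    return this.type === "opaque" ? null : this.internalBody;
  }
}

class ModelSourceBuffer {
  // Bytes handed to the (imaginary) decoder; never returned to script.
  private decoderQueue: Uint8Array[] = [];

  // Existing MSE-style entry point: script supplies bytes it already holds.
  appendBuffer(data: Uint8Array): void {
    this.decoderQueue.push(data);
  }

  // Proposed entry point: consume the body on the user agent side, then
  // run the same buffer append step.
  appendResponse(response: ModelResponse): void {
    this.appendBuffer(response.internalBody);
  }

  queuedByteLength(): number {
    let total = 0;
    for (const chunk of this.decoderQueue) total += chunk.byteLength;
    return total;
  }
}

const modelSourceBuffer = new ModelSourceBuffer();
const opaqueSegment = new ModelResponse("opaque", new Uint8Array([0, 0, 0, 24]));
modelSourceBuffer.appendResponse(opaqueSegment);
```

In this model the page can still feed media to the decoder even though `opaqueSegment.scriptVisibleBody()` yields `null`.
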
>>
>>
>> MUCH longer justification why:
>> As it stands, <audio>/<video>/<source> tags today are optionally
>> blockable content, as noted in [4]. Thus, an HTTPS page may set the
>> source to HTTP content and load the content (although typically with
>> user-agent indication). MSE poses itself as a spec to offer much
>> greater control to site authors than <audio>/<video>, as noted in its
>> use cases, and as a result, has seen rapid adoption among a number
>> of popular video streaming sites. Most notably, the ability to do
>> adaptive streaming with MSE helps provide a better quality, better
>> performing experience for users. Finally, in some user agents, MSE is
>> a prerequisite for the use of Encrypted Media Extensions [11].
>>
>> However, there are limitations to using MSE that don't exist with
>> <video>/<audio>. The most notable of these is that in order to
>> implement the adaptive streaming capabilities, most sites make use of
>> XMLHttpRequest to request portions of media content, which can then be
>> supplied to the SourceBuffer. Based on the feedback that MSE provides,
>> the script author can then adjust the XHRs they make to use a
>> lower bitrate media source, to drop segments, etc. When using XHR, the
>> site author loses the ability to mix HTTPS pages with HTTP media, as
>> XHR is (rightfully so) treated as blocked content.
>>
>> The justification for why XHR does this is that it returns the full
>> buffer to the page author. In practice, we saw many sites then taking
>> that buffer and making security decisions on it - whether it be
>> "clearly" bad things such as eval()ing the content to more subtle
>> things like adjusting UI or links. All of these undermine all of the
>> security guarantees that HTTPS tries to provide, and thus XHR is
>> blocked.
>>
>> The result is that if an HTTPS site wants to use MSE with XHR, all of
>> the content needs to be served via HTTPS. We've already seen some
>> providers complain that this is prohibitively expensive in their
>> current networks [12], although it may be solvable in time, as
>> demonstrated by other video sharing sites [13].
>>
>> In a choice between using MSE - which offers a better user experience
>> over <video>/<audio> by reducing bandwidth and improving quality - and
>> using HTTPS - which offers better privacy and security controls -
>> sites are likely to choose solutions that reduce their costs rather
>> than protect their users, a reasonable but unfortunate business
>> reality.
>>
>> I'm hoping to find a way to close that gap - to allow sites to use MSE
>> (and potentially EME) via HTTPS documents, while still sourcing their
>> media content via HTTP. This may seem counter-intuitive, and a step
>> back from the efforts of the Chrome security team, but I think it is
>> actually consistent with our goals and our past comments. In
>> particular, this solution tries to provide a means and incentive for
>> sites to adopt MSE (improving user experience) AND to begin migrating
>> to HTTPS; first with their main document, and then, in time, all of
>> their media content.
>>
>> This won't prevent adversaries from knowing what content the user is
>> actively watching, for example, but will help protect other vital
>> assets - such as their cookies, session identifiers, user information,
>> friends list, past viewing history, etc.
>>
>> Allowing fetch() to return HTTP content sourced from HTTPS pages seems
>> like it would re-open the XHR hole, but this isn't the case. As
>> described in [14], all requests whose mode is CORS or
>> CORS-with-forced-preflight are force-failed. This only leaves the
>> request modes of "no-cors", "same-origin", "about" and "data". Because
>> the origins are different between the document (https) and the request
>> URL (http), the request mode will be "no-cors", and thus the returned
>> Response object will be set to "opaque".
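
The reasoning above can be sketched as a small decision function. This is a simplified model under stated assumptions, not the Fetch or Mixed Content algorithm: `fetchOutcome`, `schemeOf`, `originOf`, and the `fetchIsOptionallyBlockable` flag are hypothetical names, the "about" and "data" modes are left out, origins are compared as raw strings without default-port normalisation, and a CORS check is assumed to succeed whenever it is reached.

```typescript
// Hypothetical model types; names chosen to avoid clashing with DOM typings.
type ModelRequestMode = "same-origin" | "no-cors" | "cors" | "cors-with-forced-preflight";
type ModelResponseKind = "basic" | "cors" | "opaque";
type FetchOutcome =
  | { kind: "blocked" }
  | { kind: "response"; responseType: ModelResponseKind };

function schemeOf(url: string): string {
  const colon = url.indexOf(":");
  return colon < 0 ? "" : url.slice(0, colon).toLowerCase();
}

// scheme://host[:port], compared as a plain string (a simplification).
function originOf(url: string): string {
  const match = /^[a-z][a-z0-9+.-]*:\/\/[^\/?#]*/i.exec(url);
  return match ? match[0].toLowerCase() : "";
}

function fetchOutcome(
  documentUrl: string,
  requestUrl: string,
  mode: ModelRequestMode,
  fetchIsOptionallyBlockable: boolean // true models the proposed MIX change
): FetchOutcome {
  const mixed = schemeOf(documentUrl) === "https" && schemeOf(requestUrl) === "http";
  if (mixed) {
    // Without the proposed change, mixed-content fetch() is simply blocked.
    if (!fetchIsOptionallyBlockable) return { kind: "blocked" };
    // CORS-style modes are force-failed for mixed content, as described in [14].
    if (mode === "cors" || mode === "cors-with-forced-preflight") return { kind: "blocked" };
  }
  const sameOrigin = originOf(documentUrl) === originOf(requestUrl);
  if (mode === "same-origin") {
    return sameOrigin ? { kind: "response", responseType: "basic" } : { kind: "blocked" };
  }
  if (mode === "no-cors") {
    return { kind: "response", responseType: sameOrigin ? "basic" : "opaque" };
  }
  // Remaining modes are CORS-style; assume the CORS check passes.
  return { kind: "response", responseType: sameOrigin ? "basic" : "cors" };
}

const mixedNoCors = fetchOutcome(
  "https://video.example/watch",
  "http://cdn.example/segment1.mp4",
  "no-cors",
  true
);
```

Under these assumptions, the only mixed-content request that yields a response is the "no-cors" one, and that response is opaque.
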
>>
>> The "opaque" response prevents direct access to the Response data.
>> Similarly, the SourceBuffer object does not allow direct access to the
>> data - this is only passed on to the audio/video decoders, same as the
>> existing <audio>/<video>/<source> tags today. I realize this may
>> prevent access to the full capabilities of MSE; indeed, some use cases
>> require access to the content in order to do adaptive streaming.
>> However, there still seem to be a number of use cases where it can work, or
>> where existing solutions that do require direct access to content may
>> be adjusted, slightly, so that they don't.
>>
>> In discussing this internally and with other vendors, the primary
>> security implication raised was privacy leakage. However, this
>> problem exists regardless of fetch(), due to the fact that script can
>> always inject any of the optionally-blockable content tags into the
>> page and leak information. That is, I can always disclose content by
>> using a <video> or <img> tag directly, and I can always smuggle back a
>> few bits of information at a time (for example, using the width/height
>> of the image to smuggle back 4-8 bytes at a time, or, even more
>> primitively, using onload/onerror to smuggle a bit at a time back).
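
To make the width/height arithmetic concrete, here is a small sketch that assumes each dimension carries 16 bits, which gives the 4-byte end of the range mentioned above. `packIntoDimensions` and `unpackFromDimensions` are hypothetical helper names, and the sketch only shows the encoding, not any image handling.

```typescript
type FourBytes = [number, number, number, number];

// Server side of the channel: fold four bytes into two 16-bit dimensions.
function packIntoDimensions(bytes: FourBytes): { width: number; height: number } {
  return {
    width: ((bytes[0] & 0xff) << 8) | (bytes[1] & 0xff),
    height: ((bytes[2] & 0xff) << 8) | (bytes[3] & 0xff),
  };
}

// Page side of the channel: recover the bytes from the observed dimensions.
function unpackFromDimensions(width: number, height: number): FourBytes {
  return [(width >> 8) & 0xff, width & 0xff, (height >> 8) & 0xff, height & 0xff];
}

const smuggledDims = packIntoDimensions([0xde, 0xad, 0xbe, 0xef]);
const recoveredBytes = unpackFromDimensions(smuggledDims.width, smuggledDims.height);
```

With wider dimensions the same scheme would carry more bytes per image; the onload/onerror variant carries a single bit per request.
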
>>
>> Further, I'm not proposing that there be any special UI handling for
>> these mixed-content fetch()'s - that is, they behave as the user agent
>> already does when encountering passive mixed content (e.g. some form
>> of UI warning/degradation). So performing these fetch()'s will NOT
>> yield positive security indicators. Of course, as proposals like [15]
>> mature, it may be far more desirable for sites to have HTTPS with mixed
>> content compared to HTTP, thus making this proposal even more
>> attractive than the HTTP counterpart.
>>
>> Overall, the hope is to provide incentives for media sharing sites to
>> begin migrating to HTTPS, allowing them to keep the existing features
>> they have over HTTP (in this case, MSE), and potentially allowing for
>> a migration path toward the staged deprecation of more powerful,
>> privacy-sensitive features like EME [16] over HTTP, while not taking
>> any steps backwards in terms of privacy or security for fetch() or
>> HTTPS pages.
>>
>> This is not meant to be a long-term solution for optionally-blockable
>> content. I absolutely think we should be working to wean sites off
>> this and move them away. However, in the trade-off between having
>> major sites using HTTP or having to prolong optionally-blockable
>> content for some additional, defined period of time, I absolutely
>> believe the latter is in the greater interest of web security, and
>> consistent with the findings of the W3C's TAG [17].
>>
>> So, beyond telling me I wrote way too much, what do people think?
>>
>> [1] https://fetch.spec.whatwg.org/
>> [2] http://w3c.github.io/media-source/
>> [3] https://w3c.github.io/webappsec/specs/mixedcontent/
>> [4] https://w3c.github.io/webappsec/specs/mixedcontent/#category-optionally-blockable
>> [5] http://w3c.github.io/media-source/#sourcebuffer
>> [6] https://fetch.spec.whatwg.org/#response-class
>> [7] http://w3c.github.io/media-source/#sourcebuffer-stream-append-loop
>> [8] https://fetch.spec.whatwg.org/#body
>> [9] https://fetch.spec.whatwg.org/#concept-internal-response
>> [10] http://w3c.github.io/media-source/#sourcebuffer-buffer-append
>> [11] https://w3c.github.io/encrypted-media/
>> [12] https://lists.w3.org/Archives/Public/www-tag/2014Oct/0105.html
>> [13] https://www.youtube.com/
>> [14] https://w3c.github.io/webappsec/specs/mixedcontent/#should-block-fetch
>> [15] https://lists.w3.org/Archives/Public/public-webappsec/2014Dec/0062.html
>> [16] https://www.w3.org/Bugs/Public/show_bug.cgi?id=26332
>> [17] https://w3ctag.github.io/web-https/
>>
>

Received on Friday, 20 February 2015 18:36:58 UTC