Re: Fetch, MSE, and MIX from Matthew Wolenetz on 2015-04-10 (public-html-media@w3.org from April 2015)

From: Matthew Wolenetz <wolenetz@google.com>
Date: Fri, 10 Apr 2015 15:24:15 -0700
To: Brad Hill <hillbrad@gmail.com>
Cc: Aaron Colwell <acolwell@google.com>, Ryan Sleevi <sleevi@google.com>, whatwg@whatwg.org, "public-webappsec@w3.org" <public-webappsec@w3.org>, public-html-media@w3.org
Message-ID: <CAADho6NVfEdSJ6RGYmnX0MNk2fyeWfB_dROt5ZdfFe7+wNASkg@mail.gmail.com>
As one of the current MSE editors, I too wanted to publicly declare my
support for Ryan's proposal, as well as provide a few practical updates to
the proposal that minimize the impact on the MSE spec.

Mixed content for MSE allows secure origins to source MSE media streams
from insecure origins. This better aligns MSE with the existing Mixed
Content behavior for video.src= (Optionally Blockable Content).


After further internal discussion, we believe we can and should reuse
appendStream() rather than adding appendResponse(). The solution is similar
to what Ryan described except that an opaque Stream would be obtained from
the opaque Response. Other than completing the specification of
appendStream() (see https://www.w3.org/Bugs/Public/show_bug.cgi?id=27239#c2),
there should be little impact on MSE other than perhaps language related to
opaque streams, if necessary. Details below.

Streams API:
- Define an opaque Stream with properties like Ryan described in the
  original email.

Fetch:
- Implement support for opaque Responses as Ryan described in the
  original email.
- Expose Streams, including opaque ones, from the Response object.

Mixed Content spec:
- Include opaque Responses and/or Streams as appropriate.
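
A minimal sketch of how an opaque Stream could be drained by appendStream()
while staying unreadable to page script. Every name below (SketchStream,
SketchSourceBuffer, internalRead) is hypothetical and only models the idea;
none of it is the Streams, Fetch, or MSE API, and appendStream() itself is
not yet fully specified.

```typescript
// Hypothetical model only: not the Streams, Fetch, or MSE API.

class SketchStream {
  private chunks: Uint8Array[];
  readonly opaque: boolean;

  constructor(chunks: Uint8Array[], opaque: boolean) {
    this.chunks = chunks.slice(); // copy so the caller's array is not mutated
    this.opaque = opaque;
  }

  // Script-facing read: an opaque stream refuses to reveal its bytes.
  read(): Uint8Array | undefined {
    if (this.opaque) {
      throw new TypeError("opaque stream: contents are not readable by script");
    }
    return this.chunks.shift();
  }

  // Stand-in for the user agent's internal access to the stream data.
  static internalRead(stream: SketchStream): Uint8Array | undefined {
    return stream.chunks.shift();
  }
}

class SketchSourceBuffer {
  private appendedBytes = 0;

  // Models appendStream(): drain the stream chunk by chunk into the buffer.
  appendStream(stream: SketchStream): void {
    let chunk = SketchStream.internalRead(stream);
    while (chunk !== undefined) {
      this.appendedBytes += chunk.byteLength; // a real UA would feed the decoder
      chunk = SketchStream.internalRead(stream);
    }
  }

  // Exposes only a byte count in this model, never the bytes themselves.
  bytesAppended(): number {
    return this.appendedBytes;
  }
}
```

In this model the page can hand the stream to the source buffer, but calling
read() on it directly throws.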



On Fri, Feb 20, 2015 at 10:36 AM, Brad Hill <hillbrad@gmail.com> wrote:

> I agree this sounds like a reasonable, incremental step, that is
> technically sound, does not introduce new incentives against improving
> security, and is consistent with other web platform work and direction.
>
> -Brad Hill
>
> On Fri Feb 20 2015 at 9:55:39 AM Aaron Colwell <acolwell@google.com>
> wrote:
>
>> Hi Ryan,
>>
>> Thanks for writing this up. I know you already know this, but I wanted to
>> publicly declare my support as one of the MSE editors. While I wish we
>> didn't need this, I can understand the concerns of content providers and I
>> think this is a reasonable compromise.
>>
>> Aaron
>>
>> On Thu Feb 19 2015 at 9:06:17 PM Ryan Sleevi <sleevi@google.com> wrote:
>>
>>> Cross-posting, as this touches on the Fetch [1] spec, Media Source
>>> Extensions [2], and Mixed Content [3]. This does cross-post WHATWG and
>>> W3C, apologies if this is a mortal sin.
>>>
>>> TL;DR Proposal first:
>>> - Amend MIX in [4] to add "fetch" as an
>>> optionally-blockable-request-context
>>>   * This means that fetch() can now return HTTP content from HTTPS
>>> pages. The implications of this, however, are described below, if you
>>> can handle reading it all.
>>> - Amend MSE in [5] to introduce a new method, appendResponse(Response
>>> response), which accepts a Response [6] class
>>> - In MSE, define a Response Append Loop similar to the Stream Append
>>> Loop [7], that calls the consume body algorithm [8] on the internal
>>> response [9] of Response to yield an ArrayBuffer, then executes the
>>> buffer append [10] algorithm on the SourceBuffer
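
A minimal sketch of the Response Append Loop proposed above, assuming the
user agent (not page script) consumes the body. SketchResponse, SketchBuffer,
and consumeInternalBody are hypothetical stand-ins, not the Fetch Response or
MSE SourceBuffer interfaces, and a Uint8Array stands in for the ArrayBuffer
mentioned in the proposal.

```typescript
// Hypothetical model only: not the Fetch Response or MSE SourceBuffer interfaces.

type SketchResponseType = "basic" | "opaque";

class SketchResponse {
  readonly type: SketchResponseType;
  private internalBody: Uint8Array;

  constructor(type: SketchResponseType, body: Uint8Array) {
    this.type = type;
    this.internalBody = body;
  }

  // What page script can read: nothing, when the response is opaque.
  scriptVisibleBody(): Uint8Array | null {
    return this.type === "opaque" ? null : new Uint8Array(this.internalBody);
  }

  // Stand-in for the user agent consuming the body of the internal response.
  static consumeInternalBody(response: SketchResponse): Uint8Array {
    return new Uint8Array(response.internalBody);
  }
}

class SketchBuffer {
  private appendedBytes = 0;

  // Stand-in for the buffer append algorithm; a real UA would parse and decode.
  appendBuffer(data: Uint8Array): void {
    this.appendedBytes += data.byteLength;
  }

  bytesAppended(): number {
    return this.appendedBytes;
  }
}

// Models the proposed loop: consume the internal body, then append it.
function appendResponse(buffer: SketchBuffer, response: SketchResponse): void {
  const data = SketchResponse.consumeInternalBody(response);
  buffer.appendBuffer(data);
}
```

The point of the shape is that the bytes travel from the response to the
buffer without ever being returned to the caller.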
>>>
>>>
>>> MUCH longer justification why:
>>> As it stands, <audio>/<video>/<source> tags today are optionally
>>> blockable content, as noted in [4]. Thus, an HTTPS page may set the
>>> source to HTTP content and load the content (although typically with
>>> user-agent indication). MSE poses itself as a spec to offer much
>>> greater control to site authors than <audio>/<video>, as noted in its
>>> use cases, and as a result, has seen a rapid adoption among a number
>>> of popular video streaming sites. Most notably, the ability to do
>>> adaptive streaming with MSE helps provide a better quality, better
>>> performing experience for users. Finally, in some user agents, MSE is
>>> a pre-requisite for the use of Encrypted Media Extensions [11].
>>>
>>> However, there are limitations to using MSE that don't exist with
>>> <video>/<audio>. The most notable of these is that in order to
>>> implement the adaptive streaming capabilities, most sites make use of
>>> XMLHttpRequest to request portions of media content, which can then be
>>> supplied to the SourceBuffer. Based on the feedback that MSE provides,
>>> the script author can then adjust the XHRs they make to use a
>>> lower bitrate media source, to drop segments, etc. When using XHR, the
>>> site author loses the ability to mix HTTPS pages with HTTP media, as
>>> XHR is (rightfully so) treated as blocked content.
>>>
>>> The justification for why XHR does this is that it returns the full
>>> buffer to the page author. In practice, we saw many sites then taking
>>> that buffer and making security decisions on it - ranging from
>>> "clearly" bad things such as eval()ing the content to more subtle
>>> things like adjusting UI or links. All of these undermine all of the
>>> security guarantees that HTTPS tries to provide, and thus XHR is
>>> blocked.
>>>
>>> The result is that if an HTTPS site wants to use MSE with XHR, all of
>>> the content needs to be served via HTTPS. We've already seen some
>>> providers complain that this is prohibitively expensive in their
>>> current networks [12], although it may be solvable in time, as
>>> demonstrated by other video sharing sites [13].
>>>
>>> In a choice between using MSE - which offers a better user experience
>>> over <video>/<audio> by reducing bandwidth and improving quality - and
>>> using HTTPS - which offers better privacy and security controls -
>>> sites are likely to choose solutions that reduce their costs rather
>>> than protect their users, a reasonable but unfortunate business
>>> reality.
>>>
>>> I'm hoping to find a way to close that gap - to allow sites to use MSE
>>> (and potentially EME) via HTTPS documents, while still sourcing their
>>> media content via HTTP. This may seem counter-intuitive, and a step
>>> back from the efforts of the Chrome security team, but I think it is
>>> actually consistent with our goals and our past comments. In
>>> particular, this solution tries to provide a means and incentive for
>>> sites to adopt MSE (improving user experience) AND to begin migrating
>>> to HTTPS; first with their main document, and then, in time, all of
>>> their media content.
>>>
>>> This won't prevent adversaries from knowing what content the user is
>>> actively watching, for example, but will help protect other vital
>>> assets - such as their cookies, session identifiers, user information,
>>> friends list, past viewing history, etc.
>>>
>>> Allowing fetch() to return HTTP content sourced from HTTPS pages seems
>>> like it would re-open the XHR hole, but this isn't the case. As
>>> described in [14], all requests whose mode is CORS or
>>> CORS-with-forced-preflight are force-failed. This only leaves the
>>> request modes of "no-cors", "same-origin", "about", and "data". Because
>>> the origins are different between the document (https) and the request
>>> URL (http), the request mode will be "no-cors", and thus the returned
>>> Response object will be set to "opaque".
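
The reasoning above can be sketched as a small decision function. This is a
simplified model under stated assumptions, not the Fetch or Mixed Content
algorithms: mixedFetchOutcome, schemeOf, and the example URLs are
hypothetical, and only the case of an HTTPS document requesting an HTTP URL
is covered.

```typescript
// Hypothetical model of the reasoning above; not the Fetch or Mixed Content algorithms.

type SketchMode = "same-origin" | "no-cors" | "cors" | "cors-with-forced-preflight";
type SketchOutcome = "network-error" | "opaque-response";

// Returns the lowercased scheme, e.g. "https" for "https://example.com/".
function schemeOf(url: string): string {
  return url.slice(0, url.indexOf(":")).toLowerCase();
}

function mixedFetchOutcome(
  documentUrl: string,
  requestUrl: string,
  mode: SketchMode
): SketchOutcome {
  if (schemeOf(documentUrl) !== "https" || schemeOf(requestUrl) !== "http") {
    throw new Error("sketch only covers an HTTPS document requesting an HTTP URL");
  }
  if (mode === "cors" || mode === "cors-with-forced-preflight") {
    return "network-error"; // force-failed as mixed content
  }
  if (mode === "same-origin") {
    return "network-error"; // an http: URL is never same-origin with an https: document
  }
  return "opaque-response"; // "no-cors": script gets a Response it cannot read
}
```

For example, mixedFetchOutcome("https://example.com/watch",
"http://media.example.com/seg1.mp4", "no-cors") returns "opaque-response",
while the same call with "cors" returns "network-error".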
>>>
>>> The "opaque" response prevents direct access to the Response data.
>>> Similarly, the SourceBuffer object does not allow direct access to the
>>> data - this is only passed on to the audio/video decoders, same as the
>>> existing <audio>/<video>/<source> tags today. I realize this may
>>> prevent access to the full capabilities of MSE; indeed, some use cases
>>> require access to the content in order to do adaptive streaming.
>>> However, there still seem to be a number of use cases where it can work, or
>>> where existing solutions that do require direct access to content may
>>> be adjusted, slightly, so that they don't.
>>>
>>> In discussing this, internally and with other vendors, the primary
>>> security implication of this is that of privacy leakage. However, this
>>> problem exists regardless of fetch(), due to the fact that script can
>>> always inject any of the optionally-blockable content tags into the
>>> page and leak information. That is, I can always disclose content by
>>> using a <video> or <img> tag directly, and I can always smuggle back a
>>> few bits of information at a time (for example, using the width/height
>>> of the image to smuggle back 4-8 bytes at a time, or, even more
>>> primitively, using onload/onerror to smuggle a bit at a time back).
>>>
>>> Further, I'm not proposing that there be any special UI handling for
>>> these mixed-content fetch()'s - that is, they behave as the user agent
>>> already does when encountering passive mixed content (e.g. some form
>>> of UI warning/degradation). So performing these fetch()'s will NOT
>>> yield positive security indicators. Of course, as proposals like [15]
>>> mature, it may be far more desirable for sites to have HTTPS with mixed
>>> content compared to HTTP, thus making this proposal even more
>>> attractive than the HTTP counterpart.
>>>
>>> Overall, the hope is to provide incentives for media sharing sites to
>>> begin migrating to HTTPS, allowing them to keep the existing features
>>> they have over HTTP (in this case, MSE), and potentially allowing for
>>> a migration path that allows the staged deprecation of more
>>> powerful, privacy-sensitive features like EME [16] over HTTP,
>>> while not taking any steps backwards in terms of
>>> privacy or security for fetch() or HTTPS pages.
>>>
>>> This is not meant to be a long-term solution for optionally-blockable
>>> content. I absolutely think we should be working to wean sites off
>>> this and move them away. However, in the trade-off between having
>>> major sites using HTTP or having to prolong optionally-blockable
>>> content for some additional, defined period of time, I absolutely
>>> believe the latter is in the greater interest of web security, and
>>> consistent with the findings of the W3C's TAG [17].
>>>
>>> So, beyond telling me I wrote way too much, what do people think?
>>>
>>> [1] https://fetch.spec.whatwg.org/
>>> [2] http://w3c.github.io/media-source/
>>> [3] https://w3c.github.io/webappsec/specs/mixedcontent/
>>> [4] https://w3c.github.io/webappsec/specs/mixedcontent/#category-optionally-blockable
>>> [5] http://w3c.github.io/media-source/#sourcebuffer
>>> [6] https://fetch.spec.whatwg.org/#response-class
>>> [7] http://w3c.github.io/media-source/#sourcebuffer-stream-append-loop
>>> [8] https://fetch.spec.whatwg.org/#body
>>> [9] https://fetch.spec.whatwg.org/#concept-internal-response
>>> [10] http://w3c.github.io/media-source/#sourcebuffer-buffer-append
>>> [11] https://w3c.github.io/encrypted-media/
>>> [12] https://lists.w3.org/Archives/Public/www-tag/2014Oct/0105.html
>>> [13] https://www.youtube.com/
>>> [14] https://w3c.github.io/webappsec/specs/mixedcontent/#should-block-fetch
>>> [15] https://lists.w3.org/Archives/Public/public-webappsec/2014Dec/0062.html
>>> [16] https://www.w3.org/Bugs/Public/show_bug.cgi?id=26332
>>> [17] https://w3ctag.github.io/web-https/
>>>
>>
Received on Friday, 10 April 2015 22:25:03 UTC