Re: [SRI] Escaping mixed-content blocking for video distribution from Mark Watson on 2014-11-12 (public-webappsec@w3.org from November 2014)

From: Mark Watson <watsonm@netflix.com>
Date: Wed, 12 Nov 2014 11:44:22 -0800
To: Brad Hill <hillbrad@fb.com>
Cc: Adam Langley <agl@google.com>, Mike West <mkwst@google.com>, Frederik Braun <fbraun@mozilla.com>, "public-webappsec@w3.org" <public-webappsec@w3.org>
Message-ID: <CAEnTvdDxoWYD74Tj9Mqq-Z_ne9kMN0FG2Y2cEfJ3UEJE8TFR6Q@mail.gmail.com>
On Wed, Nov 12, 2014 at 11:22 AM, Brad Hill <hillbrad@fb.com> wrote:

>  Mark,
>
>   There is work going on in the OAuth WG on authenticating HTTP requests:
>
>  http://tools.ietf.org/html/draft-ietf-oauth-signed-http-request-00
>
>   Have you looked at this to see if it is suitable for your use case?
>

I think that is about enabling the server to authenticate the request.
What I think we need is for the UA to verify that the request processed by
the server was the same as the one it sent, so that the UA can be sure the
traffic is not subject to attacks such as the Verizon "perma-cookie".


>
>   I think we would definitely like to continue the discussion on SRI for
> insecure origins, and on methods like unbalanced Merkle Tree hashing to
> apply integrity to streamed data, but the consensus seems to be strong that
> these should be "Level >= 2" features, and the discussion should be
> informed by the results of experimenting with the minimum-viable set of
> features currently proposed for Level 1.
>

I'm offering no opinion on standardization schedule, just asking for
opinions on whether this is worth working on.

...Mark



>
>  -Brad
>
>   From: Mark Watson <watsonm@netflix.com>
> Date: Wednesday, November 12, 2014 at 9:22 AM
> To: Adam Langley <agl@google.com>
> Cc: Mike West <mkwst@google.com>, Frederik Braun <fbraun@mozilla.com>,
> "public-webappsec@w3.org" <public-webappsec@w3.org>
> Subject: Re: [SRI] Escaping mixed-content blocking for video distribution
> Resent-From: <public-webappsec@w3.org>
> Resent-Date: Wednesday, November 12, 2014 at 9:23 AM
>
>    All,
>
>  Are there any further thoughts on this? Again, a solution here offers
> the prospect of making it much easier / quicker for video distribution
> sites to migrate to secure origins, with the associated user benefits.
>
>  The proposal, such as it is, is to add a request integrity mechanism to
> the existing SRI mechanism (see below for a video-specific version of SRI).
> A new Request-Integrity HTTP header would be included in the HTTP response
> giving an HMAC of the entire request as received by the server. The key
> used for this HMAC would be provided to the UA in the same way as the hash
> used for SRI. It would be a matter for the site to arrange for this key to
> be shared between client and server. The UA would verify this HMAC as well
> as the resource hash. If both pass, the resource is allowed as mixed
> content. If not, the resource is requested over HTTPS instead.
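
A minimal sketch of the check described above, in Python for illustration only. The message specifies only "an HMAC of the entire request as received by the server"; the canonical serialization, the choice of SHA-256, the hex encoding, and every function name here are assumptions made for this example, not part of the proposal.

```python
import hashlib
import hmac


def canonical_request(method, path, headers):
    """Serialize a request deterministically (hypothetical canonical form)."""
    lines = [f"{method.upper()} {path}"]
    for name in sorted(headers, key=str.lower):
        lines.append(f"{name.lower()}:{headers[name].strip()}")
    return "\n".join(lines).encode("utf-8")


def request_integrity(key, method, path, headers):
    """Value the server would place in the Request-Integrity response header."""
    message = canonical_request(method, path, headers)
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def ua_accepts(key, sent_request, integrity_header, body, expected_body_sha256):
    """UA side: accept as mixed content only if both checks pass."""
    expected_mac = request_integrity(key, *sent_request)
    mac_ok = hmac.compare_digest(expected_mac, integrity_header)
    body_digest = hashlib.sha256(body).hexdigest()
    body_ok = hmac.compare_digest(body_digest, expected_body_sha256)
    return mac_ok and body_ok


# Example: a middlebox injects a header, so the server's HMAC no longer
# matches what the UA computes over the request it actually sent.
key = b"k" * 32  # assumed to be shared out of band between page and server
sent = ("GET", "/video/seg1.mp4", {"Host": "cdn.example", "Range": "bytes=0-99"})
body = b"media bytes"
body_hash = hashlib.sha256(body).hexdigest()

clean_header = request_integrity(key, *sent)
tampered_headers = dict(sent[2], **{"X-Injected-Id": "abc"})
tampered_header = request_integrity(key, sent[0], sent[1], tampered_headers)

clean_ok = ua_accepts(key, sent, clean_header, body, body_hash)
tampered_ok = ua_accepts(key, sent, tampered_header, body, body_hash)
```

When `ua_accepts` returns false, the UA would fall back to requesting the resource over HTTPS, as the paragraph above describes.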
>
>  Request integrity is likely broken by many middleboxes which modify HTTP
> headers. In these cases we use HTTPS instead. Nevertheless, the fraction of
> video traffic that needed to use HTTPS would likely be small (and would
> diminish, since the middleboxes in this case would be serving no useful
> purpose whatsoever.)
>
>  There is likely an additional step needed in the request integrity
> mechanism to make it secure. If the UA was involved in choosing the key,
> the UA could ensure that random keys are used, rather than a fixed key for
> a given site, for example. I'm not proposing a finished design, but I think
> the details could be worked out.
>
>  Regarding a video-specific alternative to SRI as suggested by AGL: The
> existing SRI, when used with XHR, means that media data is not available
> for playback until the request is complete. Also, the mechanism applies to
> all data, not just audio/video data. A mechanism restricted to audio/video
> would be sufficient to achieve the goals here and would reduce the attack
> surface somewhat (just as audio/video fetched directly by the video element
> sometimes escapes mixed-content blocking).
>
>  The idea would be to allow XHR to HTTP resources, but only for the
> Stream return type and then returning a special kind of Stream which works
> only with the Media Source Extension. Furthermore, the Media Source
> Extension would expect such a Stream to contain embedded integrity data and
> for the page to provide block-by-block integrity information. Specifically,
> for mp4, we would provide a new data structure in the Movie Fragment header
> giving hashes of the movie data (this kind of thing has been discussed a
> few times in the past). The page would be expected to provide the hash of
> this new data structure for each Movie Fragment.
>
>  I think the generic mechanism is sufficient and much simpler, but the
> video-specific option restricts the attack surface. However, the idea of a
> "special kind" of Stream object is a little clunky.
>
>  From a complexity / standardization perspective I can see the attraction
> of simply saying that people should use HTTPS, but that is likely going to
> take much longer in practice. In the meantime, the absence of a
> mixed-content solution for video distribution means there will continue to
> be pressure to keep various APIs desired by those sites open to insecure
> origins.
>
>  ...Mark
>
>
> On Tue, Nov 4, 2014 at 6:18 PM, Mark Watson <watsonm@netflix.com> wrote:
>
>>
>>
>> On Tue, Nov 4, 2014 at 5:58 PM, Adam Langley <agl@google.com> wrote:
>>
>>> On Tue, Nov 4, 2014 at 5:46 PM, Mark Watson <watsonm@netflix.com> wrote:
>>> > I assumed the script was going to provide the hashes, since the content
>>> > would be coming over HTTP.
>>>
>>> That's a simple solution, but it wasn't what I had in mind at the time.
>>>
>>> Consider an HD movie that's 10GiB in size. Chunks of data cannot be
>>> processed before they have been verified and we don't want to add too
>>> much verification latency. So let's posit that 16KiB chunks are used.
>>>
>>
>>   Let's say the movie is 2 hours long. Typically, adaptive streaming
>> downloads data in small chunks, say 2s in duration. It would be reasonable
>> to hash each such chunk, so there would be 3600 hashes = 112KB (I don't see
>> any reason to base64 them, they can be pulled from wherever they come with
>> XHR).
>>
>>  A 2s chunk of video, in this example, is 2.8MB, so this overhead is
>> small (obviously it's bigger for lower bitrates).
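
The arithmetic above can be checked with a short Python sketch. It assumes 32-byte hashes (the figure used elsewhere in this thread) and reads the movie size as 10 GB decimal, which is what reproduces the 2.8MB figure; with 10 GiB the chunk comes out closer to 3.0 MB.

```python
MOVIE_SECONDS = 2 * 60 * 60   # 2-hour movie
CHUNK_SECONDS = 2             # adaptive-streaming chunk duration
HASH_BYTES = 32               # assumption: one 32-byte hash per chunk
MOVIE_BYTES = 10 * 10**9      # assumption: 10 GB read as decimal

num_chunks = MOVIE_SECONDS // CHUNK_SECONDS        # 3600 chunks
hash_table_bytes = num_chunks * HASH_BYTES         # 115200 bytes of hashes
hash_table_kib = hash_table_bytes / 1024           # 112.5 KiB
chunk_megabytes = MOVIE_BYTES / num_chunks / 1e6   # about 2.8 MB per chunk
overhead_fraction = hash_table_bytes / MOVIE_BYTES # hash bytes per media byte

print(num_chunks, hash_table_kib, round(chunk_megabytes, 1), overhead_fraction)
```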
>>
>>  It's true that having to wait for a complete chunk to download before
>> playback will affect the quality of experience. There will be corner cases
>> where playback could have continued if a chunk could be played before
>> verification, but it will stall waiting for the completion of the chunk.
>>
>>
>>>
>>> If all the hashes for those chunks were sent upfront in the HTML then
>>> there are 10 * 2^^30 / 2^^14 chunks * 32 bytes per hash * 4/3 base64
>>> expansion = ~27MB of hashes to send the client before anything else.
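
The same estimate worked through in Python, as a rough check rather than anything normative. It uses the figures quoted above (10GiB, 16KiB chunks, 32 bytes per hash) and approximates base64 expansion as a flat 4/3, ignoring padding.

```python
MOVIE_BYTES = 10 * 2**30   # 10 GiB
CHUNK_BYTES = 2**14        # 16 KiB
HASH_BYTES = 32            # bytes per hash

num_chunks = MOVIE_BYTES // CHUNK_BYTES   # 655360 chunks
raw_hash_bytes = num_chunks * HASH_BYTES  # 20 MiB of raw hashes
base64_bytes = raw_hash_bytes * 4 / 3     # flat 4/3 expansion, padding ignored
base64_mib = base64_bytes / 2**20         # about 26.7 MiB (roughly 28 MB decimal)

print(num_chunks, raw_hash_bytes, round(base64_mib, 1))
```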
>>>
>>> With the Merkle tree construction, the hash data can be interleaved in
>>> the 10GiB stream such that they are only downloaded as needed. The
>>> downside is that you either need a server capable of doing the
>>> interleaving dynamically, or you need two copies of the data on disk:
>>> one with interleaved hashes and one without. (Unless the data format
>>> is sufficiently forgiving that you can get away with serving the
>>> interleaved version to clients that aren't doing SRI processing.)
>>>
>>
>>   Ok, so I guess this could solve the above problem, by having the site
>> provide a hash for each 2s chunk, say, where this hash is actually the hash
>> of the concatenation of the hashes for smaller pieces of that chunk and
>> these hashes are embedded in the file *before* the chunk they pertain to.
>> That way data could be fed to the video decoder closer to when it arrives.
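
A sketch of the two-level scheme just described, in Python for illustration. The page supplies one trusted hash per chunk; that hash covers the concatenation of the per-piece hashes, which arrive embedded ahead of the chunk data. SHA-256, the 16 KiB piece size, and all function names are assumptions for the example.

```python
import hashlib

PIECE_SIZE = 16 * 1024  # assumption: 16 KiB pieces within each chunk


def piece_hashes(chunk, piece_size=PIECE_SIZE):
    """Server side: hash each piece of a chunk."""
    return [hashlib.sha256(chunk[i:i + piece_size]).digest()
            for i in range(0, len(chunk), piece_size)]


def chunk_root(hashes):
    """Hash of the concatenated piece hashes; this is what the page provides."""
    return hashlib.sha256(b"".join(hashes)).digest()


def verify_stream(trusted_root, embedded_hashes, pieces):
    """Yield each piece as soon as it verifies; raise ValueError otherwise."""
    if chunk_root(embedded_hashes) != trusted_root:
        raise ValueError("embedded hash list does not match trusted chunk hash")
    count = 0
    for piece in pieces:
        if count >= len(embedded_hashes):
            raise ValueError("more pieces than embedded hashes")
        if hashlib.sha256(piece).digest() != embedded_hashes[count]:
            raise ValueError(f"piece {count} failed verification")
        count += 1
        yield piece  # safe to hand to the decoder before the chunk completes
    if count != len(embedded_hashes):
        raise ValueError("chunk truncated")


# Example: a 50 KB chunk split into four pieces.
chunk = bytes(range(256)) * 200
hashes = piece_hashes(chunk)
root = chunk_root(hashes)
pieces = [chunk[i:i + PIECE_SIZE] for i in range(0, len(chunk), PIECE_SIZE)]
verified = b"".join(verify_stream(root, hashes, pieces))
```

Only the embedded hash list has to be complete before playback of a chunk can begin, so verification latency is bounded by one piece rather than one whole chunk.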
>>
>>  At least in the fragmented mp4 file case there are plenty of ways to
>> embed stuff that will be ignored by existing clients that don't understand
>> it.
>>
>>  At present we are using the Media Source Extensions, with the data
>> being retrieved into an ArrayBuffer with XHR and this ArrayBuffer being fed
>> into the Media Source. The XHR does not know the data is for media
>> playback, so it couldn't do the above.
>>
>>  However, we are discussing how to integrate with Streams, so that a
>> Stream obtained from the XHR would be connected directly to the Media
>> Source. I guess in this case there could be some media-specific integrity
>> checking on the Media Source side that allows this otherwise "untrusted"
>> XHR data to be used. In this case the data would never be exposed to JS.
>>
>>  ...Mark
>>
>>
>>
>>>
>>>
>>> Cheers
>>>
>>> AGL
>>>
>>
>>
>
Received on Wednesday, 12 November 2014 19:44:50 UTC