Re: [SRI] Escaping mixed-content blocking for video distribution

On Wed, Nov 12, 2014 at 11:22 AM, Brad Hill <> wrote:

>  Mark,
>   There is work going on in the OAuth WG on authenticating HTTP requests:
>   Have you looked at this to see if it is suitable for your use case?

I think that is about enabling the server to authenticate the request.
What I think we need is for the UA to verify that the request processed by
the server was the same as the one it sent, so that the UA can be sure the
traffic is not subject to attacks such as the Verizon header injection.

>   I think we would definitely like to continue the discussion on SRI for
> insecure origins, and on methods like unbalanced Merkle Tree hashing to
> apply integrity to streamed data, but the consensus seems to be strong that
> these should be "Level >= 2" features, and the discussion should be
> informed by the results of experimenting with the minimum-viable set of
> features currently proposed for Level 1.

I'm offering no opinion on standardization schedule, just asking for
opinions on whether this is worth working on.


>  -Brad
>   From: Mark Watson <>
> Date: Wednesday, November 12, 2014 at 9:22 AM
> To: Adam Langley <>
> Cc: Mike West <>, Frederik Braun <>, "" <>
> Subject: Re: [SRI] Escaping mixed-content blocking for video distribution
> Resent-From: <>
> Resent-Date: Wednesday, November 12, 2014 at 9:23 AM
>    All,
>  Are there any further thoughts on this? Again, a solution here offers
> the prospect of making it much easier / quicker for video distribution
> sites to migrate to secure origins, with the associated user benefits.
>  The proposal, such as it is, is to add a request integrity mechanism to
> the existing SRI mechanism (see below for a video-specific version of SRI).
> A new Request-Integrity HTTP header would be included in the HTTP response
> giving an HMAC of the entire request as received by the server. The key
> used for this HMAC would be provided to the UA in the same way as the hash
> used for SRI. It would be a matter for the site to arrange for this key to
> be shared between client and server. The UA would verify this HMAC as well
> as the resource hash. If both pass, the resource is allowed as mixed
> content. If not, the resource is requested over HTTPS instead.
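The flow described above can be sketched roughly as follows (the `Request-Integrity` name comes from the proposal itself, but the canonical form of the request, the key-exchange step, and SHA-256 as the HMAC hash are all assumptions here, since the email stresses that the details are not worked out):

```python
import hashlib
import hmac


def canonical_request(method: str, path: str, headers: dict) -> bytes:
    # Assumed canonicalization: method, path, then headers sorted by name.
    lines = [method, path] + [f"{k.lower()}: {v}" for k, v in sorted(headers.items())]
    return "\n".join(lines).encode()


def server_sign(key: bytes, method: str, path: str, headers: dict) -> str:
    # The server computes an HMAC over the request *as received* and
    # returns it in the (proposed) Request-Integrity response header.
    return hmac.new(key, canonical_request(method, path, headers),
                    hashlib.sha256).hexdigest()


def ua_verify(key: bytes, method: str, path: str, headers: dict,
              header_value: str) -> bool:
    # The UA recomputes the HMAC over the request *as sent* and compares.
    expected = hmac.new(key, canonical_request(method, path, headers),
                        hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, header_value)
```

If a middlebox rewrites any part of the request in transit, the two HMACs diverge, verification fails, and the UA falls back to fetching the resource over HTTPS, as the email describes.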
>  Request integrity is likely broken by many middleboxes which modify HTTP
> headers. In these cases we use HTTPS instead. Nevertheless, the fraction of
> video traffic that needed to use HTTPS would likely be small (and would
> diminish, since the middleboxes in this case would be serving no useful
> purpose whatsoever.)
>  There is likely an additional step needed in the request integrity
> mechanism to make it secure. If the UA was involved in choosing the key,
> the UA could ensure that random keys are used, rather than a fixed key for
> a given site, for example. I'm not proposing a finished design, but I think
> the details could be worked out.
>  Regarding a video-specific alternative to SRI as suggested by AGL: The
> existing SRI, when used with XHR, means that media data is not available
> for playback until the request is complete. Also, the mechanism applies to
> all data, not just audio/video data. A mechanism restricted to audio/video
> would be sufficient to achieve the goals here and would reduce the attack
> surface somewhat (just as audio/video fetched directly by the video element
> sometimes escapes mixed-content blocking).
>  The idea would be to allow XHR to HTTP resources, but only for the
> Stream return type and then returning a special kind of Stream which works
> only with the Media Source Extensions. Furthermore, the Media Source
> Extensions would expect such a Stream to contain embedded integrity data and
> for the page to provide block-by-block integrity information. Specifically,
> for mp4, we would provide a new data structure in the Movie Fragment header
> giving hashes of the movie data (this kind of thing has been discussed a
> few times in the past). The page would be expected to provide the hash of
> this new data structure for each Movie Fragment.
>  I think the generic mechanism is sufficient and much simpler, but the
> video-specific option restricts the attack surface. However, the idea of a
> "special kind" of Stream object is a little clunky.
>  From a complexity / standardization perspective I can see the attraction
> of simply saying that people should use HTTPS, but that is likely going to
> take much longer in practice. In the meantime, the absence of a
> mixed-content solution for video distribution means there will continue to
> be pressure to keep various APIs desired by those sites open to insecure
> origins.
>  ...Mark
> On Tue, Nov 4, 2014 at 6:18 PM, Mark Watson <> wrote:
>> On Tue, Nov 4, 2014 at 5:58 PM, Adam Langley <> wrote:
>>> On Tue, Nov 4, 2014 at 5:46 PM, Mark Watson <> wrote:
>>> > I assumed the script was going to provide the hashes, since the content
>>> > would be coming over HTTP.
>>> That's a simple solution, but it wasn't what I had in mind at the time.
>>> Consider an HD movie that's 10GiB in size. Chunks of data cannot be
>>> processed before they have been verified and we don't want to add too
>>> much verification latency. So let's posit that 16KiB chunks are used.
>>   Let's say the movie is 2 hours long. Typically, adaptive streaming
>> downloads data in small chunks, say 2s in duration. It would be reasonable
>> to hash each such chunk, so there would be 3600 hashes = 112KB (I don't see
>> any reason to base64 them, they can be pulled from wherever they come with
>> XHR).
>>  A 2s chunk of video, in this example, is 2.8MB, so this overhead is
>> small (obviously it's bigger for lower bitrates).
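The arithmetic above checks out (a quick sketch; the ~2.8 MB per-chunk figure assumes the 10 GB movie size is decimal gigabytes):

```python
# Overhead arithmetic for the example: 2-hour movie, 2-second chunks,
# one 32-byte SHA-256 hash per chunk.
duration_s = 2 * 3600
chunk_duration_s = 2
chunks = duration_s // chunk_duration_s          # 3600 chunks
hash_bytes = chunks * 32                         # 115200 bytes, ~112 KiB
# Per-chunk size for a 10 GB movie (decimal gigabytes assumed):
chunk_bytes = 10 * 10**9 // duration_s * chunk_duration_s  # ~2.8 MB
```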
>>  It's true that having to wait for a complete chunk to download before
>> playback will affect the quality of experience. There will be corner cases
>> where playback could have continued if a chunk could be played before
>> verification, but it will stall waiting for the completion of the chunk.
>>> If all the hashes for those chunks were sent upfront in the HTML then
>>> there are 10 * 2^30 / 2^14 chunks * 32 bytes per hash * 4/3 base64
>>> expansion = ~27MB of hashes to send the client before anything else.
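The ~27MB figure works out (a quick check of the numbers in the paragraph above):

```python
# Up-front hash list for a 10 GiB file hashed in 16 KiB chunks,
# with 32-byte hashes and 4/3 base64 expansion.
chunks = 10 * 2**30 // 2**14    # 655360 chunks
raw_bytes = chunks * 32         # 20 MiB of raw hashes
b64_bytes = raw_bytes * 4 // 3  # ~28 MB once base64-encoded
```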
>>> With the Merkle tree construction, the hash data can be interleaved in
>>> the 10GiB stream such that they are only downloaded as needed. The
>>> downside is that you either need a server capable of doing the
>>> interleaving dynamically, or you need two copies of the data on disk:
>>> one with interleaved hashes and one without. (Unless the data format
>>> is sufficiently forgiving that you can get away with serving the
>>> interleaved version to clients that aren't doing SRI processing.)
>>   Ok, so I guess this could solve the above problem, by having the site
>> provide a hash for each 2s chunk, say, where this hash is actually the hash
>> of the concatenation of the hashes for smaller pieces of that chunk and
>> these hashes are embedded in the file *before* the chunk they pertain to.
>> That way data could be fed to the video decoder closer to when it arrives.
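That construction can be sketched as follows (the piece size and SHA-256 are assumptions; the point is that the single page-supplied hash commits to the in-band piece hashes, which in turn let each piece be verified as soon as it arrives):

```python
import hashlib

PIECE = 16 * 1024  # assumed piece size; the thread suggests 16 KiB


def piece_hashes(chunk: bytes) -> list:
    # Hashes of the smaller pieces, embedded in the file before the chunk.
    return [hashlib.sha256(chunk[i:i + PIECE]).digest()
            for i in range(0, len(chunk), PIECE)]


def chunk_digest(chunk: bytes) -> bytes:
    # What the page provides per chunk: the hash of the concatenated
    # piece hashes, not of the chunk data itself.
    return hashlib.sha256(b"".join(piece_hashes(chunk))).digest()


def verify_streaming(expected: bytes, embedded: list, pieces: list) -> bool:
    # First check the embedded hash list against the page-provided digest...
    if hashlib.sha256(b"".join(embedded)).digest() != expected:
        return False
    # ...then each piece can be verified (and fed to the decoder) as it arrives.
    return all(hashlib.sha256(p).digest() == h
               for p, h in zip(pieces, embedded))
```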
>>  At least in the fragmented mp4 file case there are plenty of ways to
>> embed stuff that will be ignored by existing clients that don't understand
>> it.
>>  At present we are using the Media Source Extensions, with the data
>> being retrieved into an ArrayBuffer with XHR and this ArrayBuffer being fed
>> into the Media Source. The XHR does not know the data is for media
>> playback, so it couldn't do the above.
>>  However, we are discussing how to integrate with Streams, so that a
>> Stream obtained from the XHR would be connected directly to the Media
>> Source. I guess in this case there could be some media-specific integrity
>> checking on the Media Source side that allows this otherwise "untrusted"
>> XHR data to be used. In this case the data would never be exposed to JS.
>>  ...Mark
>>> Cheers
>>> AGL

Received on Wednesday, 12 November 2014 19:44:50 UTC