Re: HTTP/2 Server Push and solid compression

Hi Alan,

I absolutely agree with your premise. I think you've identified a
good example of the kind of usage that is not well supported by
existing options for compression in HTTP (which may help explain why
unbundling resources into server pushes has not seen wide adoption).

Finding a solution to this problem is extremely desirable, and a
number of attempts have been made to do so: that is, to allow
individual responses to access a shared or external compression
context, and thereby achieve good compression even for individually
small responses.
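
To make the idea concrete, here is a rough sketch using Python's
zlib preset-dictionary support (the dictionary contents and payload
below are invented for illustration; a real scheme would still have
to define how the shared context is negotiated and delivered):

    import zlib

    # The "common file": bytes likely to recur across many small
    # responses. (Contents invented for illustration.)
    dictionary = b'{"status": "ok", "content-type": "application/json", "data": '

    def compress_with_dict(payload):
        comp = zlib.compressobj(zdict=dictionary)
        return comp.compress(payload) + comp.flush()

    def decompress_with_dict(blob):
        decomp = zlib.decompressobj(zdict=dictionary)
        return decomp.decompress(blob)

    body = b'{"status": "ok", "content-type": "application/json", "data": [1, 2, 3]}'
    shared = compress_with_dict(body)
    standalone = zlib.compress(body)
    print(len(shared), len(standalone))  # shared context wins on small bodies
    assert decompress_with_dict(shared) == body

The per-response output then carries only what the dictionary does
not already cover, so small responses stop paying the full cost of a
cold compression context.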

For two examples, see Compression Dictionaries for HTTP/2 [1] and
Shared Dictionary Compression over HTTP [2]. In extremely broad
strokes, your two proposals are similar (respectively) to those two
drafts.

As Patrick mentions, these solutions have largely fallen victim to
security concerns in the past: mixing different types of data into
the same compression window can give attackers the ability to
exfiltrate private data by observing overall compression
effectiveness (i.e., CRIME/BREACH/HEIST). This is already a problem
in the existing world of HTTP compression, when applications allow
attacker-controlled and private user data to intermingle in a single
response. Extending that by allowing interactions between different
responses would throw gasoline on the fire.
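
To illustrate the failure mode with a toy (not any particular
attack; the secret and guesses below are made up), note how the
compressed length alone confirms a guess once attacker-chosen input
and a secret share one deflate window:

    import zlib

    SECRET = b"session=4f9a1c"  # stands in for a private token

    def observed_length(guess):
        # Attacker-chosen bytes and the secret share one deflate
        # window, as reflected input and a cookie might in a response.
        return len(zlib.compress(guess + SECRET))

    right = observed_length(b"session=4f9a1c")
    wrong = observed_length(b"session=zzzzzz")
    print(right, wrong)  # the matching guess yields the shorter output

An observer who sees only ciphertext lengths can repeat this
guess-by-guess to recover the secret; letting responses share a
compression context would hand that observer many more windows to
probe.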

I have been working to make another attempt at addressing these problems
(both the narrow one of resolving the security questions and the broad
one of building a solution overall).

I am working on a draft [3] that discusses the security concerns in this
space, which will hopefully let us chart a path forward. I hope to have
something to circulate in advance of the next meeting in July.

Longer term, I am working on building and deploying a dictionary-based
compression scheme for HTTP (at Facebook initially, with an eye towards
eventual standardization).

Collaboration on either front would be very welcome!

- Felix

[1]
https://tools.ietf.org/html/draft-vkrasnov-h2-compression-dictionaries-03
[2] https://tools.ietf.org/html/draft-lee-sdch-spec-00
[3] https://tools.ietf.org/html/draft-kucherawy-httpbis-dict-sec-00

On 5/21/19 11:16 AM, Alan Egerton wrote:
> On Tue, May 21, 2019 at 3:33 PM Alan Egerton <eggyal@gmail.com> wrote:
>> I see two possible solutions:
>>
>> (1) standardise the bundle format in order that caches can separate
>> and store the underlying resources: plenty of hazards here—especially
>> since there will no longer be an HTTP response per resource, requiring
>> metadata (including cache control etc.) to be encoded in some other way.  My
>> gut says this is probably a bad idea.
>>
>> (2) use a compression format that produces a separate output file for
>> each input file, yet still achieves better overall compression than
>> compressing the files individually: I imagine that this will produce
>> an additional output file that is common to/referenced by all the
>> compressed files being returned by that single operation;
>> decompression of any of the transmitted resources would be achieved
>> using only the common file and the resource-specific file as input.
> 
> Just following my own thoughts with an observation: in extremis, these
> two approaches can actually become analogous.
> 
> For example, a .tar.gz could serve as both the standardised "bundle"
> format (1) and the common output file (2) with the metadata
> transmitted in the form of separate HTTP responses (1) whose payloads
> reference the relevant constituent of that tarball (2).
> 
> I recognise that such an approach would also be a regression, because
> it defeats the benefits of HTTP/2's multiplexing (the constituents of
> the tarball only become available in sequence); therefore any solution
> of type (2) must balance the competing requirements to minimise both
> the "common file" and the overall size.  Perhaps there is no such
> balance that yields material benefit over the status quo.
> 
> -- Alan
> 
