Re: Dictionary Compression for HTTP (at Facebook)

On Fri, Sep 21, 2018 at 6:55 PM Patrick McManus <mcmanus@ducksong.com>
wrote:

> Hi Felix,
>
> On Fri, Sep 21, 2018 at 5:31 PM, Felix Handte <felixh@fb.com> wrote:
>
>> Very well, I will attempt to grab the bull by the horns, then. Let's
>> talk security.
>>
>> I guess my first question is this: What is the acceptance criterion for
>> proposals in this space with respect to security? From my survey of
>>
>
> you are not going to be able to pre-negotiate working group acceptance
> criteria. The criteria are what they always are - rough consensus on a draft
> from the working group and the approval of the IESG.
>
> But to help with more background: the past concern has been that there
> hasn't been sufficient, proactive analysis of the various proposals - and
> given that the mixture of compression and encryption is known to be a
> problem (as you mention), a bar of "no known problems" hasn't been enough to
> get anywhere near rough consensus. I believe people wanted to see a
> proactive analysis of what the concerns of a particular proposal are. At
> that point we can debate whether they are reasonable or not for their
> anticipated gains.
>
> make sense? You're certainly going in a reasonable direction considering
> the interactions of dictionaries, what attackers control, and the ways in
> which public and private data are mixed. Of course confidentiality can
> apply to 'public data' as well, and it's not clear how/if folks would want to
> handle that.
>

Exactly this. To expand on this a bit more, there are other technical
considerations, such as:
- Are the dictionaries dynamically constructed based on the resources?
Meaning that, unlike gzip et al. at the HTTP layer, and more like the
now-disabled TLS compression, there are inter-resource implications for the
security model. Does loading A then B reveal different information than
loading B then A? How should that be modeled? (See the sketch after this
list.)
- If dictionaries are static, how are they determined? Are they
baked into the specification, or can they be self-declared?
- If the site self-declares/creates its own dictionaries, how will clients
receive these dictionaries? What are the protocol interactions there as they
relate to both 'classic' HTTP cases (for example, a libcurl utility or a
simple proxy) and more complicated cases, like browsers, which have their
own set of loading behaviours?
- Are these things addressable at the HTTP protocol layer, or do they
require integration into the application-layer fabric (like a Web
browser)? How will that interact with functionality like CORS and CORB, to
prevent cross-origin leakages?
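
To make the first bullet concrete: the core worry is that compressed length
becomes an oracle whenever attacker-influenced content shares a compression
context (a dynamically built dictionary) with private data, in the style of
CRIME/BREACH. Here's a toy sketch of that effect - not tied to any
particular proposal - using zlib's preset-dictionary feature as a stand-in
for a shared dictionary; the resource contents and the secret are invented
for illustration.

import zlib

SECRET = b"session_token=9f8a2c1d"  # hypothetical private data

def deflated_len(payload: bytes, dictionary: bytes) -> int:
    # Length of the payload deflated against a shared (preset) dictionary.
    c = zlib.compressobj(level=9, zdict=dictionary)
    return len(c.compress(payload) + c.flush())

# Resource A contains the secret; resource B is attacker-influenced.
resource_a = b"<p>welcome back</p>" + SECRET

def resource_b(guess: bytes) -> bytes:
    return b"<img src=/track?session_token=" + guess + b">"

# If B is compressed against a dictionary derived from A, a correct guess
# will typically compress better than a wrong one - the length is the leak.
dict_from_a = resource_a
for guess in (b"9f8a2c1d", b"00000000"):
    print(guess, deflated_len(resource_b(guess), dict_from_a))

The A-then-B / B-then-A question falls out of the same model: a dictionary
built from "A then B" need not produce the same sizes as one built from
"B then A", so the analysis has to account for load order and not just
resource contents.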

Felix, you can see that the different 'add dictionary compression' proposals
have explored some of the design space above and made different choices,
but as Patrick mentioned, there hasn't been a real sit-down analysis of what
the implications of those decisions are, their merits, and their risks. I
think Vlad's work is perhaps the closest to feeling right based on 'gut',
but even in past IETF discussions, the uncertainty and the difficulty of
reasoning beyond that gut instinct have meant that evaluating a compression
scheme represents a large intellectual investment and time commitment.

To further build on why the status quo may not be a reasonable bar, given
the profoundly negative effect compression can have on the confidentiality
of the data being compressed, a parallel might be drawn to TLS clients
supporting 3DES or AES-CBC. These are ciphers and constructions with known
weaknesses and sharp edges that require extreme care to get right - but
they are (or, at this point, were) widely deployed. The fact that they were
widely deployed, however, wouldn't justify making those same design
decisions for new ciphersuites - as the TLS WG demonstrated with TLS 1.3.

Deprecating HTTP compression support in browsers is, arguably, the right
thing to do. The number of organizations that can do, and have done, a
thorough analysis of the relation between 'public' and 'private' data
likely numbers in the handfuls, given how often this keeps biting people.
Yet compression's widespread use means that, for practical purposes, we're
between a rock and a hard place. Introducing new schemes in this space
would have the (personally) undesirable effect of encouraging more folks to
adopt compression, which would lead to even greater losses of
confidentiality. Of course, the high-order bit is getting better
confidentiality protections adopted in the first place - the adoption of
TLS instead of unencrypted HTTP and the adoption of TLS 1.3 are both more
pressing and more relevant to keeping a vibrant and healthy Web ecosystem.

Understandably, I'm also biased towards the browser case, which deserves
acknowledging, because it may be that the WG does adopt new compression
schemes as a work item for intra-CDN traffic or 'enterprise' and bespoke
applications. However, since most folks seem keenest on HTTP compression
because of the ability to save bytes to clients using web browsers, the
browser case sets a higher bar than the status quo for a proposal to be
compelling.

Hope that further expands on these concerns.

Received on Saturday, 22 September 2018 03:51:06 UTC