Re: Dictionary Compression for HTTP (at Facebook) from Kenji Baheux on 2018-09-03 (ietf-http-wg@w3.org from July to September 2018)

From: Kenji Baheux <kenjibaheux@chromium.org>
Date: Mon, 3 Sep 2018 11:52:52 +0900
To: ryan-ietf@sleevi.com
Cc: Jyrki Alakuijala <jyrki@google.com>, chaals@yandex-team.ru, Evgenii Kliuchnikov <eustas@google.com>, felixh@fb.com, Mark Nottingham <mnot@mnot.net>, terrelln@fb.com, Vlad Krasnov <vlad@cloudflare.com>, cyan@fb.com, HTTP Working Group <ietf-http-wg@w3.org>
Message-ID: <CADWWn7Wk+1uP1CjChTwuoULoR_goj4qW7Vd8pu2PqAf+EhuKTg@mail.gmail.com>

Is the assumption that Servers / Server operators, unless vertically
integrated (e.g. FB and other big players), will probably get it wrong
because they lack the application knowledge to distinguish public from
private sources?

One option that was discussed in a side chat at IETF 101 was to expose this
to JS (Fetch / Streams / ...) where the knowledge about public vs. private
exists. But that implies a different, and most likely harder, adoption
story.

The scale benefits of Servers / Server Operators are definitely appealing.
So, I'm wondering if we have exhausted all the reasonable options. Is there
nothing that would at least cover a significant subset of the use cases?

Some naive questions / thoughts:

   - Would it still be valuable to enable shared dictionary compression by
   Servers for responses that are completely free of private data?
   - Can a Server / Server operator infer that a response isn't tainted by
   private sources, i.e. only contains "public" data? From the absence of
   specific (existing) headers?
   - If not, what if Servers had to assume "mixed" unless the response was
   augmented with a new(?) header? I imagine that a server configuration
   scheme / header would be simpler in terms of adoption than requiring web
   developers to do all the work in JS land.
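
To make the last two points concrete, here is a rough sketch of the
"assume mixed unless marked" idea. The header name "X-Dictionary-Public"
and the function name are made up for illustration and are not part of any
proposal; treating Set-Cookie and Cache-Control: private / no-store as signs
of private data is likewise only an assumption about which existing headers
might be usable as hints.

```python
from typing import Mapping

# Hypothetical opt-in header; the name is an assumption, not a standard.
PUBLIC_MARKER_HEADER = "x-dictionary-public"


def may_use_shared_dictionary(response_headers: Mapping[str, str]) -> bool:
    """Return True only if the response is explicitly marked as public."""
    # Header names are case-insensitive, so normalise them first.
    headers = {name.lower(): value for name, value in response_headers.items()}

    # Default: treat every response as "mixed" unless the operator opted in.
    if headers.get(PUBLIC_MARKER_HEADER, "").strip().lower() != "true":
        return False

    # Defensive checks on existing headers (assumed heuristics): even an
    # opted-in response is rejected if it looks user-specific.
    if "set-cookie" in headers:
        return False
    directives = {
        directive.strip().lower().split("=")[0]
        for directive in headers.get("cache-control", "").split(",")
    }
    if directives & {"private", "no-store"}:
        return False

    return True
```

With this shape, a missing marker always falls back to ordinary compression,
so a misconfiguration costs compression ratio rather than privacy. It does
not answer whether operators would set the marker correctly.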




On Sun, Sep 2, 2018 at 4:57 AM Ryan Sleevi <ryan-ietf@sleevi.com> wrote:

>
>
> On Sat, Sep 1, 2018 at 1:23 AM Jyrki Alakuijala <jyrki@google.com> wrote:
>
>> On Fri, Aug 31, 2018 at 4:58 PM, Ryan Sleevi <ryan-ietf@sleevi.com>
>> wrote:
>>>
>>> Of course, this is all after the security concerns are mitigated ;)
>>>
>>
>> We involved Thai Duong in the security analysis and we have a limited
>> scope solution that allows much of the benefits without these security
>> concerns. One main mitigation there is to never compress data mixed from
>> public and private sources.
>>
>
> Yes. That’s been well understood and well discussed as the bare minimum -
> but that requires servers understanding what constitutes public and private
> sources, or the interaction between data that may need to remain private
> through timing leaks.
>
> I don’t think it’s fair to say without the security concerns - there’s a
> considerably high bar to demonstrate that, both in theory and in practice.
> A compression scheme that requires servers to dramatically rework their serving
> infrastructure in order to tag such is, generally speaking, an insecure
> solution. The adoption of a given method is highly correlated to its lack
> of footguns, and tagged annotations of public v private are a giant footgun
> for server operators and not worth the risk to users, or the implicit trust
> that servers and server operators can and will get it right.
>
>>

Received on Monday, 3 September 2018 02:53:28 UTC