Re: Making Implicit C-E work. from Martin Thomson on 2014-05-15 (ietf-http-wg@w3.org from April to June 2014)

From: Martin Thomson <martin.thomson@gmail.com>
Date: Wed, 14 May 2014 18:48:05 -0700
To: Roberto Peon <grmocg@gmail.com>
Cc: Johnny Graettinger <jgraettinger@chromium.org>, Matthew Kerwin <matthew@kerwin.net.au>, C.Brunhuber@iaea.org, HTTP Working Group <ietf-http-wg@w3.org>, K.Morgan@iaea.org
Message-ID: <CABkgnnV8z=4NJ4TNxkqrAm13OWj2Zv9vhvvk=E_8w=8ouXQeyA@mail.gmail.com>
Yep, it's several orders of magnitude more memory, worst case, than not
decompressing. Which seems like it might well be the corner case.

But misbehavior, or simple overloading, have relatively straightforward
detection and mitigation techniques.  Well behaved clients, which form the
bulk of everyday requests, should require a minimal commitment to support.
I doubt that you would have much more than 100k open streams requiring
decompression, even with a total pool of 10 million connections. A few
Gbytes is hardly a major capital outlay.
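
A back-of-the-envelope sketch of that estimate, taking the per-stream inflate figure quoted further down at face value. The 4-byte int and the helper names are assumptions for illustration only, not anything zlib itself defines.

```python
# Rough arithmetic only; the formula is the inflate estimate quoted in this thread.
WINDOW_BITS = 15   # largest windowBits mentioned in the thread
SIZEOF_INT = 4     # assumption: 4-byte int on the target platform


def inflate_state_bytes(window_bits: int = WINDOW_BITS,
                        sizeof_int: int = SIZEOF_INT) -> int:
    """Per-stream footprint: (1 << windowBits) + 1440*2*sizeof(int)."""
    return (1 << window_bits) + 1440 * 2 * sizeof_int


def total_bytes(streams: int) -> int:
    """Worst case if every stream holds a full decompression context."""
    return streams * inflate_state_bytes()


if __name__ == "__main__":
    for streams in (1, 100, 100_000):
        b = total_bytes(streams)
        print(f"{streams:>7} streams: {b:>13,} bytes ({b / 2**20:,.2f} MiB)")
```

Under these assumptions one context is 44,288 bytes, 100 streams come to about 4.22 MiB, and 100k streams come to roughly 4.1 GiB, which is in line with "a few Gbytes".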

Maybe I'm completely wrong, and you can prove it, but those numbers just
don't scare me that much.

I am down with Matthew's suggestion here to lift the MUST. Though I note
that there are cases where this places an intermediary between a rock and a
hard place in terms of conflicting requirements. Maybe we can note that an
intermediary /could/ decompress and leave the normative text out of it.

On May 14, 2014 3:58 PM, "Roberto Peon" <grmocg@gmail.com> wrote:

> inflate size-computation is: (1 << windowBits) + 1440*2*sizeof(int)
> windowBits can be as large as 15, thus at least 44k
>
> So, we have each stream potentially allocating and keeping around 44k, and
> an unbounded or large number of streams.
>
> The client can largely decide how many streams to create and it can
> (guaranteed) leave them idle for as long as it wants (until the proxy
> finally closes the stream).
> Yes, the server can reduce the max parallelism, but then the protocol is
> not doing what it was intended to do.
>
> With 100 streams, we're talking about 4.22 megabytes of uncontrolled
> buffer.
> That is *WAY* more than the typical static-cost for a connection, which
> the server can otherwise control to a few bytes, and perhaps two orders of
> magnitude larger than typical/desired in-use costs for a connection.
>
> Even better, the proxy may be forced into this scenario because of flow
> control, when it is unable to forward bytes onward to the destination.
> -=R
>
>
> On Wed, May 14, 2014 at 2:18 PM, Martin Thomson <martin.thomson@gmail.com> wrote:
>
>> Maybe someone needs to use small words or something, but to me, this DoS
>> claim is thus far poorly supported in fact, from my reading.
>>
>> Assuming that you employ flow control, the only commitment an
>> intermediary makes when decompressing is the decompression context for each
>> stream. That is bounded in size too. That might be larger than people might
>> like, but I'm not all that excited about a new protocol that is more
>> expensive to operate than an old one in some corner case.
>>  On May 14, 2014 1:37 PM, <K.Morgan@iaea.org> wrote:
>>
>>> On 14 May 2014 17:52, jgraettinger@google.com wrote:
>>> > On 14 May 2014, Keith Morgan wrote:
>>> >> Nearly all of the problems brought up w.r.t. t-e gzip are also true
>>> >> for C-E gzip. This one is no exception.
>>> >> Implicit C-E gzip adds a large DoS attack surface to intermediaries.
>>> >> An attacker simply sends requests without ‘accept-encoding: gzip’
>>> >> from an HTTP/1.1 origin through a 1.1<->2 gateway for a resource it
>>> >> knows the server will implicitly C-E gzip. Then (using your own words
>>> >> Johnny) “the gateway is required to decompress before handing off to
>>> >> the origin … and face the ensuing DoS attack surface.”
>>> >
>>> >
>>> > This is a problem which already exists today. As Roberto noted,
>>> > existing h1 intermediaries add a-e: identity on behalf of clients who
>>> > didn't ask for it, and existing servers send c-e: gzip anyway, because
>>> > the observed impact of not compressing is higher than the interop
>>> > issues it introduces.
>>> >
>>> >
>>> > The DOS mitigation which is available today, and continues to be
>>> > available for h2/h1 gateways under implicit c-e: gzip, is to pass
>>> > through what the server sent without decompression. There's a high
>>> > likelihood the h1 client will be able to handle c-e: gzip (and even
>>> > prefers it), which is definitely not true if t-e: gzip were used
>>> > instead.
>>>
>>>
>>> The current -12 spec says "Intermediaries that perform translation from
>>> HTTP/2 to HTTP/1.1 MUST decompress payloads unless the request includes
>>> an Accept-Encoding value that includes 'gzip'." [1].
>>>
>>> So implicit gzip either introduces a legitimate DoS at the HTTP/2 to
>>> HTTP/1.1 gateway or you want gateways to ignore this requirement and
>>> introduce interop issues. Which is it?
>>>
>>> If it's the latter (forward compressed regardless of the A-E header),
>>> then what's the point of Roberto's "uncompressed-*" headers for the
>>> gateways if you don't really want the gateways to decompress? And what's
>>> the point of this "MUST decompress payloads" requirement?
>>>
>>> If it's the former (decompress at the gateway), and you want to live
>>> with the DoS attack surface, I have to ask again, what's the point of
>>> all this - just to have compression on one or two h2 hops and then force
>>> gateways to dynamically decompress??
>>>
>>>
>>> [1] http://http2.github.io/http2-spec/#Compression
>
Received on Thursday, 15 May 2014 01:48:34 UTC