Re: Making Implicit C-E work. from Roberto Peon on 2014-05-15 (ietf-http-wg@w3.org from April to June 2014)

From: Roberto Peon <grmocg@gmail.com>
Date: Wed, 14 May 2014 19:08:59 -0700
To: Martin Thomson <martin.thomson@gmail.com>
Cc: Johnny Graettinger <jgraettinger@chromium.org>, Matthew Kerwin <matthew@kerwin.net.au>, C.Brunhuber@iaea.org, HTTP Working Group <ietf-http-wg@w3.org>, K.Morgan@iaea.org
Message-ID: <CAP+FsNeh8YvZE2zq_sSCKVZDAoqq_Ro7MLzoWwwNY8by68yXSg@mail.gmail.com>
A site can always simply RST every stream that has that bit set, I suppose.
It would have the effect of killing the feature, but that would be a
reasonable compromise to ensure reliability.

-=R


On Wed, May 14, 2014 at 6:48 PM, Martin Thomson <martin.thomson@gmail.com> wrote:

> Yep, it's several orders of magnitude more memory, worst case, than not
> decompressing. Which seems like it might well be the corner case.
>
> But misbehavior, or simple overloading, have relatively straightforward
> detection and mitigation techniques.  Well behaved clients, which form the
> bulk of everyday requests, should require a minimal commitment to support.
> I doubt that you would have much more than 100k open streams requiring
> decompression, even with a total pool of 10 million connections. A few
> Gbytes is hardly a major capital outlay.
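A rough check of the "few Gbytes" estimate, as a minimal sketch. It assumes the per-stream inflate size given later in this thread, (1 << windowBits) + 1440*2*sizeof(int), with windowBits = 15 and a 4-byte int. The function and constant names are illustrative only.

```python
SIZEOF_INT = 4  # assumption: 4-byte int, as on most current platforms


def inflate_context_bytes(window_bits=15):
    """Per-stream inflate state: sliding window plus code tables."""
    return (1 << window_bits) + 1440 * 2 * SIZEOF_INT


open_streams = 100_000  # the "100k open streams" figure from the message
total_bytes = open_streams * inflate_context_bytes()

print(total_bytes)                     # 4428800000
print(f"{total_bytes / 1e9:.2f} GB")   # 4.43 GB
```

Under these assumptions the worst case for 100k decompressing streams is about 4.4 GB, which is consistent with "a few Gbytes".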
>
> Maybe I'm completely wrong, and you can prove it, but those numbers just
> don't scare me that much.
>
> I am down with Matthew's suggestion here to lift the MUST. Though I note
> that there are cases where this places an intermediary between a rock and a
> hard place in terms of conflicting requirements. Maybe we can note that an
> intermediary /could/ decompress and leave the normative text out of it.
> On May 14, 2014 3:58 PM, "Roberto Peon" <grmocg@gmail.com> wrote:
>
>> inflate size-computation is: (1 << windowBits) + 1440*2*sizeof(int)
>> windowBits can be as large as 15, thus at least 44k
>>
>> So, we have each stream potentially allocating and keeping around 44k,
>> and an unbounded or large number of streams.
>>
>> The client can largely decide how many streams to create and it can
>> (guaranteed) leave them idle for as long as it wants (until the proxy
>> finally closes the stream).
>> Yes, the server can reduce the max parallelism, but then the protocol is
>> not doing what it was intended to do.
>>
>> With 100 streams, we're talking about 4.22 megabytes of uncontrolled
>> buffer.
>> That is *WAY* more than the typical static-cost for a connection, which
>> the server can otherwise control to a few bytes, and perhaps two orders of
>> magnitude larger than typical/desired in-use costs for a connection.
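The arithmetic above can be reproduced with a short sketch. It assumes sizeof(int) is 4 bytes and reads "megabytes" as MiB (2**20 bytes); the function name is hypothetical and is not part of zlib.

```python
SIZEOF_INT = 4  # assumption: 4-byte int


def inflate_state_bytes(window_bits):
    """(1 << windowBits) + 1440*2*sizeof(int), as quoted in the message."""
    return (1 << window_bits) + 1440 * 2 * SIZEOF_INT


per_stream = inflate_state_bytes(15)   # windowBits at its maximum of 15
total_100 = 100 * per_stream           # 100 concurrent streams

print(per_stream)                      # 44288, i.e. "at least 44k"
print(round(total_100 / 2**20, 2))     # 4.22 (MiB)
```

With these inputs the per-stream figure is 44288 bytes and 100 streams come to roughly 4.22 MiB, matching the numbers in the message.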
>>
>> Even better, the proxy may be forced into this scenario because of flow
>> control, when it is unable to forward bytes onward to the destination.
>> -=R
>>
>>
>> On Wed, May 14, 2014 at 2:18 PM, Martin Thomson <martin.thomson@gmail.com> wrote:
>>
>>> Maybe someone needs to use small words or something, but to me, this DoS
>>> claim is thus far poorly supported in fact from my reading.
>>>
>>> Assuming that you employ flow control, the only commitment an
>>> intermediary makes when decompressing is the decompression context for each
>>> stream. That is bounded in size too. That might be larger than people might
>>> like, but I'm not all that excited about a new protocol that is more
>>> expensive to operate than an old one in some corner case.
>>>  On May 14, 2014 1:37 PM, <K.Morgan@iaea.org> wrote:
>>>
>>>> On 14 May 2014 17:52, jgraettinger@google.com wrote:
>>>> > On 14 May 2014, Keith Morgan wrote:
>>>> >> Nearly all of the problems brought up w.r.t. t-e gzip are also true
>>>> >> for C-E gzip. This one is no exception.
>>>> >> Implicit C-E gzip adds a large DoS attack surface to intermediaries.
>>>> >> An attacker simply sends requests without ‘accept-encoding: gzip’
>>>> >> from an HTTP/1.1 origin through a 1.1<->2 gateway for a resource it
>>>> >> knows the server will implicitly C-E gzip. Then (using your own words
>>>> >> Johnny) “the gateway is required to decompress before handing off to
>>>> >> the origin … and face the ensuing DoS attack surface.”
>>>> >
>>>> >
>>>> > This is a problem which already exists today. As Roberto noted,
>>>> > existing h1 intermediaries add a-e: identity on behalf of clients who
>>>> > didn't ask for it, and existing servers send c-e: gzip anyway, because
>>>> > the observed impact of not compressing is higher than the interop
>>>> > issues it introduces.
>>>> >
>>>> >
>>>> > The DOS mitigation which is available today, and continues to be
>>>> > available for h2/h1 gateways under implicit c-e: gzip, is to pass
>>>> > through what the server sent without decompression. There's a high
>>>> > likelihood the h1 client will be able to handle c-e: gzip (and even
>>>> > prefers it), which is definitely not true if t-e: gzip were used
>>>> > instead.
>>>>
>>>>
>>>> The current -12 spec says "Intermediaries that perform translation from
>>>> HTTP/2 to HTTP/1.1 MUST decompress payloads unless the request includes
>>>> an Accept-Encoding value that includes 'gzip'." [1].
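The quoted requirement can be read as a simple gateway decision. The sketch below is one possible interpretation, not text from the spec: the function names and the header dictionary are hypothetical, and treating "gzip;q=0" as "not accepted" is an assumption taken from general HTTP Accept-Encoding semantics rather than from the quoted sentence.

```python
from typing import Mapping, Optional


def accepts_gzip(accept_encoding: Optional[str]) -> bool:
    """Return True if the Accept-Encoding value lists gzip with q > 0."""
    for item in (accept_encoding or "").split(","):
        coding, _, params = item.partition(";")
        if coding.strip().lower() != "gzip":
            continue
        q = 1.0  # default weight when no q parameter is given
        for param in params.split(";"):
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value.strip())
                except ValueError:
                    q = 0.0  # assumption: unparsable weight means "not accepted"
        return q > 0
    return False


def gateway_must_decompress(request_headers: Mapping[str, str]) -> bool:
    """Apply the quoted rule: decompress unless the request accepts gzip."""
    return not accepts_gzip(request_headers.get("accept-encoding"))


print(gateway_must_decompress({}))                                    # True
print(gateway_must_decompress({"accept-encoding": "gzip, deflate"}))  # False
print(gateway_must_decompress({"accept-encoding": "identity"}))       # True
```

The first and third cases are the ones at issue in this thread: a request without gzip in Accept-Encoding obliges the gateway to hold a decompression context for the response.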
>>>>
>>>> So implicit gzip either introduces a legitimate DoS at the HTTP/2 to
>>>> HTTP/1.1 gateway or you want gateways to ignore this requirement and
>>>> introduce interop issues. Which is it?
>>>>
>>>> If it's the latter (forward compressed regardless of the A-E header),
>>>> then what's the point of Roberto's "uncompressed-*" headers for the
>>>> gateways if you don't really want the gateways to decompress? And what's
>>>> the point of this "MUST decompress payloads" requirement?
>>>>
>>>> If it's the former (decompress at the gateway), and you want to live
>>>> with the DoS attack surface, I have to ask again, what's the point of
>>>> all this - just to have compression on one or two h2 hops and then force
>>>> gateways to dynamically decompress??
>>>>
>>>>
>>>> [1] http://http2.github.io/http2-spec/#Compression
>>>>
>>>>
>>
Received on Thursday, 15 May 2014 02:09:32 UTC