Re: Cost analysis: (was: Getting to Consensus: CONTINUATION-related issues) from Greg Wilkins on 2014-07-19 (ietf-http-wg@w3.org from July to September 2014)

From: Greg Wilkins <gregw@intalio.com>
Date: Sat, 19 Jul 2014 16:11:38 +1000
To: HTTP Working Group <ietf-http-wg@w3.org>
Cc: Mark Nottingham <mnot@mnot.net>, Roberto Peon <grmocg@gmail.com>
Message-ID: <CAH_y2NE4wC9hRBzCCi60CDAV-yt8MhUnacq7z=yNWeV827U6+w@mail.gmail.com>
Roberto,

With regard to rollback, I see no difference between the burden of
rollback in the sender and the burden of enforced decoding in the receiver.
I.e., we currently have the issue that when a header limit is breached, the
receiver has to either continue to process anyway or discard the entire
connection. By moving a compressed limit check to the sender, the
choice is much the same: roll back or discard the entire connection.

Moving the limit from receiver to sender makes this problem more explicit,
but it does not create the problem, which exists anyway. Fundamentally, any
limit that is applied after the encoding has started is too late, no
matter if it is in the sender or the receiver. Once encoding has started,
you either have to continue to process the headers or discard the
entire connection.

The only way to efficiently handle a limit is to have it as a declared
uncompressed limit enforced by the sender before encoding starts. Only then
can failure be determined before committing to the entire
encoding/sending/decoding process.
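
As a rough sketch of what I mean (Python, purely illustrative): the sender
measures the uncompressed header list against the limit the peer declared,
and only hands the headers to the encoder if they fit. All names here
(uncompressed_size, send_headers, HeaderLimitExceeded, peer_limit) are
hypothetical, and the 32-octet per-entry overhead is an assumption borrowed
from how HPACK accounts for header table entries, not something agreed for
this limit.

```python
from typing import Callable, Iterable, List, Tuple

Header = Tuple[bytes, bytes]


class HeaderLimitExceeded(Exception):
    """Raised before any encoding happens, so no encoder state is touched."""


def uncompressed_size(headers: Iterable[Header], per_entry_overhead: int = 32) -> int:
    # Assumed accounting: name octets + value octets + a fixed per-entry overhead.
    return sum(len(name) + len(value) + per_entry_overhead for name, value in headers)


def send_headers(
    headers: Iterable[Header],
    peer_limit: int,
    encode: Callable[[List[Header]], bytes],
) -> bytes:
    """Check the declared uncompressed limit first, then encode.

    `encode` stands in for a real header compressor; it is only called when
    the header list is within `peer_limit`, so a failure never requires
    rollback or discarding the connection.
    """
    header_list = list(headers)
    size = uncompressed_size(header_list)
    if size > peer_limit:
        raise HeaderLimitExceeded(
            f"header list is {size} octets, peer accepts {peer_limit}"
        )
    return encode(header_list)
```

The point of the ordering is that the failure path runs before the encoder
(and hence its header table) is touched, so the stream can be failed on its
own and the connection carries on.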

Note that pretending that there is no limit by not declaring it does not
solve the problem, as there will always be a limit (or a massive DoS
vulnerability). Making the limit undeclared does not avoid the problem that
encoding has started.


In hindsight, I think we should have separated the issue of how we
send large headers from the issue of how we limit large
headers, which are really orthogonal:

How do we transport large headers?:
a) Large Frames
b) Continuations
c) Fragmented Headers frame

How do we limit the max header size?
x) Expressed as a max compressed size (perhaps == a max frame size)
y) Expressed as a max uncompressed size
z) No declared limit (but receivers may apply a limit with 431 or GOAWAY)

I think any of the limits can be applied to any of the transports.

Mark - is it too late to reframe the consensus questions?  Have you been
able to see any clarity in the other thread?

For the record, my preferences are c,a,b & y,x,z, but I can live
with all.

cheers

On 19 July 2014 14:19, Roberto Peon <grmocg@gmail.com> wrote:

>
>
> On Fri, Jul 18, 2014 at 8:10 PM, Amos Jeffries <squid3@treenet.co.nz>
> wrote:
>
>> On 19/07/2014 7:37 a.m., Poul-Henning Kamp wrote:
>> > In message <CABkgnnWmBUNKFDH8JKz8GKRgZDaS=1f6yQ0C6CdF_zv=QnPR8A@mail.gmail.com>
>> > , Martin Thomson writes:
>> >
>> >> I find that selectively trivializing various aspects of the space
>> >> isn't particularly constructive.
>> >
>> > I agree.  Misinformation is also bad.
>> >
>> >> On the one side:
>> >>
>> >> CONTINUATION has a cost in code complexity.  It defers the discovery
>> >> of what might be a surprisingly large amount of state.
>> >
>> > And even though CONTINUATION in themselves do not imply or cause
>> > any limit to exist, all implementations are going to have limits,
>> > one way or another.  What these limits might be is anyone's guess
>> > at this point, but HTTP/1 deployments might be indicative.
>> >
>> > Reception of CONTINUATION frames carries a cost in complexity for
>> > memory and buffer management, independent of there being limits or
>> > not.
>> >
>> > CONTINUATIONS are significantly more complex to describe in the
>> > draft (compare red/green in the Greg et al. draft).
>> >
>> >> On the other:
>> >>
>> >> A hard cap on size (i.e., option A) has a cost in code complexity.
>> >
>> > I presume you mean ... over option B) ?
>> >
>> > If so, it will be quite the contrary:  Both senders and receivers
>> > will have much simpler memory management and far less state to keep
>> > track of with complete header-sets in a single frame.
>> >
>> >> It requires that encoders be prepared to double their state commitment so
>> >> that they can roll back their header table when the cap is hit.
>> >
>> > No, it doesn't, the encoders can just sacrifice the connection and
>> > open another, which will be an incredibly common implementation
>> > because header-sets larger than the other side is willing to accept
>> > are going to be incredibly rare, and primarily caused by attacks.
>>
>> That connection behaviour is severe overkill. The rollback argument is a
>> strawman.
>
>
>> HPACK compressed size can be calculated as frame_length < sum(2+
>> header_size) where header_size is the length of an individual header in
>> HTTP/1.1 size, and all header values in static table reduce to '1'.
>> Under option A senders are required to buffer up to N bytes of header
>> un-compressed (their choice of N), plus a buffer of same size for
>> compressing into.
>>
>>
> A 'compressed' header could be larger than the original (4 times larger
> for some values), or it could be much smaller.
> I don't follow your calculation.
>
>
>> If the un-compressed version exceeds what the local end is willing to
>> buffer OR the above formula output exceeds what the remote end is
>> willing to receive - then the frame does not need to be compressed at
>> all. Ergo, no rollback required.
>>
>>
> It implies that a rollback is not required if one is willing to not
> compress in a large number of cases where compression would have been
> highly effective.
>
>
>> What *is* required is that buffer space to hold the un-compressed
>> headers AND incompletely compressed headers simultaneously during
>> compression.
>>
>>
> And it requires that we destroy the value proposition of compression,
> assuming we do it as suggested here, since there will be a large percentage
> of the time where we'd not compress when compressing would have been
> successful.
>
>
>
>> The solution to this appears to be allowing senders to output an estimated
>> frame size as the frame length value, then pad with suffix octets up to that
>> if it somehow compresses smaller than estimated. Such frames could be
>> streamed directly to the recipient with specific size up front and no
>> additional buffer requirements.
>>
>
> The whole point of compression is to send fewer bytes, and with this
> proposition, we're almost always guaranteed not to do that.
>
> -=R
>
>
>>
>> Amos
>>
>>
>


-- 
Greg Wilkins <gregw@intalio.com>
http://eclipse.org/jetty HTTP, SPDY, Websocket server and client that scales
http://www.webtide.com  advice and support for jetty and cometd.
Received on Saturday, 19 July 2014 06:12:07 UTC