- From: Willy Tarreau <w@1wt.eu>
- Date: Tue, 22 Jan 2013 23:46:46 +0100
- To: "William Chan (?????????)" <willchan@chromium.org>
- Cc: James M Snell <jasnell@gmail.com>, Nico Williams <nico@cryptonector.com>, Roberto Peon <grmocg@gmail.com>, "ietf-http-wg@w3.org" <ietf-http-wg@w3.org>
On Tue, Jan 22, 2013 at 01:48:42PM -0800, William Chan (陈智昌) wrote:

> This is an intriguing counterproposal. Perhaps we should fork the
> thread to discuss it?

Maybe, yes.

> I'd still like to get an answer here about what folks think about the
> acceptability of the rough costs of stateful compression.

It's hard to get opinions on what is considered "heavy"; it depends a lot
on what you're doing. In haproxy we default to 2*16 kB of buffers plus
around 1 kB of state per connection. Some people are already pressuring
me to get rid of the 2*16 kB for websocket connections so that they
don't need 32 GB of RAM per million connections. And I know other
contexts where completely buffering a 5 MB POST request before passing
it to a server is considered fairly acceptable.

One of the issues comes from the fact that no limit is suggested for how
large a request can be, and because of this we generally have to
allocate large amounts of resources "just in case". This is what is
problematic with anything stateful.

However, I do think that being able to buffer a full request will be
acceptable for any agent (client, intermediary, server), because all of
them have to see the full request at least once, and their buffers are
adequately sized for it. If part of this request is reused for
subsequent requests, there is no need to allocate more memory, and it's
a win at the same time. What is difficult to judge is how much we need
to store for the compression state that has to be kept in addition to
the request itself. As a rule of thumb, I'd guess that doubling the
whole state is quite annoying but still manageable.

> One issue I see in this proposal is that, as always, it is difficult to
> predict the future. You don't know, when you're parsing the document,
> when you'll discover a new resource to request.

I don't understand what you mean here.

> How long do you delay the resource request in order to consolidate
> requests into a load group? The same thing is even more true for
> response headers.

I never want to delay anything; delays only do bad things when we try to
reduce latency. In the example I proposed, the recipient receives the
full headers block, and from that point on, all requests reuse the same
headers and can be processed immediately (just like pipelining, in
fact).

Concerning response headers, I'd say that you emit a first response
group with the headers from the first response, followed by the
response. When another response comes in, you have two possibilities:
either it shares the same headers and you can add the response to the
existing group, or it does not and you open a new group.

That said, I would not spend too much energy trying to optimize response
headers. Right now they are less important because they are usually
accompanied by data, and also because the downstream link is generally
much bigger than the upstream one. Still, that's not something to ditch
either.

Regards,
Willy
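A back-of-the-envelope sketch in C of the memory figures above, assuming only the numbers Willy mentions (two 16 kB buffers plus roughly 1 kB of state per connection, scaled to one million connections); the constants and names here are purely illustrative.

```c
#include <stdio.h>

int main(void)
{
    /* Figures quoted in the message: haproxy defaults to two 16 kB
     * buffers plus roughly 1 kB of connection state per connection. */
    const unsigned long long buffers_per_conn = 2ULL * 16 * 1024; /* 32 kB */
    const unsigned long long state_per_conn   = 1024;             /* ~1 kB */
    const unsigned long long connections      = 1000000;          /* 1M    */

    unsigned long long per_conn = buffers_per_conn + state_per_conn;
    unsigned long long total    = per_conn * connections;

    printf("per connection : %llu bytes\n", per_conn);
    printf("1M connections : %.1f GB\n", total / 1e9);

    /* The two 16 kB buffers alone account for roughly 32 GB per million
     * connections, which is what makes keeping them for idle websocket
     * connections, or adding a comparable per-connection compression
     * state on top of them, so costly. */
    return 0;
}
```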
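And a minimal, hypothetical C sketch of the grouping idea as described in this message (not Willy's actual proposal or wire format): a headers block is transmitted once for a group, and each request in the group carries only its own fields plus a reference to that shared block, so the recipient needs no compression state beyond the group's header block itself.

```c
#include <stddef.h>

/* One shared block of headers, sent once for a whole group. */
struct header_field {
    const char *name;
    const char *value;
};

struct header_group {
    struct header_field *fields;   /* headers common to the group */
    size_t nb_fields;
};

/* A request in the group carries only what differs, plus a reference
 * to the shared headers; it can be processed as soon as it arrives,
 * much like a pipelined request. */
struct grouped_request {
    const struct header_group *group;
    const char *method;
    const char *path;
};
```

Responses would follow the same shape: an incoming response either attaches to an existing group whose headers it shares, or opens a new group.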
Received on Tuesday, 22 January 2013 22:47:15 UTC