Re: Backwards compatibility from Poul-Henning Kamp on 2012-03-31 (ietf-http-wg@w3.org from January to March 2012)

From: Poul-Henning Kamp <phk@phk.freebsd.dk>
Date: Sat, 31 Mar 2012 08:28:52 +0000
To: Willy Tarreau <w@1wt.eu>
cc: Roberto Peon <grmocg@gmail.com>, Mark Watson <watsonm@netflix.com>, Mike Belshe <mike@belshe.com>, "William Chan (陈智昌)" <willchan@chromium.org>, "<ietf-http-wg@w3.org>" <ietf-http-wg@w3.org>
Message-ID: <11441.1333182532@critter.freebsd.dk>

In message <20120331081406.GP14039@1wt.eu>, Willy Tarreau writes:
>On Sat, Mar 31, 2012 at 07:30:12AM +0000, Poul-Henning Kamp wrote:

>> If we imagine the perfectly optimal behaviour from the network
>> stack, and perfectly optimal HTTP message from the other end, the
>> perfect protocol scenario looks like this:
>> 
>>         [length of head]
>>         [head]
>>         [length of body]
>>         [body]
>
>I'm still having a problem with this scheme: most of the requests
>don't have a body,

If there is no body, you can (should be able to!) see that out of
the head, and you obviously will not try to read the body which is
not there.

>> Under utterly perfect circumstances, just three socket reads will
>> get you the head and body into memory chosen, sized, aligned &
>> allocated perfectly for the purpose:
>> 
>>         READ(length header) ->len buffer
>>         (allocate workspace)
>>         READV(head + next length header) -> (workspace, len buffer)
>>         (allocate bodyspace)
>>         READ(body) -> bodyspace
>
>No, under perfect situations, a single readv() would give you all the
>parts you need with fixed sizes,

There is no way that can work, because you don't know how many headers
the message has, and therefore you cannot know how many bytes to read
beforehand, unless the other end tells you.

If you read too many, you get part of the body also, and have to move
that to its proper memory space with a memcpy().

The important words in my description above are "three", "chosen",
"sized", "aligned" and "allocated", and they are just about equally
important once we get above 1Gbit/sec.
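
To make the three-read sequence concrete, here is a minimal sketch in
Python. It is an illustration under stated assumptions, not a proposed
wire format: the 4-byte big-endian length prefix, the convention that a
zero body length means "no body", and the names readv_exact(),
read_message() and frame() are all invented for this example. Python
cannot control buffer alignment, so the sketch only shows the "sized"
and "allocated" part: every byte lands directly in a buffer of exactly
the right size, with no second copy.

```python
import os
import socket
import struct

LEN = struct.Struct("!I")  # assumption: 4-byte big-endian length prefix


def readv_exact(fd, buffers):
    """Fill every buffer completely, looping if the kernel returns short."""
    views = [memoryview(b) for b in buffers if len(b)]
    while views:
        n = os.readv(fd, views)
        if n == 0:
            raise EOFError("peer closed mid-message")
        # Drop buffers that are now full, then trim a partly filled one.
        while views and n >= len(views[0]):
            n -= len(views[0])
            views.pop(0)
        if views and n:
            views[0] = views[0][n:]


def read_message(fd):
    lenbuf = bytearray(LEN.size)
    readv_exact(fd, [lenbuf])                # read 1: length of head
    head = bytearray(LEN.unpack(lenbuf)[0])  # allocate workspace
    readv_exact(fd, [head, lenbuf])          # read 2: head + length of body
    body = bytearray(LEN.unpack(lenbuf)[0])  # allocate bodyspace
    readv_exact(fd, [body])                  # read 3: body (skipped if empty)
    return head, body


def frame(head, body):
    """Hypothetical sender side: length-prefix both parts."""
    return LEN.pack(len(head)) + head + LEN.pack(len(body)) + body


a, b = socket.socketpair()
a.sendall(frame(b"GET / HTTP/1.1\r\nHost: example.org\r\n\r\n", b"hello"))
head, body = read_message(b.fileno())
```

When the data is already queued, each readv_exact() call completes in a
single os.readv(), which gives the three reads of the quoted sequence;
the loop only matters when the kernel returns a short read.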

>> Any protocol which by design requires more work to move the bits from
>> the TCP connection, through the socket API and into the applications
>> memory, is not a high-performance protocol, worthy of HTTP/2.0
>> consideration.
>
>I agree that we must avoid memory copies as much as possible, but a read()
>is a system-assisted memory copy.

A system-assisted *unavoidable* memory copy, so we should make the most
of it, rather than have to move stuff again.

>> Notice that just doing:
>>         READ(1GB)
>> might get you the same data into memory, but you can not optimally
>> place it in memory without memcpy'ing it around.
>
>Hopefully we'll not see a 1GB header soon !

My example above reads header & body.  4.7GB objects are surprisingly
common.

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe    
Never attribute to malice what can adequately be explained by incompetence.

Received on Saturday, 31 March 2012 08:29:17 UTC