Re: Stuck in a train -- reading HTTP/2 draft. from Greg Wilkins on 2014-06-24 (ietf-http-wg@w3.org from April to June 2014)

From: Greg Wilkins <gregw@intalio.com>
Date: Tue, 24 Jun 2014 12:07:59 +0200
To: Willy Tarreau <w@1wt.eu>
Cc: HTTP Working Group <ietf-http-wg@w3.org>
Message-ID: <CAH_y2NHSL3+zHtNo8EU1rHCtiR7fSEtOGC-9Am3iznUo4ibqZg@mail.gmail.com>
Willy,

Do you have a feel for how much the processing cost would affect such proxies
if they still ran with 256kB buffers, but those buffers contained many 16kB
frames?
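
To put rough numbers on that question, here is a back-of-envelope sketch
using the throughput figures Willy quotes below. It assumes 1 kB = 1024 bytes
and exactly one recv/send pair per buffer; the function name is purely
illustrative.

```python
KB = 1024  # assumption: "kB" in the thread means 1024 bytes


def units_per_second(gbps, unit_bytes):
    """How many unit_bytes-sized chunks pass per second at a given line rate."""
    return gbps * 1e9 / 8 / unit_bytes


# 16 kB buffers at 20 Gbps: about 150k recv/send per second (Willy's figure).
recv_16k = units_per_second(20, 16 * KB)

# 256 kB buffers at 40 Gbps: about 19k recv/send per second.
recv_256k = units_per_second(40, 256 * KB)

# The same 40 Gbps carried as 16 kB frames: about 305k frame headers per
# second to look at, i.e. 16 frames inside each 256 kB buffer.
frames_at_40g = units_per_second(40, 16 * KB)

print(round(recv_16k), round(recv_256k), round(frames_at_40g))
```

So the open question is whether roughly 16 header inspections per buffer cost
anything close to what the extra recv/send calls cost.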

Is the problem here not so much the frame size, but the proxy's requirement
to decode/inspect frames?

i.e. what would be the impost if the proxy read a 256KB buffer, quickly
scanned it for any new streams, and if none were found just forwarded the
entire 256KB buffer?
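
A rough sketch of such a scan, under stated assumptions: the 8-byte frame
header mentioned in the thread is taken to be a 14-bit length (after 2
reserved bits), an 8-bit type, 8-bit flags, and a 31-bit stream id, and a
"new stream" is approximated as any HEADERS frame (type 0x1). The function
name and constants are hypothetical, and a real proxy would also have to
consider other frame types.

```python
import struct

FRAME_HEADER_LEN = 8  # assumption: 8-byte header as in the draft under discussion
TYPE_HEADERS = 0x1    # assumption: HEADERS frame type code


def scan_for_headers(buf):
    """Walk frame headers only, never touching payload bytes.

    Returns (found_headers, consumed), where consumed is the number of
    bytes covered by complete frames and therefore safe to forward as-is.
    """
    pos = 0
    found = False
    while pos + FRAME_HEADER_LEN <= len(buf):
        length_field, ftype, _flags, _stream = struct.unpack_from(">HBBI", buf, pos)
        length = length_field & 0x3FFF  # drop the 2 reserved bits
        end = pos + FRAME_HEADER_LEN + length
        if end > len(buf):
            break  # partial frame at the tail; wait for the next read
        if ftype == TYPE_HEADERS:
            found = True
        pos = end
    return found, pos
```

If `found_headers` is false, the first `consumed` bytes could be forwarded in
one write without any further decoding; only the header words are read, so a
256KB buffer of 16KB frames costs on the order of 16 small reads.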

cheers






On 24 June 2014 08:28, Willy Tarreau <w@1wt.eu> wrote:

> Hi Mark,
>
> On Tue, Jun 24, 2014 at 04:09:46PM +1000, Mark Nottingham wrote:
> > Hi PHK,
> >
> > On 23 Jun 2014, at 8:05 pm, Poul-Henning Kamp <phk@phk.freebsd.dk> wrote:
> >
> > > In message <0356EBBE092D394F9291DA01E8D28EC201186DF063@sem002pd.sg.iaea.org>,
> > > K.Morgan@iaea.org writes:
> > >> On Sunday, 22 June 2014 14:36, phk@phk.freebsd.dk wrote:
> > >
> > >>>> I realise I should probably clarify my thoughts on what to do if a
> > >>>> single header doesn't fit in a 16K frame.  The option I like best comes
> > >>>> from one of PHK's earlier posts, where one of the reserved bits in the
> > >>>> frame header is used as a "jumbo frame" marker such that if it's set
> > >>>> the first, say, four octets of payload space is actually an extra 32
> > >>>> bits of payload length
> > >>>
> > >>> I would have it be the max length of *any* frame we're willing to accept,
> > >>> and the default would then obviously be the 16kbyte currently implicit in
> > >>> the standard.
> > >>
> > >> So are you proposing the "jumbo frame" marker for all frames, not just the
> > >> HEADERS frames?  I think it's a great idea, but I know it makes a bunch of
> > >> people nervous about HOL blocking if you allow more than 16K in a DATA
> > >> frame.
> > >
> > > Yes, the length-extension would be available on all frames, which is why
> > > we need a SETTING to limit what we'll accept in that respect.
> > >
> > > For huge file transfers the 16k frames are horribly suboptimal and
> > > having the receiver bang the frame size up once "Content-Length: A_LOT"
> > > has been received will do wonders for performance on both ends.
> > >
> > > Obviously, you can also reduce the frame size you'll accept.  16K
> > > is quite large for a number of high traffic sites prone to DoS.
> >
> > This has been discussed a lot over the life of the WG. The place where we
> > left it was that the overhead of framing was quite small, considering that
> > it's 8 bytes over 16K; TCP overheads are usually going to be bigger.
> >
> > It's true that you can't use sendfile() here, but that's true with
> > multiplexing regardless. It was felt that over time, kernel facilities
> > specific to the use case of HTTP/2 will emerge if necessary, just as they
> > did for HTTP/1.
> >
> > Is there something else behind "horribly suboptimal" here? Can you give
> > some numbers?
>
> I know some high traffic sites running with haproxy, above 100 Gbps. Such
> sites don't ever make use of concurrent streams, because as PHK calls them,
> they're delivering pink pixels over the net. At these rates, the problem is
> not TCP overhead or any such thing, but the processing cost. A single
> haproxy node can forward data at 40 Gbps with 256kB buffers, 35 Gbps with
> 64kB buffers and something around 20 Gbps with 16kB buffers. At 16 kB
> buffers and 20 Gbps, that's 150000 recv/send per second. That's extremely
> inefficient CPU cache-wise and requires a lot of context switching. I'd say
> in fact that the task processing overhead becomes huge compared to the small
> cost of copying or even just splicing data between two ends. Memory
> bandwidth is huge with today's processors, and we're seeing the 100 Gbps
> NICs coming, so large data blocks are processed with a low cost. NICs are
> capable of doing large receive offloading and TCP segmentation offloading,
> so it's possible for the TCP stack to process 64kB packets which are much
> cheaper for the stack than 16kB packets.
>
> So that's really a problem of data processing overhead on top of cheap
> forwarding.
>
> I would find it sad that the sites responsible for something like 75% of
> the internet's traffic refrain from upgrading because of the extra
> infrastructure costs :-/
>
> Regards,
> Willy
>
>
>


-- 
Greg Wilkins <gregw@intalio.com>
http://eclipse.org/jetty HTTP, SPDY, Websocket server and client that scales
http://www.webtide.com  advice and support for jetty and cometd.
Received on Tuesday, 24 June 2014 10:08:37 UTC