Re: Pipeline hinting revisited from Willy Tarreau on 2011-08-12 (ietf-http-wg@w3.org from July to September 2011)

From: Willy Tarreau <w@1wt.eu>
Date: Fri, 12 Aug 2011 07:53:27 +0200
To: Darin Fisher <darin@chromium.org>
Cc: Brian Pane <brianp@brianp.net>, ietf-http-wg@w3.org
Message-ID: <20110812055327.GE9902@1wt.eu>

On Thu, Aug 11, 2011 at 10:43:36PM -0700, Darin Fisher wrote:
> On Thu, Aug 11, 2011 at 10:31 PM, Willy Tarreau <w@1wt.eu> wrote:
> 
> > Hi Brian,
> >
> > On Thu, Aug 11, 2011 at 03:12:31PM -0700, Brian Pane wrote:
> > > I've been thinking some more about request pipelining recently,
> > > triggered by several observations:
> > >
> > > - A significant number of real-world websites could be made faster via
> > > widespread adoption of request pipelining (based on my study of
> > > ~15,000 sites in the httparchive.org corpus).
> > > - A nontrivial fraction of mobile browsers are using pipelining
> > > already, albeit not as aggressively as they could (based on Blaze's
> > > study: http://www.blaze.io/mobile/http-pipelining-big-in-mobile/ )
> > > - Client implementations that currently pipeline their requests are
> > > using heuristics of varying complexity to try to decide when
> > > pipelining is safe.  The list of conditions documented here is at the
> > > complex end of the spectrum, and it's perhaps still incomplete:
> > > https://bugzilla.mozilla.org/show_bug.cgi?id=599164
> > >
> > > The key question, I think, is whether heuristics implemented on the
> > > client side will end up being sufficient to detect safe opportunities
> > > for pipelining.  If not, a server-driven hinting mechanism of the sort
> > > proposed in Mark's "making pipelining usable" draft (
> > > http://tools.ietf.org/html/draft-nottingham-http-pipeline-01 ) seems
> > > necessary.
> > >
> > > Anybody have additional experimental data on pipelining (including the
> > > effectiveness of heuristics for turning pipelining on or off) that
> > > they can share?
> >
> > We've been conducting some tests for a customer working with mobile
> > terminals. I was very frustrated to see that pipelining did not bring
> > any gain there due to the first non-pipelined request to each host.
> > What happens is that there are many objects on a page, spread over
> > many hosts. The terminal opens many parallel connections to these
> > hosts, and as a result, there are 4-5 objects max to fetch over each
> > connection. All connections have a first object fetched alone, and only
> > once a response is received, a batch of 4 requests is sent. It is this
> > pause between the first and the next request over a connection which
> > voids the gain. It was always faster to open more parallel connections,
> > despite the extra bandwidth, than to use pipelining, precisely due to
> > this point.
> >
> > This is why I think we need to find a solution so that pipelining could
> > be more aggressive on riskless requests, and possibly use the server
> > side's hinting to safely fall back to non-pipelining ASAP if needed.
> > I'm well aware that the biggest issue seems to be with broken servers
> > getting stuck between requests. I don't know if there are many of those
> > or not, but maybe at one point it will become those sites' problem and
> > not the browsers'.
> >
> 
> Often it is a bad intermediary (transparent proxy).  The origin server may
> be just as helpless as the client :-(

I agree, and my point is that what matters to the client is the first
intermediary between it and the origin server. Most often this
intermediary will support pipelining, and it's too bad not to use it
with the whole world. Anyway, in both cases (good or bad intermediary),
both the server and the client will have little clue and be of little
help. Ideally we should have request numbers for the connection, but as
Mark pointed out a few months ago, those might be cached and make the
issue worse. That said, are we sure they might be cached even if advertised
in the Connection header? I don't really think so. If we had intermediaries
or servers respond with "Connection: pipeline; req=#num", or even
"Connection: req=#num", it would make it clear where pipelining is
explicitly supported, without waiting for the whole world to adopt it on
every server. The advantage of Connection is that intermediaries will get
rid of it.
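
To make the idea concrete, here is a rough sketch of how a client might
use such a hint. This is only an illustration, not a proposal for exact
syntax: the "req=" token is the one suggested above, and the function
names (parse_req_number, forward_headers, check_response) are made up
for this example. It assumes headers are held in a simple dict.

```python
def parse_req_number(connection_value):
    """Return the request number from a Connection value, or None.

    Accepts the two hypothetical forms from this message:
    "pipeline; req=3" and "req=3".
    """
    for token in connection_value.replace(";", ",").split(","):
        name, sep, value = token.strip().partition("=")
        if sep and name.lower() == "req" and value.isdigit():
            return int(value)
    return None


def forward_headers(headers):
    """Drop the hop-by-hop Connection header, as an intermediary would."""
    return {k: v for k, v in headers.items() if k.lower() != "connection"}


def check_response(expected_req, headers):
    """Compare the advertised request number with the one the client expects.

    Returns "ok", "mismatch", or "unsupported" (no hint on this hop).
    """
    value = next(
        (v for k, v in headers.items() if k.lower() == "connection"), ""
    )
    number = parse_req_number(value)
    if number is None:
        return "unsupported"
    return "ok" if number == expected_req else "mismatch"
```

Since the Connection header is dropped before forwarding, a hint that
reaches the client was added by the adjacent hop and not replayed from
some cache further away. On "mismatch" or "unsupported" the client would
simply fall back to non-pipelined requests on that connection.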

Regards,
Willy
Received on Friday, 12 August 2011 05:53:55 UTC