Re: About draft-nottingham-http-pipeline-01.txt

On 14/03/2011, at 12:21 PM, Willy Tarreau wrote:

> Hello Mark,

Hi Willy.

[snip]

> Pipelining issues are connection-specific: a client may decide to
> pipeline or not to pipeline over a given connection. The client
> does not know whether there are transparent proxies in the chain
> (or "interception proxies"; let's call them all "transparent" for
> the sake of a simpler explanation).
> 
> When those transparent proxies are specific to the site the client
> is visiting, it can make sense to rely on Assoc-Req, because after
> all, it's the site admin's responsibility to ensure that their
> servers will correctly build the header. In fact there's another
> issue on this point; more on that later.

Sorry, a transparent/intercepting proxy that's specific to a site? Are you talking about gateways ("reverse proxies") or are you saying that some networks are selectively deploying proxies that are only used for accessing certain sites?

I'm aware of a few content firewalls that work this way (e.g., in Australia, the much-ballyhoo'd Internet Filter would redirect requests to certain IPs to an intercepting proxy) -- are you thinking of this, or something else?


> But when the transparent proxies are on the client side, the header
> basically brings no information.

The draft (clearly, I hope!) conveys a strategy for dealing with interception proxies; do you have any feedback on that? Assoc-Req is not intended to address all of the problems associated with them.
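For anyone following along, the mechanism in -01 boils down to the origin echoing the request it believes it's answering, so the client can detect when responses don't match what it sent and back off. Roughly (illustrative; see the draft for the exact syntax):

    GET /images/logo.png HTTP/1.1
    Host: www.example.com

    HTTP/1.1 200 OK
    Content-Type: image/png
    Assoc-Req: GET http://www.example.com/images/logo.png

If the Assoc-Req value doesn't correspond to the request the client sent on that connection, it stops pipelining there.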

[snip]

> I see an easy solution to this: transparent proxies on the client
> side will have to be modified to 1) remove any Assoc-Req header from
> responses, and 2) forge it themselves to send a valid-looking response
> to the client. However, this is contrary to what is specified in the
> draft. Given how little time it takes to upgrade client-side proxies
> at places such as ISPs, and given the noticeable benefits for every
> internet site, I'm quite sure that every operator will do it regardless
> of what is written in the spec.

I see your point about mobile networks -- they may want to optimise the connection between the browser and proxy, since they're more latency-sensitive than most other intermediary administrators. However, it seems to me that this is a specific client->proxy optimisation, not a general one that replaces the other mechanisms in the draft. 

Let me have a think about it and perhaps we can come up with something that helps pipelining to next-hop proxies without disturbing other use cases.
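Just to make sure I'm reading you right, I take the forging you describe to look something like this (sketch only):

    Origin -> proxy:  HTTP/1.1 200 OK
                      Assoc-Req: GET http://www.example.com/a  (removed by the proxy)

    Proxy -> client:  HTTP/1.1 200 OK
                      Assoc-Req: GET http://www.example.com/a  (forged by the proxy to
                                                                match the client's request)

In other words, the client would end up validating the proxy<->client hop rather than the whole chain.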


> Now for the server side, we're suggesting adding the header on the
> servers themselves in order to validate the whole chain. I see two
> difficulties with that:
>  - large sites take more time to modify, even just to add a
>    header;

Do you have any data to back this up? In my experience, this is not trivial, but it is workable, especially when you dangle a substantial performance improvement as a carrot.


>  - it's more and more common on the server side to "route" requests
>    via various layers of application proxies and servers, where URLs
>    are rewritten, mapped, prefixed, or have their prefixes stripped.
>    One only has to see the damage done to the Location response header
>    to understand the kind of mangling requests undergo along the way.
>    In these environments, the cost of adding an Assoc-Req header
>    will be high compared to the perceived benefits (application authors
>    test them on the local network anyway).

I'll treat this as a response to the note requesting feedback on the design of the Assoc-Req value in -01.
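To spell out the failure mode for other readers, a hypothetical rewrite chain looks something like:

    Client -> gateway:   GET /app/foo HTTP/1.1
                         Host: www.example.com

    Gateway -> backend:  GET /foo HTTP/1.1            (prefix stripped)
                         Host: backend.internal

    Backend's response:  HTTP/1.1 200 OK
                         Assoc-Req: GET http://backend.internal/foo

The client compares that value against "GET http://www.example.com/app/foo", sees a mismatch, and gives up on pipelining; a rewriting gateway would need to fix the header up on the way out, just as many already do for Location. Feedback on how the Assoc-Req value could better accommodate this is exactly what -01 asks for.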


> Thus it will be difficult (read: expensive) to reliably add this header
> on the server side for little perceived benefit. Also, a site only has
> to get it wrong once to be blacklisted by many clients, rendering the
> effort useless. And there will be issues with duplicate headers, URLs
> containing commas being handled differently by some clients when
> presented in the Assoc-Req header, etc.

Indeed, why try at all? I'm sure you'll be able to find problems of this nature with most any proposal someone makes.


> Another point I'm seeing is about efficiency and deployment speed.
> I don't know how many sites there are on the net, but getting all of
> the valid ones to emit the header will take ages. We can compare that
> to the number of sites which support keep-alive and HTTP compression.

Yet, strangely, many sites do deploy keep-alive and compression, and enjoy the benefits.


> The main reason is that there is little incentive on the server side
> to work on this, because the benefits are not directly perceived.

?!?! I know of many server admins who salivate at the potential performance benefits that this brings. It's a huge incentive. 

[snip]

> That means that we can address most of the pipelining deployment issues
> by targeting the client side and providing a real, perceived benefit to
> those who deploy the feature, and it should reach more and more
> internet users in very little time, because there are people willing to
> push that mechanism forward.

Yes, this is why I've been working with browser vendors, and as you may know, my employer has no small concern in assuring that its considerable array of content is delivered quickly.


> From an architectural point of view, I'd say that if we want clients to
> make efficient use of pipelining, we should only bother them with the
> connections they're manipulating; it should not be end-to-end, because
> they don't care what's on the other side of the proxies and they can't
> do anything about it.

Pipelining can certainly be hop-by-hop, but head-of-line blocking is most often caused by the origin server. Therefore it's important to give it some control over the use of pipelining. 
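To illustrate with a made-up example (headers elided):

    The client pipelines three requests on one connection:

        GET /slow-report HTTP/1.1
        GET /logo.png HTTP/1.1
        GET /style.css HTTP/1.1

    Responses must come back in order, so however quickly /logo.png
    and /style.css are ready, they queue behind /slow-report -- and
    only the origin knows which of its responses will be slow.

That's why the draft gives the origin some say, rather than leaving pipelining purely to the hops.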


> At a minimum, the header should be announced in the Connection header
> and be emitted by each intermediary. That would ensure that the
> intermediary closest to the client has the final word and that the
> client reliably knows what it can do. It would also help a lot with
> the URL rewriting issues, because most components involved in
> rewriting URLs are reverse proxies. They would delete the header on
> the server side and rewrite it on the client side.

This would require that intermediaries be rewritten and redeployed. I think your analysis WRT incentives is flawed; IME the majority of proxy administrators don't care about fine-tuning latency, they care about controlling access and/or reducing bandwidth use. 
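For reference, I read your proposal as making the header hop-by-hop, along the lines of (illustrative):

    HTTP/1.1 200 OK
    Connection: Assoc-Req
    Assoc-Req: GET http://www.example.com/index.html

with each intermediary stripping it and the node nearest the client re-emitting it. That only pays off once the intermediaries in the path are upgraded to do so -- which is the redeployment I'm talking about.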

[snip]

> Also, one point I'm thinking about. I noticed a situation where pipelining
> did not bring any advantage, because some sites spread their content over a
> large number of host names. The reason is that the first request on a
> connection is not pipelined, so if a client has to fetch 100 objects over 50
> connections, it makes only 2 non-pipelined requests over each (and I'm not
> making up numbers, I've seen that).
> 
> Ideally we should find a solution so that a proxy (explicit or transparent)
> can indicate to a client that it supports pipelining for whatever site the
> client wants to access. That way the client will be able to make effective
> use of its connections and pipeline all requests, starting from the first
> ones. I think that doing so with an explicit proxy configuration is easy:
> we could say that if a client is configured to use a proxy and it gets an
> Assoc-Req response header, then it knows that all connections to the same
> proxy can be pipelined. (For the transparent case, it's tougher, because I
> see no way to tell the client without risking passing such information
> from a site's proxy to a proxy-less client.)
> 
> Well, that was a long mail, but I'd really like you to take some time to
> think about this approach. It remains compatible with your design, avoids
> some of its drawbacks, and would see faster adoption.
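For concreteness, I take your explicit-proxy case to mean something like this (sketch; names invented):

    A client configured to use proxy.example.net:3128 sends:

        GET http://site-a.example.com/1 HTTP/1.1

    and the proxy responds:

        HTTP/1.1 200 OK
        Assoc-Req: GET http://site-a.example.com/1

    From that one header, the client would infer that every connection
    to proxy.example.net can be pipelined, whatever the target site.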


Thanks for the feedback, Willy.

--
Mark Nottingham   http://www.mnot.net/

Received on Monday, 14 March 2011 23:49:22 UTC