RE: Content negotiation for request bodies from Brian Smith on 2008-02-24 (ietf-http-wg@w3.org from January to March 2008)

From: Brian Smith <brian@briansmith.org>
Date: Sun, 24 Feb 2008 11:02:24 -0800
To: "'HTTP Working Group'" <ietf-http-wg@w3.org>
Message-ID: <002601c87717$cb317b70$6401a8c0@T60>

Adrien de Croy wrote:
> Brian Smith wrote:
> > One question I have frequently seen asked is "How do I know if the 
> > server supports compressed request bodies in POST and PUT requests?" 
> > Often the answer is "don't even try" or "try it, and if it fails, 
> > try without the compression." A better answer may be "Send the 
> > request, with the headers
> > 'Content-Encoding: deflate' and 'Expect: 100-continue', and if you 
> > get a 100-continue response, send the compressed request body; 
> > otherwise send a new request without content-coding."

(quoted out of order)

> The problem with Expects, is it's only specified in HTTP 1.1, and 
> every link in the chain must support it for it to work.  In the 
> special case of
> Expects: 100-continue, this has grave problems with proxies that I've 
> raised before, and a heuristic timeout waiting for a 100-continue 
> which may never arrive causes further problems.

There is a lot of software generating "Expect: 100-continue" now. In particular, the .NET client libraries seem to add this by default. Curl uses it by default. Apache supports it well. If there is any problem, it would seem to lie with proxies. Is "Expect: 100-continue" really that problematic for commonly deployed proxies?

> I don't see much difference between "try it, and if it fails..." and 
> sending an Expects header.  Don't they amount to the same thing?

> I guess the difference is in the amount of resource that may have 
> already been sent before a failure comes back

The other difference is that the client can implement a policy "if the server is too dumb to understand Expect: 100-continue, then it is also probably too dumb to understand Content-Encoding or other advanced features I would like to use in this request, so let's back off and use something simpler." In particular, the client might assume that the lack of a 100-continue response means that there is an ancient (HTTP 1.0) proxy or server involved somewhere.
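
A minimal sketch of that back-off policy in Python, under stated assumptions: the function names and the returned action strings are hypothetical, and the caller is assumed to pass in the status code of the first response it sees (or None if it gave up waiting). It shows only the decision logic, not the socket handling.

```python
import zlib


def build_probe_headers(body):
    """Return (headers, compressed_body) for a deflate upload announced
    with Expect: 100-continue. Hypothetical helper, not a library API."""
    # HTTP's "deflate" content-coding is the zlib format, which is what
    # zlib.compress() produces.
    compressed = zlib.compress(body)
    headers = {
        "Content-Encoding": "deflate",
        "Content-Length": str(len(compressed)),
        "Expect": "100-continue",
    }
    return headers, compressed


def next_step(first_status):
    """Decide what to do after sending the request headers.

    first_status is the status code of the first response received,
    or None if the client timed out waiting (assumed convention).
    """
    if first_status == 100:
        return "send-compressed-body"
    # Any final status (417, 415, ...) or a timeout: assume something in
    # the chain is too old for the advanced features and back off.
    return "retry-plain"
```

For example, next_step(100) selects the compressed upload, while next_step(417) and next_step(None) both fall back to a plain request.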

> Balance that against the time
> you'd have to wait before giving up on getting a 100-continue back if 
> the server were only HTTP/1.0, keeping in mind connecting to a proxy 
> may be a lot quicker than the proxy connecting upstream.
>
> A lot of work has gone into HTTP to try and cut down on round-trips - 
> requests back to the server for whatever reason.  This has meant some 
> compromises have been made especially when it comes to clients sending 
> message bodies to servers.

A smart client can use the information it learns from an initial unsuccessful request in future, similar requests. For example, if a deflated request fails, but the same request succeeds without the content-encoding, then the client can avoid using deflated request bodies and/or "Expect: 100-continue" in similar requests for some time period. 
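
One way a client might remember such failures, sketched in Python. The class name, the (host, feature) key, and the one-hour default are all assumptions made for illustration; "similar requests" could just as well be keyed by URL prefix or by method.

```python
import time


class CapabilityCache:
    """Remembers which optional features recently failed against a host."""

    def __init__(self, ttl_seconds=3600.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so the cache can be tested
        self._failed = {}            # (host, feature) -> time of failure

    def record_failure(self, host, feature):
        self._failed[(host, feature)] = self._clock()

    def should_try(self, host, feature):
        failed_at = self._failed.get((host, feature))
        if failed_at is None:
            return True
        if self._clock() - failed_at >= self._ttl:
            # The entry has expired; give the server another chance.
            del self._failed[(host, feature)]
            return True
        return False
```

A client would call record_failure(host, "deflate") when the deflated request fails but the plain retry succeeds, and check should_try(host, "deflate") before the next upload.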

> The result is that HTTP does not contemplate negotiating content 
> transfers from clients to servers, only the other way around.  This is 
> not such a problem for direct client-server comms, but when you 
> introduce proxies in the chain, the problem is exacerbated.
> 
> Keeping in mind that there are many proxies potentially in a chain you 
> don't know about.  Company intercepting proxies, ISP intercepting 
> proxies, and a reverse proxy through to an origin server.  It's not 
> uncommon to have about 3 proxies in a chain.

There are plenty of places in the protocol where we can help with negotiation. For example, when the server responds with a "415 Unsupported Media Type", right now the client can only keep trying to resend different types (e.g. first image/jpeg, then image/png, then image/gif; text/html;charset=TIS-620, then text/html;charset=UTF-8). But, if the server could add an "Accept: image/gif" or "Accept: text/html;charset=UTF-8" to its 4xx responses, then we can avoid some roundtrips.
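
A rough Python sketch of how a client could use such a hint. It assumes the server really does put an Accept header on its 415 response, as proposed above, and that the caller hands over the response headers as a plain dict keyed by "Accept". The function names are hypothetical, and the matching is deliberately simple: exact match after normalisation, with q-values dropped and wildcards not handled.

```python
def normalise(media_type):
    """Lower-case a media type and its parameters, dropping any q-value."""
    parts = [p.strip().lower() for p in media_type.split(";")]
    params = [p for p in parts[1:] if p and not p.startswith("q=")]
    return ";".join([parts[0]] + params)


def parse_accept(value):
    """Return the normalised media types listed in an Accept header value."""
    return [normalise(item) for item in value.split(",") if item.strip()]


def choose_retry_type(status, headers, candidates):
    """Pick the first candidate type the server says it accepts.

    Returns None when there is no usable hint, in which case the client
    is back to trying the candidates one by one.
    """
    if status != 415:
        return None
    accept = headers.get("Accept")
    if not accept:
        return None
    offered = set(parse_accept(accept))
    for candidate in candidates:
        if normalise(candidate) in offered:
            return candidate
    return None
```

With candidates image/jpeg, image/png and image/gif, a 415 carrying "Accept: image/gif" lets the client go straight to image/gif instead of first retrying with image/png.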

If the client is a little clever it can avoid such performance problems by remembering the server's limitations, and skipping right to the content type/charset/content-encoding that the server accepted in previous, similar requests.

- Brian
Received on Sunday, 24 February 2008 19:02:31 UTC