From: Willy Tarreau <w@1wt.eu>
Date: Tue, 10 Aug 2010 07:35:08 +0200
To: Mark Nottingham <mnot@mnot.net>
Cc: HTTP Working Group <ietf-http-wg@w3.org>
Hi Mark,
On Tue, Aug 10, 2010 at 11:39:56AM +1000, Mark Nottingham wrote:
> FYI. I see this as the start of a discussion more than anything else.
(...)
> http://www.ietf.org/internet-drafts/draft-nottingham-http-pipeline-00.txt
I found interesting things in this memo (I never realized that 1xx
responses were causing difficulties for pipelined responses). I'm also
realizing that it may be a first step towards the possibility of
out-of-order responses. But I think that copying the full request into
a response header will have some drawbacks which will still limit its
adoption:
- it will noticeably increase response length for static servers, which
  usually return small objects or even many 304 responses.
- it forces the server to keep a copy of the URI while processing the
  request. With most application servers this is not a problem, but it
  can be one for static servers, which otherwise don't need to keep a
  buffer allocated once they start to respond. Also, depending on the
  URI length, performing the copy itself might add a few CPU cycles to
  the processing of the response (similar to large cookies, in fact).
- on servers, this will have to be done unconditionally, even though at
  the beginning very few clients will make use of it. This further
  delays adoption because there is no perceived added value on the
  server side, and adding processing cost for a small minority of
  clients means it will be long before we find it everywhere.
- it's common for some gateways (caches, etc.) to rewrite parts of URLs
  when forwarding requests to servers. Sometimes a path prefix is
  changed; sometimes they just perform some normalization by resolving
  unneeded %-encoding. In that case, the server will return a response
  that does not match the client's request. Until such gateways make
  the rewrite rules configurable by the admin, this will simply break
  clients which rely on this behaviour, and it may result in this
  option being disabled by default in browsers.
I think that a request identifier could be a lot better (I noticed the
point about it at the end of the draft, but I don't agree :-)).
In my opinion, what is important is an identifier related to the
connection, since the problem we're facing is that we need to
distinguish several requests/responses on the same connection. So
basically a client would send a request counter with each request over
the same connection, which the server would simply echo. The advantages
I see with this method:
- smaller for servers, and they don't have to emit the header in a
  response if they don't see it in the request;
- it is immediately deployable: client adoption will be visible to
  servers, which will encourage them to adopt it too;
- it does not break existing intermediaries: if they don't support it,
  things are simply as before, and we rely on them doing their usual
  job of switching the right request to the right server and sending
  back the corresponding response. Of course there is a small risk that
  a client receives a response carrying its ID but belonging to a
  request from someone else, but when this happens, it means the
  intermediary was already broken and was doing that from the
  beginning. Also, the client will be able to detect that at least
  *some* responses do not match, and emit an error message indicating
  to the user that something is broken with the server on that site;
- it helps intermediaries participate: those which perform multiplexing
  will be able to easily translate request/response numbers between
  client-side and server-side connections, maintaining the correct
  matching for the client whatever rewriting has been performed;
- generalizing this will also, for the first time, give a client the
  ability to detect that a response is not the one it expected. Even
  without pipelining, it still happens from time to time that a broken
  multiplexer returns an unwanted response, and this request ID can
  fix that.
Since the counter is related to the connection, it should be announced
in the Connection header so that each intermediary can apply its own
counter on the other side. This will also make it easy for clients to
detect that their proxy is not compatible, and to automatically disable
pipelining when the header is not present in responses.
So, to sum up:

client to server:

GET /foo.css HTTP/1.1
Host: www
Request-Id: 1
Connection: Request-Id

GET /foo.png HTTP/1.1
Host: www
Request-Id: 2
Connection: Request-Id

GET /foo.js HTTP/1.1
Host: www
Request-Id: 3
Connection: Request-Id
server to client:

HTTP/1.1 200 OK
Request-Id: 1
Connection: Request-Id

HTTP/1.1 200 OK
Request-Id: 2
Connection: Request-Id

HTTP/1.1 200 OK
Request-Id: 3
Connection: Request-Id
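Just to illustrate the client side, here is a rough sketch in Python of
how a client could number its pipelined requests and validate the
echoed IDs. It is only a sketch: all the names are mine, and it assumes
plain Content-Length framing with no chunked encoding:

import socket

def pipeline(host, port, paths):
    """Send pipelined GETs numbered 1..n on one connection and
    check that each response echoes the expected Request-Id."""
    sock = socket.create_connection((host, port))
    for n, path in enumerate(paths, start=1):
        req = (f"GET {path} HTTP/1.1\r\n"
               f"Host: {host}\r\n"
               f"Request-Id: {n}\r\n"
               f"Connection: Request-Id\r\n\r\n")
        sock.sendall(req.encode("ascii"))

    buf = sock.makefile("rb")
    for n in range(1, len(paths) + 1):
        status = buf.readline().decode("ascii").strip()
        headers = {}
        while True:
            line = buf.readline().decode("ascii").strip()
            if not line:              # blank line ends the headers
                break
            name, _, value = line.partition(":")
            headers[name.strip().lower()] = value.strip()
        # Drain the body so the next response starts cleanly
        # (assumes Content-Length framing for simplicity).
        buf.read(int(headers.get("content-length", "0")))
        echoed = headers.get("request-id")
        if echoed is None:
            # No echo: the server or an intermediary does not support
            # the scheme; a real client would disable pipelining here.
            print(f"response {n}: no Request-Id, disabling pipelining")
        elif echoed != str(n):
            print(f"response {n}: mismatched Request-Id {echoed}!")
        else:
            print(f"response {n}: {status} (Request-Id {echoed} ok)")
    sock.close()

pipeline("www", 80, ["/foo.css", "/foo.png", "/foo.js"])

The interesting part is the last two branches: for the first time the
client can positively detect a missing or mismatched response instead
of silently consuming it.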
And if connection multiplexing intermediaries are present, we can get
this:

client to proxy:

GET http://example.com/foo.css HTTP/1.1
Host: example.com
Request-Id: 1
Connection: Request-Id

GET http://example.com/foo.png HTTP/1.1
Host: example.com
Request-Id: 2
Connection: Request-Id

GET http://example.com/foo.js HTTP/1.1
Host: example.com
Request-Id: 3
Connection: Request-Id
proxy to server:

GET /foo.css HTTP/1.1
Host: example.com
Request-Id: 11
Connection: Request-Id

GET /foo-client2.png HTTP/1.1   <- from another client
Host: example.com
Request-Id: 12
Connection: Request-Id

GET /foo.png HTTP/1.1
Host: example.com
Request-Id: 13
Connection: Request-Id

GET /foo.js HTTP/1.1
Host: example.com
Request-Id: 14
Connection: Request-Id
server to proxy:

HTTP/1.1 200 OK
Request-Id: 11
Connection: Request-Id
Content-length: 10
...

HTTP/1.1 200 OK
Request-Id: 12
Connection: Request-Id
Content-length: 20
...

HTTP/1.1 200 OK
Request-Id: 13
Connection: Request-Id
Content-length: 30
...

HTTP/1.1 200 OK
Request-Id: 14
Connection: Request-Id
Content-length: 40
...
proxy to client:

HTTP/1.1 200 OK
Request-Id: 1
Connection: Request-Id
Content-length: 10

HTTP/1.1 200 OK
Request-Id: 2
Connection: Request-Id
Content-length: 30

HTTP/1.1 200 OK
Request-Id: 3
Connection: Request-Id
Content-length: 40
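On the proxy side, the renumbering above is just bookkeeping. Here is a
rough sketch in Python of the mapping a multiplexing proxy could keep
per server-side connection (again, the names are mine, and all I/O and
framing are omitted):

class RequestIdMux:
    """Renumber client Request-Ids on the way out and map them
    back on the way in, for one server-side connection."""

    def __init__(self):
        self.next_id = 1   # Request-Id counter on the server-side connection
        self.pending = {}  # server-side id -> (client connection, client id)

    def forward_request(self, client_conn, client_id):
        # Allocate the next server-side id and remember who owns it;
        # the returned value goes into the outgoing Request-Id header.
        server_id = self.next_id
        self.next_id += 1
        self.pending[server_id] = (client_conn, client_id)
        return server_id

    def forward_response(self, server_id):
        # Map a server response back to the owning client connection
        # and the Request-Id value that client expects to see echoed.
        return self.pending.pop(server_id)

# Two clients sharing one server-side connection, as in the example:
mux = RequestIdMux()
mux.forward_request("client1", 1)         # foo.css
mux.forward_request("client2", 1)         # the other client's request
sid = mux.forward_request("client1", 2)   # foo.png
conn, cid = mux.forward_response(sid)
print(conn, cid)  # client1 2 -> rewrite Request-Id to 2 and forward there

Since the mapping is kept per connection, such a proxy could even
forward responses to the right client out of order, once out-of-order
responses become possible.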
Any opinion?
Thanks,
Willy