Re: CONNECT message including tunneled data

Adrien de Croy wrote:
> What other HTTP method allows you to send any amount of data back
> and forth not delineated into HTTP messages?

POST :-)

However, since overlapping request/response doesn't work with many
agents, it is more common to issue two POSTs (or 1 POST, 1 GET) on
separate connections, one for the upstream data, and one for the
downstream data.  People do actually use this method already.

> In the context of what the command is, it's "connect", purely and simply 
> to make a connection and wire it up.  It's not "connect and then pipe 
> this data through".  The data on the connection is not in the context of 
> the CONNECT message or response.  That data cannot be processed until 
> the CONNECT command has been completed, it does not form part of that 
> command - therefore it is subsequent data.
> 
> I've just got a feeling that if you start allowing pipelined data to be 
> piggy-backed onto a CONNECT message or its response, bad things will happen.
> 
> Sure, you might save an RTT in some cases, but we need to ensure it 
> doesn't break things.

Oh, I'm not advocating changing CONNECT itself.  The method name is
hard-coded into every proxy; the semantics cannot be changed.  Any new
strategy would need to use a new method name, at least.

> >   - Cannot re-use the HTTP connection after the application protocol
> >     has finished with it.
> >  
> that would be impossible anyway - if you wanted to do that you would 
> need to apriori know exactly how much data was going to be sent in both 
> directions so that you could do proper HTTP message delineation.  In 
> some hypothetical cases that might be conceivable, but in real world I 
> don't think it's that useful.

Oh, but you _can_ do that already with standard HTTP.  It's not
complicated.

Just use chunked encoding over POST.

I'm not promising it will work with every agent out there, mind. :-)
But you see the principle is old already.
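
To make the principle concrete, here is a minimal sketch in Python of
chunked framing as HTTP/1.1 defines it.  It's an illustration only, not
taken from any particular agent: the function names are made up, and an
in-memory buffer stands in for the connection.  The point is that the
sender needs no advance knowledge of the stream length, and the
zero-size chunk marks where the message ends, so whatever follows on
the connection is the next HTTP message.

```python
import io

def encode_chunked(pieces):
    """Frame an iterable of byte strings as an HTTP/1.1 chunked body."""
    out = bytearray()
    for piece in pieces:
        if not piece:
            continue  # a zero-length chunk would end the body early
        out += b"%x\r\n" % len(piece) + piece + b"\r\n"
    out += b"0\r\n\r\n"  # last-chunk, no trailer fields
    return bytes(out)

def decode_chunked(stream):
    """Read one chunked body from a binary stream, stopping at the last-chunk."""
    data = bytearray()
    while True:
        size_line = stream.readline()
        # Ignore any chunk extensions after ';'.
        size = int(size_line.split(b";", 1)[0].strip(), 16)
        if size == 0:
            # Skip any trailer fields up to the blank line.
            while stream.readline() not in (b"\r\n", b""):
                pass
            return bytes(data)
        data += stream.read(size)
        stream.read(2)  # CRLF after the chunk data

# The stream ends cleanly, and the bytes after it are still usable
# as the start of the next request on the same connection.
wire = encode_chunked([b"hello ", b"tunnel"]) + b"GET /next HTTP/1.1\r\n"
conn = io.BytesIO(wire)
body = decode_chunked(conn)
rest = conn.read()
```

Here `body` holds the tunnelled bytes and `rest` holds whatever was
sent after the stream finished, which is what makes re-use of the
connection possible.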
 
> In any case the other protocol server is going to close the connection 
> once its protocol is done anyway, in which case all you can save here is 
> the client connection to the proxy, which is the least expensive part 
> normally.

Only when the proxy is near the client.  When it's near the server,
the opposite is true.  There certainly are proxies handling CONNECT
(or the logical equivalent using other methods) which aren't near the
client, for moderately good reasons.

> >   - Combination of the above: cannot pipeline multiple application
> >     requests, if they need to use separate connections.  (See this
> >     already with rsync-over-CONNECT).
> >  
> pipelining in this case is surely a function of the protocol that is 
> tunneled over the connection using CONNECT?

At the moment.  When you embed a stream inside a request with chunked
encoding or Content-Length, as you can with POST (and existing
software), that can be a useful way to use HTTP's pipelining to access
a non-pipelined service.
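
As a rough sketch of what that looks like on the wire (Python, with a
made-up host name and made-up paths, and an in-memory buffer in place
of a socket): two POST requests, each carrying its own embedded stream,
are written back to back, and the receiver separates them purely by
HTTP framing.  The parser assumes chunked bodies with no trailer
fields, which is a simplification.

```python
import io

def chunked(payload, size=4):
    """Encode payload as a chunked body in pieces of `size` bytes."""
    parts = [payload[i:i + size] for i in range(0, len(payload), size)]
    body = b"".join(b"%x\r\n%s\r\n" % (len(p), p) for p in parts)
    return body + b"0\r\n\r\n"

def post(path, payload):
    """Build one POST request carrying an embedded stream."""
    head = (b"POST " + path + b" HTTP/1.1\r\n"
            b"Host: tunnel.example\r\n"  # hypothetical host
            b"Transfer-Encoding: chunked\r\n\r\n")
    return head + chunked(payload)

def read_request(stream):
    """Read one request; return (path, body), or None at end of input."""
    request_line = stream.readline()
    if not request_line:
        return None
    path = request_line.split()[1]
    while stream.readline() not in (b"\r\n", b""):
        pass  # skip header fields; chunked bodies are assumed
    body = bytearray()
    while True:
        size = int(stream.readline().strip(), 16)
        if size == 0:
            stream.readline()  # blank line ending the body
            return path, bytes(body)
        body += stream.read(size)
        stream.read(2)  # CRLF after the chunk data

# Two application requests pipelined on one connection.
wire = post(b"/stream/1", b"first job") + post(b"/stream/2", b"second job")
conn = io.BytesIO(wire)
first = read_request(conn)
second = read_request(conn)
```

The second request is sent without waiting for the first response,
even though the service behind each stream may itself handle only one
exchange at a time.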

Again, I'm not proposing that CONNECT be changed.  Really, just noting
that experimental protocols are playing with POST and similar
techniques in this sort of way, that it does work, and it's logical.

> >[ However, if there's any interest in developing "next generation"
> >HTTP (which ought to have gracefully degrading long message
> >multiplexing, response reordering, and two-way requests), I would
> >suggest that two-way streaming _inside_ messages would be quite a
> >natural fit for that. ]
>
> OK.  You could even then go for a multi-connection multiplexed 
> connection.  I.e. allow multiple connections to be set up over a single 
> client-proxy connection with IDs, and then packets are addressed 
> according to those IDs.

Yes, that's what I have in mind when I say "long message
multiplexing".  However, it comes with its own issues, particularly
controlling the latency of each stream usefully.
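
For illustration, a minimal sketch of ID-addressed framing in Python.
The frame layout (a 4-byte stream ID and a 2-byte length, then the
payload) and the tiny payload limit are assumptions invented for this
example, not part of any existing protocol.  Sending one small frame
per stream in round-robin order is one crude way to stop a long stream
from delaying a short one; a real design would need something better.

```python
import struct
from collections import defaultdict

HEADER = struct.Struct("!IH")  # hypothetical header: stream ID, payload length
MAX_PAYLOAD = 4                # tiny on purpose, so the interleaving is visible

def mux(streams):
    """Interleave {stream_id: bytes} round-robin into one run of frames."""
    offsets = {sid: 0 for sid in streams}
    out = bytearray()
    while offsets:
        for sid in list(offsets):
            start = offsets[sid]
            piece = streams[sid][start:start + MAX_PAYLOAD]
            out += HEADER.pack(sid, len(piece)) + piece
            offsets[sid] = start + len(piece)
            if offsets[sid] >= len(streams[sid]):
                del offsets[sid]  # this stream is fully sent
    return bytes(out)

def demux(wire):
    """Split a run of frames back into {stream_id: bytes}."""
    streams = defaultdict(bytearray)
    pos = 0
    while pos < len(wire):
        sid, length = HEADER.unpack_from(wire, pos)
        pos += HEADER.size
        streams[sid] += wire[pos:pos + length]
        pos += length
    return {sid: bytes(data) for sid, data in streams.items()}

original = {1: b"abcdefgh", 2: b"xy"}
wire = mux(original)
recovered = demux(wire)
```

With this framing, stream 2's two bytes go out right after stream 1's
first frame rather than after all of stream 1.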

> Do we see the CONNECT command as being something that is growing in 
> popularity though (other than for spammers?).  SOCKS for instance, UPnP, 
> various proprietary systems in general provide a much more flexible 
> firewall traversal mechanism.

Well, HTTP proxies are often available when nothing else is.

E.g. when I visit some random corporate place, there's sometimes an
HTTP proxy and no other access to the net.  However, they are usually
configured only to allow access to port 443 (HTTPS), naturally.  So we
end up using additional tunnelling layers over port 443 to a
cooperating server, if we really need to access something else.  So, I
guess it's not hugely used.

I guess CONNECT is sometimes used for HTTPS, but there doesn't seem to
be much point in that nowadays.  It just pointlessly loads the proxy,
when routing the connection over a NAT would be more sensible.

> One of the main things about the CONNECT command is its simplicity.  
> Changing this in any way I think would reduce its support.

I agree, and wouldn't advocate changing CONNECT itself, for many reasons.

-- Jamie

Received on Friday, 1 February 2008 00:21:16 UTC