Re: p1-message-07 S 7.1.4 from Adrien de Croy on 2009-07-20 (ietf-http-wg@w3.org from July to September 2009)

From: Adrien de Croy <adrien@qbik.com>
Date: Mon, 20 Jul 2009 18:08:16 +1200
To: Mark Nottingham <mnot@mnot.net>
CC: Henrik Nordstrom <henrik@henriknordstrom.net>, HTTP Working Group <ietf-http-wg@w3.org>
Message-ID: <4A6409D0.2060108@qbik.com>

Mark Nottingham wrote:
> In defence of the current text, it's hardly a philosophical 
> requirement; because of the way that TCP works and the way that the 
> Internet was deployed circa 1996, this was a very real and serious 
> concern.
>
I understand the concern, and agree that, with the technology and 
bandwidth available at the time, it may have been seen as desirable.

However, with the benefit of hindsight, and having seen the impact 
this actually had (lots of workarounds), we may come to a different 
conclusion.

As far as link bandwidth is concerned, there's not much difference 
(connection setup overhead only) between many and few connections.

So the problem areas come down to certain types of intermediaries (e.g. 
NAT devices or proxies) and servers.  Many proxies can already limit the 
number of connections from a client.  Many servers also have load 
control capability.

So I think the problem is adequately covered already. 
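
As a rough illustration of the kind of per-client limiting a proxy or 
server can apply on its own side, here is a minimal sketch.  The class 
name PerClientLimiter and its methods are hypothetical, and it assumes 
clients are identified by address only; it is not taken from any 
particular product.

```python
class PerClientLimiter:
    """Caps concurrent connections per client address (hypothetical helper)."""

    def __init__(self, max_per_client):
        self.max_per_client = max_per_client
        self.active = {}  # client address -> number of open connections

    def try_accept(self, client_addr):
        """Count the connection and return True if the client is under its cap."""
        count = self.active.get(client_addr, 0)
        if count >= self.max_per_client:
            return False  # caller would refuse or queue the connection
        self.active[client_addr] = count + 1
        return True

    def release(self, client_addr):
        """Call when a connection from client_addr closes."""
        count = self.active.get(client_addr, 0)
        if count <= 1:
            self.active.pop(client_addr, None)
        else:
            self.active[client_addr] = count - 1
```

The limit here is a local configuration choice of whoever runs the 
proxy or server, which is the point being made: it can be tuned to the 
resource that is actually scarce.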

I think the best place to manage demand / load is at the server, not the 
client.  If there were to be anything in the protocol to assist this, 
that would, in my view, be the better thing to focus on.

There are several options for such things:

* a "busy, try again later" status (like SMTP 421).  This would actually 
be exceedingly useful for rate limiting connections: a "try again" with 
a Retry-After header (maybe a new status 309?).
* an advertisement of the number of connections a server will accept 
from a client
* something else
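
For the first option, HTTP/1.1 already defines 503 (Service Unavailable) 
with an optional Retry-After header, which comes close to "busy, try 
again later".  A minimal client-side sketch of honouring it follows. 
The function name retry_delay is hypothetical, headers is assumed to be 
a plain dict with exact-case keys, and the 309 status floated above is 
not assumed to exist.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_delay(status, headers, now=None):
    """Return seconds to wait before retrying, or None if no retry is indicated."""
    if status != 503:
        return None
    value = headers.get("Retry-After")
    if value is None:
        return None
    value = value.strip()
    if value.isdigit():
        # Retry-After given as a number of seconds.
        return float(value)
    try:
        # Retry-After given as an HTTP-date.
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None  # unparseable value: treat as no guidance
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())
```

A client would sleep for the returned number of seconds before retrying, 
and fall back to its own policy when the result is None.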

However, I think these could risk over-complicating the issue for 
dubious benefit.  Are we even clear on what we want to limit?

Limiting the number of connections may not be the issue for a server.  
It may be something like bandwidth, CPU or disk I/O.

As it currently stands, we have an arbitrary limit on an arbitrary 
metric to solve a problem we don't know exists.  We do know, however, 
that it creates certain problems.

> I think that the limit needs to be raised at the least, and perhaps 
> softened in other ways, because:
>   - it assumes that pipelining is usable over the open Internet, when 
> often it is not
>   - congestion on Internet backbones isn't *as much* of an issue as it 
> was then (AFAIK; I'll defer to others here)
>   - Web pages are significantly more complex (i.e., more resources)
>
> This last point is particularly important; in some instances you can 
> see a "waterfall" of requests, two at a time, in pathological 
> circumstances (e.g., accessing a Web page that references 50 images, 
> all on the same host, across an ocean).
I've seen this many, many times.  Even loading many images from an idle 
test server (over a gigabit LAN) via a proxy was hugely slowed down by 
the 2-connection limit in IE7.  It makes sites slow, whether the sites 
are overloaded or idle.
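
The "waterfall" effect can be put into rough numbers.  The sketch below 
is an illustration, not a measurement: the function name fetch_rounds is 
hypothetical, and it assumes no pipelining, so each connection fetches 
one resource per round trip, and it ignores transfer time and connection 
setup.

```python
import math


def fetch_rounds(resources, max_connections):
    """Number of sequential round trips needed to fetch all resources
    when at most max_connections requests are in flight at once."""
    return math.ceil(resources / max_connections)


# The 50-image page from the example above, at different limits.
for limit in (2, 6, 8):
    print(limit, "connections:", fetch_rounds(50, limit), "round trips")
```

Under these assumptions the 50-image page needs 25 round trips with 2 
connections, against 9 with 6 connections or 7 with 8.  Multiplying by 
the round-trip time gives the floor on load time that the limit imposes 
regardless of how idle the server is.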

>
> The result is that it's now common practice to deploy assets on 
> multiple hosts just to avoid this limitation, and JavaScript library 
> developers are starting to look at ways of bundling multiple responses 
> into one, thereby tunnelling through HTTP and making the messages 
> opaque. I'd say both are signs that there needs to be a change.
The latter is of concern, especially with respect to caching.

>
> OTOH, I also think completely removing limitations isn't good practice 
> either, because there are still networks out there where congestion is 
> a problem, and having an app open multiple TCP connections (as many 
> "download accelerators" do) to hog resources isn't good for the 
> long-term health of the Internet either.
Even a download accelerator that opens dozens of connections isn't 
necessarily a problem.

It's kinda like market-driven economics vs socialism.

If the supplier can't keep up with demand, they have the option to 
increase supply.  Do we want to take away that option by choking the 
clients?

I guess in the end, this is all only a SHOULD-level recommendation.  
Maybe then also add "clients that implement a connection limit SHOULD 
also provide a mechanism to configure the limit".

Cheers

Adrien

>
> My personal preference would be to:
>   - raise the recommended limit to something like 6 or 8 simultaneous 
> connections (I believe there's been some research referenced 
> previously that shows these numbers to be reasonable), and
>   - explain a bit more of the motivation/tradeoffs in the text, and
>   - allow servers to explicitly relax this requirement to clients for 
> particular uses (perhaps in an extension; this has been discussed a 
> bit on the hybi list and elsewhere), and
>   - long-term, look at ways to work around this problem in a better 
> way (e.g., the effort to run HTTP over SCTP).
>
> Cheers,
>
>
>
>
> On 20/07/2009, at 7:35 AM, Adrien de Croy wrote:
>
>>
>>
>> Henrik Nordstrom wrote:
>>> On Sat 2009-07-18 at 15:46 +1200, Adrien de Croy wrote:
>>>
>>>
>>>> But any client that only made 2 connections to a proxy would be
>>>> quickly dumped by users as basically unusable.  I think this para
>>>> should be taken out.
>>>>
>>>
>>> The user-agent limit is per accessed server (host component). There is
>>> no specified limit client->proxy when the client accesses multiple
>>> sites.
>>>
>>
>> Maybe this needs clarification then in 7.1.4, since it currently reads:
>>
>> "A single-user client SHOULD NOT maintain more than 2 connections 
>> with any server or proxy."
>>
>>
>>> The limits are there to prevent unintentional congestion and
>>> unfairness, both of the server resources and the network.
>>>
>>
>> It doesn't make any sense to me to try and address that issue in the 
>> protocol.
>>
>> It's like we're defining a recipe for bread, and in the recipe we 
>> state nobody should own more than 2 loaves.
>>
>> That is based on a whole series of assumptions about availability and 
>> capacity of resources, which even if they were valid now, will not be 
>> valid for all time.  Let alone the philosophical problems about 
>> whether HTTP should be trying to control that anyway.
>>
>> Surely it's up to a site to cater for the demand it gets, whether 
>> that's 1 million clients each making 2 connections, or 100,000 clients 
>> each making 20 connections.
>>
>> It's up to ISPs to cater for demand they get.
>>
>> Putting that SHOULD-level requirement in the protocol has achieved 
>> only one thing:  problems for site designers coping with UAs that 
>> take it to heart.
>>
>> Thankfully most UAs now ignore it.  The more UAs that ignore it, the 
>> fewer hoops site owners and authors will have to jump through to get 
>> around it.
>>
>>> But yes, it's frequently a problem for certain types of sites, and the
>>> workaround of using multiple sitenames isn't exactly clean.
>>>
>>> I would propose adding a way for the HTTP server to allow its
>>> clients to use more than 2 connections.
>>>
>>>
>>
>> Servers already send back an error page when they are overloaded.
>>
>> Servers are free to limit connections in any way they wish.
>>
>> Putting an arbitrary limit into the client can only do one thing - 
>> degrade the user experience, something most site owners would rather 
>> avoid, which is why they set up so many different site names to get 
>> around the problem.
>>
>> It also forces people to host on faster connections, because 
>> otherwise their lightly loaded site seems slow.
>>
>>>> Furthermore the requirements that the second part places on a proxy
>>>> would greatly increase the complexity of the proxy, since it would
>>>> then have to start multiplexing requests from different client
>>>> connections over the same server connections.
>>>>
>>>
>>> Why? The proxy is allowed to open two times the amount of client
>>> connections it sees with requests for the site.
>> That's not how I read the following:
>>
>> "A proxy SHOULD use up to 2*N connections to another server or proxy, 
>> where N is the number of simultaneously active users."
>>
>> I read that as 2 connections per connected client.  Since the client 
>> can have 2 connections each, that's only 1 connection per client 
>> connection.  That wouldn't require multiplexing, but is a proxy 
>> deemed to be a client in this case?  In which case it can only have 2 
>> connections to any 1 server regardless of the number of connected 
>> clients.
>>
>>
>>> In reality this means that the proxy does not need to bother much
>>> about the limit unless it's proactively opening connections to a
>>> server, or not pooling idle proxy<->nexthop persistent connections
>>> for reuse but randomly distributing client requests over many
>>> workers, each with their own set of next-hop connections...
>>>
>> Some architectures don't lend themselves to that, especially those 
>> with filtering interfaces, where a filter may block execution for an 
>> indeterminate period.
>>
>> Regards
>>
>> Adrien
>>
>>> Regards
>>> Henrik
>>>
>>>
>>>
>>
>> -- 
>> Adrien de Croy - WinGate Proxy Server - http://www.wingate.com
>>
>>
>
>
> -- 
> Mark Nottingham     http://www.mnot.net/
>
>

-- 
Adrien de Croy - WinGate Proxy Server - http://www.wingate.com
Received on Monday, 20 July 2009 06:05:27 UTC