Re: update: http progress notification from Adrien de Croy on 2010-05-28 (ietf-http-wg@w3.org from April to June 2010)

From: Adrien de Croy <adrien@qbik.com>
Date: Fri, 28 May 2010 12:49:45 +1200
To: "Roy T. Fielding" <fielding@gbiv.com>
CC: HTTP Working Group <ietf-http-wg@w3.org>
Message-ID: <4BFF1329.5090806@qbik.com>
On 28/05/2010 12:21 p.m., Roy T. Fielding wrote:
> On May 27, 2010, at 4:40 PM, Adrien de Croy wrote:
>    
>> actually if you're behind a corporate firewall that does this, it's 
>> closer to a 100% chance depending on scanning rules for the firewall 
>> AV. When we deploy this with WinGate and AV, that will then be 
>> millions of browsers (and other HTTP agents) seeing these 103s for 
>> pretty much all requests if we don't negotiate. I've tested Chrome, 
>> FF and IE8, but I haven't tested things like Windows Update, the BITS 
>> service, etc. There's no way to guarantee what won't break.
> Again, that is a very tiny percentage of agents when compared to the
> rest of the Internet traffic and needs to be crossed with an even
> smaller percentage of resources that would benefit from a progress feature.
>    

OK fair enough.

> I see no reason for such a small instance of potential benefit to
> impose a constant cost.  Services like Windows Update can be easily
> bypassed (why you would want to scan such a service is beyond me,
> since it often delivers virus signatures inside virus-checking
> client tools).
>    

OK also fair.  Actually default config for many AV filtering products 
excludes many known update sites for this reason.

>> I was of the impression switching behaviour based on User-Agent was frowned upon.  It creates at best a maintenance issue for admins (keeping the list of UAs current).
>>      
> You don't have to keep a list of current agents.  You have to keep
> a list of broken agents that do not properly handle HTTP/1.1 status
> codes, and you can make that configurable.  Apache does that kind of
> thing all the time.  It might even be the empty list, for all we know.
>
>    

That's what I meant.  Depending on how it works, it's either a whitelist 
or a blacklist of UAs.  It's hard to predict at this stage whether a 
whitelist or a blacklist would be more work to maintain.
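
As a rough illustration of the configurable list discussed above, here is 
a minimal sketch.  The function name, the substring matching and the 
example patterns are all assumptions made for illustration; this is not 
how Apache or WinGate actually implements it.

```python
# Hypothetical sketch: decide whether to send progress notifications to a
# client based on a configurable list of User-Agent substrings.
def should_send_progress(user_agent, patterns=(), mode="blacklist"):
    """Return True if progress messages should be sent to this agent.

    mode="blacklist": send unless the UA matches a known-broken pattern.
    mode="whitelist": send only if the UA matches a known-good pattern.
    """
    ua = (user_agent or "").lower()
    matched = any(p.lower() in ua for p in patterns)
    return matched if mode == "whitelist" else not matched
```

With an empty blacklist (the case Roy suggests may turn out to be 
sufficient) every agent receives progress messages; with an empty 
whitelist none do.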

>> I agree it will be a long time before origin servers generate them.  So probably it makes no sense for now for a proxy to forward the Progress token upstream, unless it's talking to another proxy.  So actually I don't see many of these going over the wire onto the net. The main benefit is for a client talking to a local proxy.  RFC2616-compliant proxies that don't understand Progress will already strip it.
>>      
> So you would only have UAs send it when configured to use a proxy?
>    

That's an option, although not my personal preference.

>> I don't know of any browser that doesn't already include a Connection header in every request for keep-alives (even though the default semantics for 1.1 is to keep alive, so it's largely redundant), so the overhead in practice will only be an extra 10 bytes.
>>
>> Also, clients could make some choices about likelihood of benefit before adding Progress to the Connection header.  E.g. for retrieving images, JS, CSS etc it could be turned off.
>>      
> Anything that relies on the client guessing the nature of a resource
> is not a good plan.  Images, for example, are often generated, converted,
> or cropped on the fly.
>    

Agreed.  I guess the point was that the client can still choose.  If 
it's always on, then the client can't turn it off should it need to 
(unless we add a No-Progress token).  Why a client might wish to turn it 
off I can't imagine yet, but that doesn't mean there will never be a 
need to.
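
To make the client-side choice and the 10-byte figure concrete, here is a 
small sketch.  The helper name and the plain-dict representation of 
headers are assumptions for illustration only; "Progress" is the token 
proposed in this thread, not a registered one.

```python
# Hypothetical sketch: a client opting in to progress notifications by
# adding the proposed "Progress" token to its Connection header.
def with_progress_token(headers, want_progress=True):
    """Return a copy of headers with 'Progress' added to Connection if wanted."""
    result = dict(headers)
    tokens = [t.strip() for t in result.get("Connection", "").split(",") if t.strip()]
    if want_progress and not any(t.lower() == "progress" for t in tokens):
        tokens.append("Progress")
    if tokens:
        result["Connection"] = ", ".join(tokens)
    return result

before = {"Connection": "keep-alive"}
after = with_progress_token(before)
# ", Progress" is the only addition: a comma, a space and eight letters.
overhead = len(after["Connection"]) - len(before["Connection"])  # 10
```

Passing want_progress=False leaves the header untouched, which is the 
"client can still choose" case.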

>> Or behaviour could differ for proxy connections vs direct and be configurable in the browser.
>>
>> Otherwise if we don't advertise support, and it's on by default, we need to resort to things like administrative settings, e.g. admin can turn it off/on per client IP or UA or Content-Type or something.  That loads up burden on the admin.
>>      
> The number of scan-proxy-maintaining admins or long-resource-processing
> owners is far smaller than the groups affected by the other options.
> The cost should be placed on those benefitting most from the feature.
>    

The problem with that, for me, is that there's a high correlation between 
those people and my customer base...

>> As for Henrik's point about the benefit for non-supporting clients, where these messages will stop connection timeouts: I can understand that a client may conclude something is happening, but if no progress is indicated to the user, we are back at square one (where we are now), where the user hits retry a few times and then calls his tech support or admin.  So I don't think the benefit will be there for browsers.  Other agents certainly may benefit.
>>      
> Or you could just send a 202 response if the software anticipates
> a lengthy delay.
>    

How does a browser recover from that to finally obtain the entity?  
RFC2616 is extremely vague on 202.  Is that where you'd post some sort 
of progress page?  Progress pages work great for downloads, but not so 
great for embedded content, and completely useless for non-browser UAs.

Does that, BTW, mean that the entity returned with a 202 is not the entity 
requested (cf. the recent long-running discussion on how to determine what 
entity a response carries)?

Actually I like the idea of just sending the progress messages always, 
for the reason that it will potentially provide a little more incentive 
for UA authors to implement it.

The downside is that since there's no indication from the client of 
support for it, the proxy can't then tell whether it should send 
progress, or revert to previous mechanisms for keeping UAs and their 
humans happy (e.g. drip-feeding part of the entity through).  We get a 
fair number of support calls from clients complaining about download 
timeouts and retries due to scanning without drip-feeding.  In fact 
getting rid of the abomination of drip-feeding is my main incentive for 
promoting this.
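
For comparison, if the client did advertise support, the proxy's choice 
could look roughly like the sketch below.  The function name and the 
strategy labels are assumptions for illustration, and "Progress" is again 
the token proposed in this thread.

```python
# Hypothetical sketch: a scanning proxy choosing how to keep the client
# alive while it holds back the entity for scanning.
def choose_keepalive_strategy(request_headers):
    """Return "progress" if the client advertised the Progress token,
    otherwise fall back to "drip-feed"."""
    connection = ""
    for name, value in request_headers.items():
        if name.lower() == "connection":  # header names are case-insensitive
            connection = value
    tokens = {t.strip().lower() for t in connection.split(",")}
    return "progress" if "progress" in tokens else "drip-feed"
```

Without any advertisement from the client, the proxy has nothing to 
branch on, which is the downside described above.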

Adrien

> ....Roy
Received on Friday, 28 May 2010 00:50:26 UTC