Re: New issue: p5-range 5.4.2, proxy recommendations regarding 200 responses to Range from Yves Lafon on 2010-07-23 (ietf-http-wg@w3.org from July to September 2010)

From: Yves Lafon <ylafon@w3.org>
Date: Fri, 23 Jul 2010 08:58:45 -0400 (EDT)
To: Henrik Nordstrom <henrik@henriknordstrom.net>
cc: HTTP Working Group <ietf-http-wg@w3.org>
Message-ID: <alpine.DEB.1.10.1007230836390.19648@wnl.j3.bet>
On Fri, 31 Jul 2009, Henrik Nordstrom wrote:

> The recommendation that proxies should only return the requested range
> to the client when receiving a 200 response from the server has some
> undesired network effects.
>
> The text currently reads:
>
> p5-range
> 5.4.2.  Range Retrieval Requests
>
>   If a proxy that supports ranges receives a Range request, forwards
>   the request to an inbound server, and receives an entire entity in
>   reply, it SHOULD only return the requested range to its client.  It
>   SHOULD store the entire received response in its cache if that is
>   consistent with its cache allocation policies.
>
> Proposed change:
>
> Reduce "SHOULD only return" to a MAY level requirement, and remove the
> caching part (covered elsewhere), making the text read
>
>   If a proxy that supports ranges receives a Range request, forwards
>   the request to an inbound server, and receives an entire entity in
>   reply, it MAY return only the requested range to its client.

The text for byte-ranges says:
<<
If a syntactically valid byte-range-set includes at least one 
byte-range-spec whose first-byte-pos is less than the current length of 
the representation, or at least one suffix-byte-range-spec with a non-zero 
suffix-length, then the byte-range-set is satisfiable. Otherwise, the 
byte-range-set is unsatisfiable. If the byte-range-set is unsatisfiable, 
the server SHOULD return a response with a status of 416 (Requested range 
not satisfiable). Otherwise, the server SHOULD return a response with a 
status of 206 (Partial Content) containing the satisfiable ranges of the 
representation.
>>
I don't see why it would be a MAY for a proxy and a SHOULD for a server;
it seems better to keep them both as SHOULDs.
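
To make the satisfiability rule quoted above concrete, here is a minimal
sketch in Python. The function names (parse_byte_range_set, is_satisfiable,
status_for_range) are mine, not from the draft, and the parser is
deliberately simplified (no empty list elements). Treating a syntactically
invalid Range header as "ignore it and answer 200" is my reading of the
spec, not something the quoted paragraph states.

```python
from typing import List, Optional, Tuple

# (first_byte_pos, last_byte_pos or None) for a byte-range-spec,
# (None, suffix_length) for a suffix-byte-range-spec.
RangeSpec = Tuple[Optional[int], Optional[int]]


def _is_digits(s: str) -> bool:
    # ASCII digits only, so int() below cannot raise.
    return s.isascii() and s.isdigit()


def parse_byte_range_set(header: str) -> Optional[List[RangeSpec]]:
    """Parse e.g. 'bytes=0-99,-200'; return None if syntactically invalid."""
    unit, sep, rest = header.partition("=")
    if sep != "=" or unit.strip().lower() != "bytes":
        return None
    specs: List[RangeSpec] = []
    for part in rest.split(","):
        first, dash, last = part.strip().partition("-")
        if dash != "-":
            return None
        if first == "":
            # suffix-byte-range-spec: "-N"
            if not _is_digits(last):
                return None
            specs.append((None, int(last)))
        elif not _is_digits(first):
            return None
        elif last == "":
            # open-ended byte-range-spec: "N-"
            specs.append((int(first), None))
        elif _is_digits(last) and int(last) >= int(first):
            specs.append((int(first), int(last)))
        else:
            return None
    return specs


def is_satisfiable(specs: List[RangeSpec], length: int) -> bool:
    """Apply the quoted rule against the current representation length."""
    for first, last in specs:
        if first is None:
            # Suffix range: satisfiable when the suffix-length is non-zero.
            if last is not None and last > 0:
                return True
        elif first < length:
            return True
    return False


def status_for_range(header: str, length: int) -> int:
    """Status a server SHOULD pick: 206, 416, or 200 if Range is ignored."""
    specs = parse_byte_range_set(header)
    if specs is None:
        return 200  # assumption: an invalid Range header is ignored
    return 206 if is_satisfiable(specs, length) else 416
```

With this sketch, a request for "bytes=-200" against an 8TB representation
comes out as 206, while "bytes=500-" against a 100-byte representation
comes out as 416.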

>   If a proxy that supports ranges receives a Range request, forwards
>   the request to an inbound server, and receives an entire entity in
>   reply, it MAY return only the requested range to its client.
>
> This should probably be restricted to 200 responses, with a reminder
> that "supports ranges" also involves processing conditionals as suitable.
>
>
> The intention of the original text is to optimize the last hop to the
> client, but unfortunately it has some quite noticeable bad effects and
> often does not make sense to implement as specified (SHOULD).
>
> Some examples:
>
>      * A client making a range request for the last 200 bytes of an 8TB
>        file. As the response "never" comes the client usually times
>        out.
Is it better to transmit the 8TB back to the client? It probably depends 
on what the purpose of the proxy is.

>      * "download accelerators" accelerating the problem by making many
>        Range requests for different parts and as the proxy is masking
>        the problem these "download accelerators" have no chance of
>        realizing things have gone "bad".
I understand this as "the proxy will fetch the whole entity several times",
which depends on the proxy implementation.

>      * Guaranteed extra network load if the resulting object is not
>        cachable (or when there is no cache in the proxy). Many times
>        the client really does intend to request the rest a little later,
>        and if the object is not cached this results in yet another full
>        download of the object by the proxy.

Agreed (although, as with the first example, it may make sense to have the
proxy consume bandwidth on one side to save on the other).

>      * Bandwidth allocation policy. It's relatively easy to implement a
>        reasonable bandwidth policy by downloading objects at about the
>        same rate they can be delivered to the requesting client, but
>        far from trivial to select a suitable download rate when only
>        spooling the data into cache with no client waiting for the data
>        currently received.
>
> We (Squid) originally had Range implemented as specified, but because
> this frequently caused more bandwidth issues and confusion than it
> helped, we changed the implementation many years ago so that, by
> default, we do NOT implement Range ourselves when getting a 200 response
> to a forwarded Range request, only implementing Range on cached responses.

Which is a carefully weighed decision, and so consistent with the
definition of SHOULD.
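
To illustrate the choice being discussed, here is a rough Python sketch of
the decision a proxy faces when a forwarded Range request comes back as a
full 200. Everything in it is hypothetical: RangePolicy, its
slice_fresh_200 knob, and answer_range are names I made up, it handles a
single already-resolved byte range, and it holds the whole body in memory,
which a real proxy would not do. With the default setting it follows the
behaviour Henrik describes (pass a fresh 200 through untouched, apply Range
only to cached responses); flipping the knob gives the behaviour the
current text recommends.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# (status code, extra headers, payload)
Reply = Tuple[int, Dict[str, str], bytes]


@dataclass
class RangePolicy:
    # Hypothetical knob: False mirrors the default Henrik describes,
    # True mirrors the "SHOULD only return the requested range" text.
    slice_fresh_200: bool = False


def answer_range(policy: RangePolicy, first: int, last: int,
                 status: int, body: bytes, from_cache: bool) -> Reply:
    """Decide what to send the client for a request of bytes first-last."""
    if status != 200:
        # Anything other than a full 200 is relayed unchanged.
        return status, {}, body
    if not (from_cache or policy.slice_fresh_200):
        # Fresh 200 from the server: hand the whole entity to the client.
        return 200, {}, body
    total = len(body)
    if first >= total:
        # Range starts past the end of the entity: not satisfiable.
        return 416, {"Content-Range": f"bytes */{total}"}, b""
    last = min(last, total - 1)  # clamp to the end of the entity
    headers = {"Content-Range": f"bytes {first}-{last}/{total}"}
    return 206, headers, body[first:last + 1]
```

Either setting is a defensible reading of a SHOULD; the sketch only makes
explicit that the difference is a policy decision rather than a protocol
requirement.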

> Regarding the "SHOULD store the entity" part, this is highly redundant
> with other parts of the specification and does not really add anything
> that needs to be specified here, just confusion, making one think that
> 200 responses to a Range request should somehow be cached differently
> from 200 responses to non-Range requests.

I completely agree, striking "It SHOULD store the entire received response 
in its cache if that is consistent with its cache allocation policies."

-- 
Baroula que barouleras, au tiéu toujou t'entourneras.

         ~~Yves
Received on Friday, 23 July 2010 12:58:47 UTC