New issue: p5-range 5.4.2, proxy recommendations regarding 200 responses to Range

The recommendation that proxies should only return the requested range
to the client when receiving a 200 response from the server has some
undesired network effects.

The text currently reads:

p5-range
5.4.2.  Range Retrieval Requests

   If a proxy that supports ranges receives a Range request, forwards
   the request to an inbound server, and receives an entire entity in
   reply, it SHOULD only return the requested range to its client.  It
   SHOULD store the entire received response in its cache if that is
   consistent with its cache allocation policies.

Proposed change:

Reduce "SHOULD only return" to a MAY level requirement, and remove the
caching part (covered elsewhere), making the text read


   If a proxy that supports ranges receives a Range request, forwards
   the request to an inbound server, and receives an entire entity in
   reply, it MAY return only the requested range to its client.

And probably this should be restricted to 200 responses, with a reminder
that "supports ranges" also involves processing conditionals as suitable.
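To make "return only the requested range" concrete, here is a minimal
sketch of a proxy turning a complete 200 reply into a 206. The function
name and the tuple-based reply representation are invented for
illustration, the range is assumed to be already resolved to byte
offsets, and conditional handling (for example If-Range) is left out.

```python
def apply_range_to_full_reply(status, headers, body, first, last):
    """Slice a complete reply down to bytes [first, last] (inclusive).

    Hypothetical helper: replies are modelled as (status, headers, body)
    with headers as a dict and body as bytes. Anything other than a 200
    is handed back untouched, matching the suggested restriction.
    """
    if status != 200:
        return status, headers, body

    total = len(body)
    if first < 0 or first >= total or first > last:
        # Range cannot be satisfied from this body; forward the full reply.
        return status, headers, body

    last = min(last, total - 1)      # clamp an over-long range to the entity
    part = body[first:last + 1]

    new_headers = dict(headers)      # do not mutate the caller's headers
    new_headers["Content-Range"] = f"bytes {first}-{last}/{total}"
    new_headers["Content-Length"] = str(len(part))
    return 206, new_headers, part
```

With the proposed MAY, a proxy is equally free to skip this step and
forward the 200 as received.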


The intention of the original text is to optimize the last hop to the
client, but unfortunately it has some quite noticeable bad effects and
often does not make sense to implement as specified (SHOULD).

Some examples:

      * A client making a range request for the last 200 bytes of an 8TB
        file. The proxy has to receive nearly the whole entity before it
        can send anything, so the response "never" comes and the client
        usually times out.
      * "download accelerators" accelerating the problem by making many
        Range requests for different parts and as the proxy is masking
        the problem these "download accelerators" have no chance of
        realizing things have gone "bad".
      * Guaranteed extra network load if the resulting object is not
        cachable (or when there is no cache in the proxy). Many times
        the client really does intend to request the rest a little later,
        and if the object is not cached this results in yet another full
        download of the object by the proxy.
      * Bandwidth allocation policy. It's relatively easy to implement a
        reasonable bandwidth policy by downloading objects at about the
        same rate at which they can be delivered to the requesting client, but
        far from trivial to select a suitable download rate when only
        spooling the data into cache with no client waiting for the data
        currently received.
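The first example can be put into numbers. The sketch below resolves a
single byte range against a known entity length and reports how many
bytes a slicing proxy has to receive and discard before it can send its
first byte to the client. The function names are made up for this
illustration, "8TB" is assumed to mean 8 * 10**12 bytes, and multi-range
requests are simply not handled.

```python
def resolve_range(spec, total):
    """Resolve a single 'bytes=' range to inclusive (first, last) offsets.

    Returns None for multi-range, malformed, or unsatisfiable specs.
    """
    if not spec.startswith("bytes="):
        return None
    part = spec[len("bytes="):]
    if "," in part:
        return None                  # multiple ranges: out of scope here
    first_s, sep, last_s = part.partition("-")
    if not sep:
        return None
    try:
        if first_s == "":
            # Suffix form "bytes=-N": the last N bytes of the entity.
            n = int(last_s)
            if n <= 0:
                return None
            first, last = max(total - n, 0), total - 1
        else:
            first = int(first_s)
            last = min(int(last_s), total - 1) if last_s else total - 1
    except ValueError:
        return None
    if first < 0 or first > last or first >= total:
        return None
    return first, last


def bytes_discarded_before_first_output(spec, total):
    """Bytes of a full 200 body the proxy must read before the range starts."""
    resolved = resolve_range(spec, total)
    return None if resolved is None else resolved[0]


EIGHT_TB = 8 * 10**12                # assumption: decimal terabytes
skipped = bytes_discarded_before_first_output("bytes=-200", EIGHT_TB)
```

Here `skipped` is 200 bytes short of the full entity length. At an
assumed upstream rate of 100 MB/s that is roughly 22 hours during which
the client sees nothing.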

We (Squid) originally had Range implemented as specified, but because
this frequently caused more bandwidth issues and confusion than it
helped, we changed the implementation many years ago so that by default
we do NOT apply Range ourselves when getting a 200 response to a
forwarded Range request, and only apply Range to cached responses.
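As a rough illustration of that default (this is not Squid code; the
function, its arguments and the returned labels are all invented for the
example), the decision can be sketched as:

```python
def choose_action(request_has_range, reply_status, reply_from_cache,
                  slice_fresh_replies=False):
    """Decide what a proxy does with a reply to a client Range request.

    slice_fresh_replies is a hypothetical switch: False models the
    default described above, True models the current SHOULD wording.
    """
    if not request_has_range or reply_status != 200:
        # Nothing to slice: no Range asked for, or the reply is not a
        # complete 200 entity (e.g. the server already answered 206).
        return "forward-as-is"
    if reply_from_cache or slice_fresh_replies:
        # The full entity is already at hand (or the operator opted in),
        # so answering with just the range costs no extra upstream work.
        return "apply-range"
    # Fresh 200 from upstream: hand it to the client unchanged.
    return "forward-as-is"
```

Forwarding the 200 unchanged keeps the client aware that the server did
not honour the Range request, so it can react instead of timing out.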


Regarding the "SHOULD store the entity" part, this is highly redundant
with other parts of the specification and does not really add anything
by being specified here, just confusion, making one think that there is
something special about how 200 responses to a Range request should be
cached compared to 200 responses to non-Range requests.

Regards
Henrik

Received on Friday, 31 July 2009 09:37:18 UTC