Re: Coalescing ranges from Mark Nottingham on 2015-07-25 (ietf-http-wg@w3.org from July to September 2015)

From: Mark Nottingham <mnot@mnot.net>
Date: Sat, 25 Jul 2015 12:29:58 +0200
To: "A. Rothman" <amichai2@amichais.net>
Cc: ietf-http-wg@w3.org
Message-Id: <7FE58DB2-A03D-4DAC-9406-3825C66901E0@mnot.net>

Hi,

The thing to keep in mind here is that just because the spec says you MAY do something, it doesn't imply that you MUST NOT do other things. I.e., we do not operate under "anything which is not expressly allowed is forbidden" rules. 

I agree that this could be written a bit more clearly; the normative MAY is somewhat spurious here.

Cheers,


> On 23 Jul 2015, at 7:56 pm, A. Rothman <amichai2@amichais.net> wrote:
> 
> Hi,
> 
> I'd like to raise an issue with the HTTP/1.1 RFCs regarding when the server can coalesce byte ranges requested by the client.
> 
> In RFC 2616, there's a section that says:
> 
> "When an HTTP message includes the content of a single range (for
> example, a response to a request for a single range, or to a request
> for a set of ranges that overlap without any holes), ..."
> 
> which implies that there may be other legitimate cases in which the server may choose to coalesce the ranges, and it's all good.
> 
> However, RFC 7233 says:
> 
> "When multiple ranges are requested, a server MAY coalesce any of the
> ranges that overlap, or that are separated by a gap that is smaller
> than the overhead of sending multiple parts, regardless of the order
> in which the corresponding byte-range-spec appeared in the received
> Range header field.  Since the typical overhead between parts of a
> multipart/byteranges payload is around 80 bytes, depending on the
> selected representation's media type and the chosen boundary
> parameter length, it can be less efficient to transfer many small
> disjoint parts than it is to transfer the entire selected
> representation."
> 
> Notably, it changed from a "for example" to a MAY with very detailed, very specific cases. Now it sounds to me like coalescing for other reasons is off the table.
> 
> I'd like to suggest that there may be other legitimate reasons why a server would want to coalesce the ranges, even if the exact efficiency calculations described above do not hold. Regardless of the reasons for doing so, I think this should be more explicitly allowed by the RFC, or at least implicitly allowed with a 'for example' disclaimer like it used to be. The simple reason is that it would still always be more efficient to return a range (even if it's a single large range) than to ignore the Range header and return the whole content with a 200 status, which the spec explicitly allows as an alternative. Disallowing it (even if implicitly, as the current wording might suggest) would make such simple but efficient optimizations invalid, thus making the byte-counting optimization described above somewhat... silly (for lack of a better word at the moment :-) ).
> 
> This of course would not affect existing servers, or new servers that want to implement any sort of complex calculations or heuristics on byte range responses (e.g. what is mentioned in the security considerations section), nor would it affect clients in any way since they must check the returned data ranges and handle any of these types of responses anyway.
> 
> If I've misinterpreted the RFC or missed some other section relevant to this issue, I'd be happy if someone can point it out :-)
> 
> Thanks,
> 
> Amichai
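
To make the coalescing rule quoted from RFC 7233 above concrete, here is a minimal sketch in Python. It is not taken from either RFC or from any server implementation. The names coalesce_ranges, covering_range and PART_OVERHEAD are hypothetical, and the 80-byte threshold is only the "typical" overhead figure from the quoted text, which a real server would presumably compute from the media type and boundary length. Ranges are assumed to be already-resolved inclusive (first, last) byte positions.

```python
from typing import List, Tuple

# Assumption: roughly 80 bytes of overhead per multipart/byteranges part,
# the "typical" figure mentioned in the quoted RFC 7233 text.
PART_OVERHEAD = 80

Range = Tuple[int, int]  # inclusive (first_byte, last_byte)


def coalesce_ranges(ranges: List[Range], max_gap: int = PART_OVERHEAD) -> List[Range]:
    """Merge ranges that overlap or are separated by fewer than max_gap bytes.

    The order in which the ranges were requested is ignored.
    """
    merged: List[Range] = []
    for first, last in sorted(ranges):
        # Number of bytes between the previous range and this one
        # (negative when the two ranges overlap).
        if merged and first - merged[-1][1] - 1 < max_gap:
            prev_first, prev_last = merged[-1]
            merged[-1] = (prev_first, max(prev_last, last))
        else:
            merged.append((first, last))
    return merged


def covering_range(ranges: List[Range]) -> Range:
    """Collapse everything into one range, whatever the gaps are.

    This is the simpler policy argued for in the quoted message: a single
    206 response that never sends more than a full 200 response would.
    """
    return (min(first for first, _ in ranges), max(last for _, last in ranges))


requested = [(500, 599), (0, 99), (150, 199), (50, 120)]
print(coalesce_ranges(requested))             # [(0, 199), (500, 599)]
print(coalesce_ranges(requested, max_gap=0))  # [(0, 120), (150, 199), (500, 599)]
print(covering_range(requested))              # (0, 599)
```

With max_gap=0 only overlapping ranges are merged; with the default, the 29-byte gap between bytes 120 and 150 is also absorbed. covering_range illustrates the "other legitimate reasons" case, where a server coalesces without doing any byte counting at all.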

--
Mark Nottingham   https://www.mnot.net/
Received on Saturday, 25 July 2015 10:30:28 UTC