Re: Ambiguity in the Range header

On Thu, Oct 4, 2012 at 1:31 AM, Roy T. Fielding <fielding@gbiv.com> wrote:
> On Oct 3, 2012, at 10:04 PM, Zhong Yu wrote:
>
>> When a request contains a Range header, it specifies a (byte) range of
>> the representation body. However, the server doesn't know which
>> representation the client is talking about.
>
> The selected representation.
>
>> Here is an example of Firefox failing to resume the download of a gzip-ed body:
>>
>> request 1
>>
>>  GET / HTTP/1.1
>>  Accept-Encoding: gzip, deflate
>>
>> response 1
>>
>>  HTTP/1.1 200 OK
>>  Accept-Ranges: bytes
>>  Content-Encoding: gzip
>>  ETag: "135e962713f.gz"
>>  Last-Modified: Tue, 06 Mar 2012 19:00:37 GMT
>>  Content-Length: 182,249,279
>>
>> Firefox decompresses the body on the fly and saves the decompressed
>> content to disk.
>>
>> Now pause the download; Firefox has 68,712,649 bytes of decompressed data on disk.
>>
>> Now resume the download; Firefox tries to request range [68,712,649-]
>> of the uncompressed body
>
> That would be a bug in Firefox.  Are you sure it does that?
> Please tell me you just made up these examples -- there are no commas
> allowed in Content-Length and range specifiers.

I added the commas for readability; other than that, the headers are
all real.

After pausing the download, I checked the download folder. The partial
file is in decompressed form, with length L. When resuming the
download, Firefox sends "Range: bytes=L-", so I assume the range is
intended for the plain body.
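
To make the mismatch concrete, here is a minimal Python sketch (not
Firefox's actual code; the payload and the helper name simulate_pause
are made up for illustration). It gzips a body, pretends the transfer
stopped halfway through the encoded bytes, and compares the two offsets
a client could resume from:

```python
import gzip
import zlib


def simulate_pause(encoded: bytes, received: int) -> tuple[int, int]:
    """Return (encoded bytes received, decoded bytes on disk) at pause time.

    Hypothetical helper: models a client that inflates the body on the fly
    and keeps only the decoded output.
    """
    # Adding 16 to the window size tells zlib to expect gzip framing.
    inflater = zlib.decompressobj(16 + zlib.MAX_WBITS)
    decoded = inflater.decompress(encoded[:received])
    return received, len(decoded)


# Illustrative payload; any compressible body shows the same effect.
plain = b"range offsets count encoded bytes\n" * 5000
encoded = gzip.compress(plain)

received, on_disk = simulate_pause(encoded, len(encoded) // 2)

# Offset into the gzip-ed body, which is what the server will slice.
print(f"offset in encoded body: Range: bytes={received}-")
# Offset derived from the decoded file, as in the request I observed.
print(f"offset in decoded file: Range: bytes={on_disk}-")
```

The two offsets differ, so a range computed from the decoded file
points at an unrelated position in the gzip-ed body.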

>> request 2
>>
>>  GET / HTTP/1.1
>>  Accept-Encoding: gzip, deflate
>>  Range: bytes=68,712,649-
>>  If-Match: "135e962713f.gz"
>>  If-Unmodified-Since: Tue, 06 Mar 2012 19:00:37 GMT
>>
>> response 2
>>
>>  HTTP/1.1 206 Partial Content
>>  Accept-Ranges: bytes
>>  Content-Range: bytes 68,712,649-182,249,278/182,249,279
>>  Content-Encoding: gzip
>>  ETag: "135e962713f.gz"
>>  Last-Modified: Tue, 06 Mar 2012 19:00:37 GMT
>>  Content-Length: 113,536,630
>>
>> Unfortunately the server has no idea that the range is for the
>> uncompressed body. It returns the range of the gzip-ed body, which
>> seems to be the best choice. Then Firefox fails since it expects
>> an uncompressed body.
>>
>> Is the server at fault here? Is there an understanding that Range is
>> always for the "plain" body without any Content-Encoding?
>
> The server is correct.  The UA would be broken.
>
> Range is defined in terms of the entity-body (RFC2616) and the
> representation body (p2, p5).  In both cases, the spec is clear
> that Content-Encoding is part of that body, though we could add
> more text to p5 to make that relationship clearer.
>
> Transfer-Encoding is applied after the body.  That is, in fact,
> the main reason Transfer-Encoding was defined -- C-E doesn't
> work well for on-the-fly operations.  A UA cannot combine
> on-the-fly decompression of C-E with range requests unless it
> is retaining the original message in cache.
>
> ....Roy
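
For what it's worth, here is how I read that last point: a UA that
retains the encoded bytes it received can resume safely, because the
offset it sends is then an offset into the gzip-ed body. A minimal
Python sketch, assuming a hypothetical in-memory server (serve_range)
standing in for a real 206 response, and a made-up payload:

```python
import gzip
from typing import Callable


def serve_range(encoded: bytes, first: int) -> bytes:
    """Hypothetical server: answer 'Range: bytes=<first>-' over the encoded body."""
    return encoded[first:]


def resume(encoded_on_disk: bytes, fetch_range: Callable[[int], bytes]) -> bytes:
    """Resume from the number of *encoded* bytes kept, then decode once."""
    first = len(encoded_on_disk)          # offset into the gzip-ed body
    rest = fetch_range(first)             # body of the 206 response
    return gzip.decompress(encoded_on_disk + rest)


# Illustrative payload and an interrupted transfer of the encoded body.
plain = b"Content-Encoding is part of the representation\n" * 2000
encoded = gzip.compress(plain)
partial = encoded[: len(encoded) // 3]

restored = resume(partial, lambda first: serve_range(encoded, first))
print(restored == plain)
```

The decoded file can still be produced at the end; the point is only
that the resume offset has to come from the encoded byte count.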

Received on Thursday, 4 October 2012 06:40:55 UTC