Re: Ambiguity in the Range header

On Oct 3, 2012, at 10:04 PM, Zhong Yu wrote:

> When a request contains a Range header, it specifies a (byte) range of
> the representation body. However, the server doesn't know which
> representation the client is talking about.

The selected representation.

> Here is an example of Firefox failing to resume the download of a gzip-ed body:
> 
> request 1
> 
>  GET / HTTP/1.1
>  Accept-Encoding: gzip, deflate
> 
> response 1
> 
>  HTTP/1.1 200 OK
>  Accept-Ranges: bytes
>  Content-Encoding: gzip
>  ETag: "135e962713f.gz"
>  Last-Modified: Tue, 06 Mar 2012 19:00:37 GMT
>  Content-Length: 182,249,279
> 
> Firefox decompresses the body on the fly, and saves the decompressed
> content to disk.
> 
> Now pause the download; Firefox has 68,712,649 bytes of decompressed data on disk.
> 
> Now resume the download; Firefox tries to request range [68,712,649-]
> of the uncompressed body

That would be a bug in Firefox.  Are you sure it does that?
Please tell me you just made up these examples -- there are no commas
allowed in Content-Length and range specifiers.
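
To illustrate the grammar point: in RFC 2616, Content-Length is 1*DIGIT,
and a byte-range-spec is a first-byte-pos, a hyphen, and an optional
last-byte-pos, each made of digits only. A comma in a Range header
separates multiple range specs; it is not a thousands separator. Below is
a minimal Python sketch of strict parsing. It handles a single range only,
and the function names are hypothetical, not taken from any real server.

```python
import re

# ASCII digits only; "\d" would also accept non-ASCII digits.
_DIGITS = re.compile(r"[0-9]+")
_RANGE_SPEC = re.compile(r"([0-9]+)-([0-9]*)")


def parse_content_length(value):
    """Return the length as an int; reject anything that is not 1*DIGIT."""
    value = value.strip()
    if not _DIGITS.fullmatch(value):
        raise ValueError("invalid Content-Length: %r" % value)
    return int(value)


def parse_single_byte_range(value):
    """Parse 'bytes=first-[last]' and return (first, last or None).

    Simplification for this sketch: multi-range sets and suffix ranges
    ('bytes=-500') are rejected rather than handled.
    """
    unit, sep, spec = value.strip().partition("=")
    if unit != "bytes" or not sep:
        raise ValueError("unsupported range unit: %r" % value)
    match = _RANGE_SPEC.fullmatch(spec)
    if match is None:
        raise ValueError("invalid byte-range-spec: %r" % spec)
    first = int(match.group(1))
    last = int(match.group(2)) if match.group(2) else None
    if last is not None and last < first:
        raise ValueError("last-byte-pos is less than first-byte-pos")
    return first, last
```

With these, "182249279" and "bytes=68712649-" parse, while the
comma-grouped forms shown in the quoted messages raise ValueError.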

> request 2
>  GET / HTTP/1.1
>  Accept-Encoding: gzip, deflate
>  Range: bytes=68,712,649-
>  If-Match: "135e962713f.gz"
>  If-Unmodified-Since: Tue, 06 Mar 2012 19:00:37 GMT
> 
> response 2
> 
>  HTTP/1.1 206 Partial Content
>  Accept-Ranges: bytes
>  Content-Range: bytes 68,712,649-182,249,278/182,249,279
>  Content-Encoding: gzip
>  ETag: "135e962713f.gz"
>  Last-Modified: Tue, 06 Mar 2012 19:00:37 GMT
>  Content-Length: 113,536,630
> 
> Unfortunately the server has no idea that the range is for the
> uncompressed body. It returns the range of the gzip-ed body, which
> seems to be the best choice. Then Firefox fails since it expects
> the uncompressed body.
> 
> Is the server at fault here? Is there an understanding that Range is
> always for the "plain" body without any Content-Encoding?

The server is correct.  The UA would be broken.

Range is defined in terms of the entity-body (RFC2616) and the
representation body (p2, p5).  In both cases, the spec is clear
that Content-Encoding is part of that body, though we could add
more text to p5 to make that relationship clearer.

Transfer-Encoding is applied after the body.  That is, in fact,
the main reason Transfer-Encoding was defined -- C-E doesn't
work well for on-the-fly operations.  A UA cannot combine
on-the-fly decompression of C-E with range requests unless it
is retaining the original message in cache.
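
A small Python sketch of that last point, using only the standard gzip and
zlib modules. The serve_range helper is a hypothetical stand-in for a
server answering "Range: bytes=first-" against the gzip-encoded body, and
the sample data is made up. Resuming works when the offset counts the
encoded bytes the client has retained; the count of decoded bytes written
to disk is a different number and points somewhere else in the body.

```python
import gzip
import zlib


def serve_range(encoded_body, first):
    """Hypothetical server: the range selects bytes of the encoded body."""
    return encoded_body[first:]


def resume_with_encoded_offset(retained, encoded_body):
    """Resume from the number of encoded bytes kept, then decode the whole."""
    rest = serve_range(encoded_body, len(retained))
    return gzip.decompress(retained + rest)


def decoded_prefix(retained):
    """What on-the-fly decoding would have written to disk so far."""
    decoder = zlib.decompressobj(16 + zlib.MAX_WBITS)  # 16 + MAX_WBITS = gzip
    return decoder.decompress(retained)


# Made-up resource; the server sends it with Content-Encoding: gzip.
original = b"".join(b"line %d of the resource\n" % i for i in range(5000))
encoded = gzip.compress(original)

# The transfer is paused after a third of the encoded bytes has arrived.
retained = encoded[: len(encoded) // 3]

# Offset taken from the retained encoded bytes: the pieces join cleanly.
assert resume_with_encoded_offset(retained, encoded) == original

# Offset taken from the decoded bytes on disk: not the same position.
on_disk = decoded_prefix(retained)
print("encoded bytes retained:", len(retained))
print("decoded bytes on disk: ", len(on_disk))
```

The two printed counts differ, which is why a client that discards the
encoded bytes has no valid offset to put in the Range header.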

....Roy

Received on Thursday, 4 October 2012 06:32:22 UTC