- From: Zhong Yu <zhong.j.yu@gmail.com>
- Date: Fri, 5 Oct 2012 11:46:58 -0500
- To: "Roy T. Fielding" <fielding@gbiv.com>
- Cc: HTTP Working Group <ietf-http-wg@w3.org>
On Thu, Oct 4, 2012 at 2:21 AM, Roy T. Fielding <fielding@gbiv.com> wrote:
> On Oct 3, 2012, at 11:54 PM, Zhong Yu wrote:
>> On Thu, Oct 4, 2012 at 1:31 AM, Roy T. Fielding <fielding@gbiv.com> wrote:
>>> On Oct 3, 2012, at 10:04 PM, Zhong Yu wrote:
>>>
>>>> When a request contains a Range header, it specifies a (byte) range of
>>>> the representation body. However, the server doesn't know which
>>>> representation the client is talking about.
>>>
>>> The selected representation.
>>>
>>>> Here is an example of firefox failing to resume download a gzip-ed body:
>>>>
>>>> request 1
>>>>
>>>> GET / HTTP/1.1
>>>> Accept-Encoding: gzip, deflate
>>>>
>>>> response 1
>>>>
>>>> HTTP/1.1 200 OK
>>>> Accept-Ranges: bytes
>>>> Content-Encoding: gzip
>>>> ETag: "135e962713f.gz"
>>>> Last-Modified: Tue, 06 Mar 2012 19:00:37 GMT
>>>> Content-Length: 182,249,279
>>>>
>>>> Firefox decompress the body on the fly, and saves the decompressed
>>>> content to disk.
>>>>
>>>> Now pause the download, firefox has 68,712,649 bytes decompressed data on disk.
>>>>
>>>> Now resume the download, firefox tries to request range [68,712,649-]
>>>> of uncompressed body
>>>
>>> That's would be a bug in Firefox. Are you sure it does that?
>>
>> In which way this is a bug? How should Firefox behave?
>
> As I explained, it should be caching the original message and
> making range requests based on that -- not based on arbitrary
> decompressed disk files.
>
>>> Please tell me you just made up these examples -- there are no commas
>>> allowed in Content-Length and range specifiers.
>>>
>>>> request 2
>>>> GET / HTTP/1.1
>>>> Accept-Encoding: gzip, deflate
>>>> Range: bytes=68,712,649-
>>>> If-Match: "135e962713f.gz"
>>>> If-Unmodified-Since: Tue, 06 Mar 2012 19:00:37 GMT
>>>>
>>>> response 2
>>>>
>>>> HTTP/1.1 206 Partial Content
>>>> Accept-Ranges: bytes
>>>> Content-Range: bytes 68,712,649-182,249,278/182,249,279
>>>> Content-Encoding: gzip
>>>> ETag: "135e962713f.gz"
>>>> Last-Modified: Tue, 06 Mar 2012 19:00:37 GMT
>>>> Content-Length: 113,536,630
>>>>
>>>> Unfortunately the server has no idea that the range is for the
>>>> uncompressed body. It returns the range of the gzip-ed body, which
>>>> seems to be the best choice. Then firefox fails since it expects
>>>> uncompressed body.
>>>>
>>>> Is the server at fault here? Is there an understanding that Range is
>>>> always for the "plain" body without any Content-Encoding?
>>>
>>> The server is correct. The UA would be broken.
>>>
>>> Range is defined in terms of the entity-body (RFC2616) and the
>>> representation body (p2, p5). In both cases, the spec is clear
>>> that Content-Encoding is part of that body, though we could add
>>> more text to p5 to make that relationship clearer.
>>>
>>> Transfer-Encoding is applied after the body. That is, in fact,
>>> the main reason Transfer-Encoding was defined -- C-E doesn't
>>> work well for on-the-fly operations. A UA cannot combine
>>> on-the-fly decompression of C-E with range requests unless it
>>> is retaining the original message in cache.
>>
>> At least Firefox doesn't send "TE" header. Any idea how many UAs
>> support response "Transfer-Encoding: gzip"?
>
> Opera and a few command-line clients, that I know of. It has
> always been a chicken and egg problem to get T-E deployed.
>
>> Another confusion: if Content-Type=multipart/byteranges,
>> Content-Encoding=gzip, what is gzip-ed exactly? Is the message body
>>
>> gzip( multipart ( range ( plain_body ) ) )
>>
>> or
>>
>> multipart ( range ( gzip (plain_body ) ) )
>>
>> or something else?
>
> The second one.
>
> As in RFC2616 (I'd quote from p2, but we are just about to push
> a new draft), ranges are applied to the entity-body that would be
> sent in a normal GET, which in turn consists of:
>
> 7.2.1 Type
>
> When an entity-body is included with a message, the data type of that
> body is determined via the header fields Content-Type and
> Content-Encoding. These define a two-layer, ordered encoding model:
>
> entity-body := Content-Encoding( Content-Type( data ) )

But 206 and Content-Type=multipart/byteranges violates this pattern;
that should be specially noted in httpbis p2 section 3.

>
> Content-Type specifies the media type of the underlying data.
> Content-Encoding may be used to indicate any additional content
> codings applied to the data, usually for the purpose of data
> compression, that are a property of the requested resource. There is
> no default encoding.
>
> This is easier to describe in httpbis p2, right now, because
> we separated entity into two distinct things: payload (what is in
> a message) and representation (the content on which the message
> payload is based). The Range header field in p5 is still a bit
> opaque on the topic.
>
> ....Roy
>
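
To make the point of the thread concrete, here is a minimal Python sketch
(not taken from any browser or server) of why a resume offset has to count
bytes of the gzip-ed representation rather than bytes of the decompressed
file. The names plain, encoded, received, on_disk and apply_range are
hypothetical, and apply_range only imitates a server answering
"Range: bytes=<first>-" by slicing the encoded body. It assumes the client
keeps its decompressor state (or the original encoded bytes) across the pause.

```python
import gzip
import zlib

# Hypothetical resource: the "plain" data and its Content-Encoding: gzip form.
plain = b"".join(b"line %d of the plain body\n" % i for i in range(2000))
encoded = gzip.compress(plain)


def apply_range(body: bytes, first: int) -> bytes:
    """Imitate a server answering 'Range: bytes=<first>-' for this body."""
    return body[first:]


# The download is paused after `received` bytes of the encoded body.
received = len(encoded) // 3

# The client decompresses on the fly (16 + MAX_WBITS selects gzip framing).
decoder = zlib.decompressobj(16 + zlib.MAX_WBITS)
on_disk = decoder.decompress(encoded[:received])

# Correct resume: the offset is the number of encoded bytes already received,
# and decoding continues with the same decompressor state.
rest = apply_range(encoded, received)
resumed = on_disk + decoder.decompress(rest) + decoder.flush()

# Broken resume (the behaviour described in the thread): the offset is the
# number of decompressed bytes on disk, which is not a position in `encoded`.
wrong_offset = len(on_disk)

print("encoded bytes received:", received)
print("decompressed bytes on disk:", wrong_offset)
print("correct resume reproduces the plain body:", resumed == plain)
```

The two offsets differ because one indexes the encoded body and the other
the decoded data; a server can only interpret the first, which matches the
layering multipart ( range ( gzip (plain_body ) ) ) given above.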
Received on Friday, 5 October 2012 16:47:26 UTC