Re: Range requests and content encoding

On 22 December 2015 at 12:28, Martin Thomson <martin.thomson@gmail.com>
wrote:

> RFC 7233 does not mention content encoding at all.  Same for transfer
> encoding.  I assume that is because this is completely unspecified and
> therefore completely unreliable, however, for my sanity...
>
> My reading is that a 206 response includes ranges of the encoded
> message, and that the content-encoding applies to the complete message
> body prior to being split into ranges.  Thus, if I had a "x2" content
> encoding that turned "Hello World!" into "HHeelllloo  WWoorrlldd!!",
> asking for bytes 3-5 would get you "eel" and not "llo".
>
> The text in Section 4.1 suggests that you would not include a
> Content-Encoding header field if the client used If-Range on the
> expectation that they already know.  That seems pretty dangerous, but
> it's consistent with the idea that you are repairing a larger message.
>
> On the other hand, I have to assume that a Transfer-Encoding applies
> *after* the range request.
>
> p.s., I've opened https://github.com/httpwg/http11bis/issues/11 for this.
>
>
That's been my understanding. C-E can be used to send offline-encoded
files (like .gz archives), so a range request against that representation
of that resource should target a slice of gzip-encoded data. (IOW:
bytes=0-2 of all C-E:gzip resources on the web should be identical.)
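
To make the ordering concrete, here is a minimal Python sketch of the
hypothetical "x2" coding from the quoted message. The helper names
(x2_encode, byte_range) are made up for illustration. Note that HTTP byte
offsets are zero-based and inclusive, so the slice that yields "eel" is
bytes=2-4. The last lines check the bytes=0-2 claim against Python's gzip
module, whose output starts with the fixed gzip magic number and the
deflate method byte.

```python
import gzip


def x2_encode(data: bytes) -> bytes:
    """Hypothetical "x2" content coding: emit every byte twice."""
    return bytes(b for byte in data for b in (byte, byte))


def byte_range(body: bytes, first: int, last: int) -> bytes:
    """Bytes selected by 'Range: bytes=first-last' (zero-based, inclusive)."""
    return body[first:last + 1]


plain = b"Hello World!"
encoded = x2_encode(plain)  # b"HHeelllloo  WWoorrlldd!!"

# The range is taken from the *encoded* representation ...
range_of_encoded = byte_range(encoded, 2, 4)  # b"eel"
# ... not from the unencoded data.
range_of_plain = byte_range(plain, 2, 4)      # b"llo"

# bytes=0-2 of any gzip-coded representation: ID1, ID2, CM (deflate).
gzip_prefix_a = byte_range(gzip.compress(b"first resource"), 0, 2)
gzip_prefix_b = byte_range(gzip.compress(b"a different resource"), 0, 2)
```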

And as you say, T-E applies *after* the slicing. That fact (and the absence
of T-E in H2) was what inspired Keith Morgan to start that big discussion a
while back about gzipping sliced-up log files, and led to
draft-kerwin-http2-encoded-data.
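
A rough sketch of that ordering, assuming "chunked" as the transfer coding:
the range is cut from the content-coded body first, and only the resulting
slice is framed for the wire. chunked() is an illustrative helper, not a
complete HTTP/1.1 implementation (no chunk extensions or trailers), and the
tiny chunk size is chosen only to make the framing visible.

```python
def byte_range(body: bytes, first: int, last: int) -> bytes:
    """Bytes selected by 'Range: bytes=first-last' (zero-based, inclusive)."""
    return body[first:last + 1]


def chunked(payload: bytes, chunk_size: int = 2) -> bytes:
    """Frame a payload with the chunked transfer coding (simplified)."""
    framed = b""
    for offset in range(0, len(payload), chunk_size):
        chunk = payload[offset:offset + chunk_size]
        # Each chunk: hex length, CRLF, data, CRLF.
        framed += b"%x\r\n" % len(chunk) + chunk + b"\r\n"
    # A zero-length chunk terminates the message body.
    return framed + b"0\r\n\r\n"


# Body after the (hypothetical) "x2" content coding has been applied.
encoded_body = b"HHeelllloo  WWoorrlldd!!"

partial = byte_range(encoded_body, 2, 4)  # step 1: slice the C-E'd body
on_the_wire = chunked(partial)            # step 2: apply T-E to the slice
```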

Regarding Section 4.1 and If-Range/Range/206: as I understand it,
conditional request conditions are tested *after* the representation is
selected (e.g. RFC 7232, Section 3.3 "...conditional on the selected
representation's modification date..."), which I assume means A-E/C-E has
already been resolved before looking at the If-Range/Range request headers.
If-Range uses the strong comparison function, which should be enough to
guarantee the content encoding*. It might be nice if there was some text in
RFC 7233 that spelled that out a bit better (or even a more explicit
pointer), but I don't know how to word it.

* RFC 7232, Section 2.1: "For example, if
   the origin server sends the same validator for a representation with
   a gzip content coding applied as it does for a representation with no
   content coding, then that validator is weak."
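
As a sketch of how I read that (not text from either RFC): the server
selects a representation from Accept-Encoding first, then applies the
strong comparison to the If-Range entity-tag. The ETag values, the
REPRESENTATIONS table and the function names below are all invented for
illustration, and only the entity-tag form of If-Range is handled (not the
HTTP-date form).

```python
import gzip

PLAIN = b"Hello World!" * 10

# One resource, two representations, each with its own strong validator.
REPRESENTATIONS = {
    "identity": ('"v1"', PLAIN),
    "gzip": ('"v1-gzip"', gzip.compress(PLAIN)),
}


def strong_match(a: str, b: str) -> bool:
    """Strong comparison: neither tag is weak and the tags are identical."""
    return not a.startswith("W/") and not b.startswith("W/") and a == b


def select(accept_encoding: str):
    """Very naive content negotiation on Accept-Encoding."""
    coding = "gzip" if "gzip" in accept_encoding else "identity"
    return REPRESENTATIONS[coding]


def handle(accept_encoding: str, if_range, first: int, last: int):
    """Return (status, etag, body) for a GET carrying a Range header."""
    etag, body = select(accept_encoding)  # A-E/C-E resolved first
    if if_range is not None and not strong_match(if_range, etag):
        # Validator does not match the selected representation:
        # ignore Range and send the whole thing.
        return 200, etag, body
    return 206, etag, body[first:last + 1]


same_coding = handle("gzip", '"v1-gzip"', 0, 2)
other_coding = handle("gzip", '"v1"', 0, 2)
weak_tag = handle("gzip", 'W/"v1-gzip"', 0, 2)
```

A client that holds part of the gzip-coded representation only gets a 206
when its validator matches that same coded representation; otherwise it
gets the full body back, Content-Encoding and all.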


Cheers
-- 
  Matthew Kerwin
  http://matthew.kerwin.net.au/

Received on Tuesday, 22 December 2015 04:56:52 UTC