- From: Roy T. Fielding <fielding@kiwi.ICS.UCI.EDU>
- Date: Thu, 20 Feb 1997 18:56:27 -0800
- To: Jeffrey Mogul <mogul@pa.dec.com>
- Cc: http-wg@cuckoo.hpl.hp.com
I think I understand now what Henrik was describing, and I agreed that
the description of Accept-Encoding needs to be fixed. However, not all
of it is broken.

Jeffrey Mogul writes:
>There are three problems with RFC2068 that would prevent the most
>efficient use of this compression algorithm, and that might result
>in presenting users with bogus results:
>
>    (1) The current specification of Accept-encoding *requires*
>    (SHOULD-level, not MUST-level) a server to return an
>    error response in a situation where this is probably
>    not optimal. This might lead to many extra round-trips,
>    and might also lead to the destruction of otherwise
>    useful proxy-cache entries.

We should fix that in the spec.

>    (2) The current specification of Accept-encoding *allows*
>    a server to send a response using an encoding that the
>    client software might not only not understand, but which
>    it might improperly render to an unwitting user.

Yes. It does so because there may exist some external mechanism, such
as a choice list provided in HTML, whereby an older user agent might
GET on a particular URL known to the human user to be in a specific
encoding, even though the user agent has no knowledge of that encoding.
Keep in mind that not all GET requests are done for the purpose of
rendering on a browser, and therefore the protocol must not create
artificial requirements that restrict the content of the payload of
a response.

For example, I have a save-URL-to-file program called lwpget. It never
sends Accept-Encoding, because there has never been any requirement
that it should, and no HTTP/1.0 server ever needed it. Should lwpget
be prevented from working because of the *possibility* that it might
be a rendering engine that doesn't understand the encoding?

>    (3) The current design allows an HTTP/1.0 cache to return
>    an encoded response to an HTTP/1.0 client, in such a way
>    as to cause the client to render garbage to an unwitting user.

Yes.
It is the responsibility of the origin server to prevent this from
happening by accident. It is not possible to prevent it from happening
on purpose, because attempting to do so breaks my (2).

>Jim's proposal is:
>
>    If an Accept-Encoding header is present, and if the server cannot
>    send a response which is acceptable according to the
>    Accept-Encoding header, then the server SHOULD send a response
>    using the default (identity) encoding; if the identity encoding
>    is not available, then the server SHOULD send an error response
>    with the 406 (Not Acceptable) status code.
>
>That solves the problem with scenario #2, but not with scenario #1.

Yes, that works because it is prefixed with "If an Accept-Encoding
header is present ..."

>I have three different proposals to solve these two problems,
>in order of increasing distance from current practice (and in
>order of increasing precision, I think).
>
>The simplest change would be to say:
>
>    If no Accept-Encoding header is present in the request, then
>    the server SHOULD respond using one of
>        o the default (identity) content-coding; or
>        o the "compress" content-coding; or
>        o the "gzip" content-coding
>    It MUST NOT respond using any other content-coding. If none
>    of these content-codings is available, the server SHOULD send
>    an error response with the 406 (Not Acceptable) status code.

That would break my application.

>        Note: the use of unsolicited compressed encodings may
>        lead to confusing errors in rendering the response, and
>        should be done with caution.
>
>    If an Accept-encoding header is present, and if the server cannot
>    send a response which is acceptable according to the
>    Accept-Encoding header, then the server SHOULD send a response
>    using the default (identity) content-coding; it MUST NOT send a
>    non-identity content-coding not listed in the Accept-encoding
>    header.
>    If, in this case, the identity content-coding is not
>    available, then the server SHOULD send an error response with the
>    406 (Not Acceptable) status code.
>
>Actually, because the HTTP/1.1 spec does not explicitly require
>a client to support any of the non-identity content-codings, it
>seems smarter to use something like the following wording instead:
>
>    If no Accept-Encoding header is present in the request, then
>    the server SHOULD respond using the default (identity) content-coding.
>    It MUST NOT respond using any other content-coding. If none
>    of these content-codings is available, the server SHOULD send
>    an error response with the 406 (Not Acceptable) status code.

That would break my application.

>    If an Accept-encoding header is present, and if the server cannot
>    send a response which is acceptable according to the
>    Accept-Encoding header, then the server SHOULD send a response
>    using the default (identity) content-coding; it MUST NOT send a
>    non-identity content-coding not listed in the Accept-encoding
>    header. If, in this case, the identity content-coding is not
>    available, then the server SHOULD send an error response with the
>    406 (Not Acceptable) status code.
>
>And, if we want to make it possible for a client to say "send me
>a compressed encoding or send me nothing", then I'd propose this
>pair of changes
>
>(1) in section 3.5 (Content Codings), add this after the item
>for "deflate"
>
>    identity    The default (identity) encoding; the use
>                of no transformation whatsoever. This
>                content-coding is used only in the
>                Accept-encoding header, and SHOULD NOT
>                be used in Content-coding header.

That would be fine, though an Accept-Encoding with no value was
originally intended to mean "I only accept the identity encoding".

>====================
>
>Now, on to problem #3.
>
>Suppose one has this configuration:
>
>
>                                            |--- HTTP/1.1 client A
>                                            |
>HTTP/1.1 server S ---- HTTP/1.0 proxy P ----
>                       with cache           |
>                                            |--- HTTP/1.0 client B
>
>Now suppose that client A does
>    GET http://S/foo.html HTTP/1.1
>    Host: S
>    Accept-Encoding: zipflate
>
>via proxy P, which forwards it to server S, which responds with
>
>    HTTP/1.1 200 OK
>    Content-Encoding: zipflate
>    Content-type: text/html
>    Last-Modified: .....
>    Expires: .....
>    Cache-control: .....

You forgot to add

    Vary: Accept-Encoding

Yes, I know that an HTTP/1.0 proxy cache will probably ignore it,
but there is only so much we can do without breaking the protocol.

>Proxy P caches this response and forwards it to client A. So far,
>so good.
>
>Soon thereafter (before the Expires time), client B decides to issue its
>own request for the same URL:
>    GET http://S/foo.html HTTP/1.0
>
>Since HTTP/1.0 proxy P doesn't understand "Accept-Encoding", as far as
>I can tell, it's likely to return the cached response to B. But client
>B's HTTP/1.0 browser won't know how to render it. If that software
>is smart, it might re-issue the request with a "Pragma: no-cache".
>But I doubt that any existing browsers are this smart, with the
>result that B's user (e.g., my mom) would be faced with a mysterious
>error message (or a screen full of garbage).

The answer is: get a better proxy cache. Seriously, there comes a point
when we must recognize the limitations of older technology and move on.
Barring fatal problems (this is not one of them), it is appropriate that
users of inadequate technology receive inadequate results.

Keep in mind, however, that this particular scenario only occurs if the
URL in question has negotiated responses based on Accept-Encoding.
It is quite reasonable for the origin server to modify its negotiation
algorithm based on the capabilities of the user agent, or even the fact
that it was passed through a particular cache; I even described that in
section 12.1.
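
To illustrate the Vary point in the scenario above, here is a minimal
Python sketch of how a cache that honors Vary would decide whether a
stored response may be reused. It is only a sketch under stated
assumptions: the names CachedResponse, VaryAwareCache, store and lookup
are hypothetical, and the matching rule is a simplified reading of Vary
(a stored response is reusable only if the request header fields it
names have the same values as in the request that produced it). An
HTTP/1.0 proxy such as P would skip this check entirely.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class CachedResponse:
    headers: Dict[str, str]          # response headers, names lower-cased
    body: bytes
    request_headers: Dict[str, str]  # headers of the request that produced it


def _normalize(value: Optional[str]) -> str:
    """Lower-case and strip a comma-separated header value; absent means empty."""
    if value is None:
        return ""
    return ",".join(p.strip().lower() for p in value.split(",") if p.strip())


@dataclass
class VaryAwareCache:
    """Hypothetical cache keyed by URL, with one entry per stored variant."""
    entries: Dict[str, List[CachedResponse]] = field(default_factory=dict)

    def store(self, url: str, request_headers: Dict[str, str],
              response_headers: Dict[str, str], body: bytes) -> None:
        entry = CachedResponse(
            headers={k.lower(): v for k, v in response_headers.items()},
            body=body,
            request_headers={k.lower(): v for k, v in request_headers.items()},
        )
        self.entries.setdefault(url, []).append(entry)

    def lookup(self, url: str,
               request_headers: Dict[str, str]) -> Optional[CachedResponse]:
        req = {k.lower(): v for k, v in request_headers.items()}
        for entry in self.entries.get(url, []):
            vary = entry.headers.get("vary", "")
            if vary.strip() == "*":
                continue  # "Vary: *" can never be matched from the cache
            names = [n.strip().lower() for n in vary.split(",") if n.strip()]
            # Reuse only if every header named in Vary matches the original request.
            if all(_normalize(req.get(n)) == _normalize(entry.request_headers.get(n))
                   for n in names):
                return entry
        return None


if __name__ == "__main__":
    cache = VaryAwareCache()
    # Client A's exchange, with the Vary header added by the origin server.
    cache.store(
        "http://S/foo.html",
        {"Host": "S", "Accept-Encoding": "zipflate"},
        {"Content-Encoding": "zipflate", "Vary": "Accept-Encoding"},
        b"<encoded bytes>",
    )
    # Client B sends no Accept-Encoding, so the stored variant is not reused.
    print(cache.lookup("http://S/foo.html", {}))
```

With the Vary header stored, client B's request misses the cache and
would be forwarded to S. If the origin server omits Vary, the same
lookup returns the zipflate entry, which is the failure described in
the quoted scenario.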
>I suppose one could hope that HTTP/1.0 caches don't store responses
>with a Content-encoding header, but I looked at the sources for
>the CERN httpd, and it doesn't seem to pay any attention.
>
>The HTTP/1.0 "specification" defines the "gzip" and "compress"
>content-codings, but does not define "deflate", so it is reasonable
>to assume that many (if not all) HTTP/1.0 clients and proxies do
>not understand the full set of content-codings already specified
>in HTTP/1.1, let alone anything new that comes along.

No, that is not a reasonable assumption. While browsers may not
understand those encodings for the purpose of in-line rendering,
not all browser requests are for the purpose of rendering, and not
all clients are browsers.

>I propose that we add a new status code, analogous to 206 (Partial
>Content), to be used on all HTTP/1.1 responses with a non-identity
>Content-coding. For example, 207 (Encoded Content). This would allow
>HTTP/1.0 caches to forward, but not to cache, the response; it would
>allow HTTP/1.1 implementations to do whatever is appropriate. (I.e.,
>an HTTP/1.1 cache would have to check the Content-Encoding against the
>Accept-Encoding of a subsequent request.)

No, I don't even consider that an option. 206 works because the user
agent will only receive it if it asks for a Range, and therefore we know
that it won't puke. Besides, it breaks the distinction between the
response status and the payload content, which would be extremely
depressing for the future evolution of HTTP.

I suggest the following instead:

    If no Accept-Encoding field is present in a request, the server
    MAY assume that the client will accept any content coding.
    However, if the response content is negotiated on the basis of
    Accept-Encoding, then the origin server SHOULD select a
    representation without any Content-Encoding if one is available;
    if all available representations use a non-identity content-coding,
    then preference should be given to those content-coding(s) commonly
    understood by older user agents, or known to be understood by the
    particular user agent that initiated the request.

    If an Accept-Encoding field is present, and if the server cannot
    send a response which is acceptable according to the
    Accept-Encoding field, then the server SHOULD send a response
    using the default (identity) content-coding; it MUST NOT send a
    non-identity content-coding not listed in the Accept-Encoding
    field. If, in this case, the identity content-coding is not
    available, then the server SHOULD send an error response with the
    406 (Not Acceptable) status code.

I think that will accomplish the desired effect without preventing
existing applications (mirrors, Save As dialogs, etc.) from working.

Cheers,

 ...Roy T. Fielding
    Department of Information & Computer Science    (fielding@ics.uci.edu)
    University of California, Irvine, CA 92697-3425    fax:+1(714)824-4056
    http://www.ics.uci.edu/~fielding/
Received on Thursday, 20 February 1997 19:07:53 UTC