- From: Jeffrey Mogul <mogul@pa.dec.com>
- Date: Wed, 12 Aug 98 16:37:56 MDT
- To: HTTP Working Group <http-wg@hplb.hpl.hp.com>
    The wordage around application of transformations to entities [by proxies] seems to deal only with what headers can be changed (presumably interoperability issues are thought to only be possible in that case). However, imagine for a moment that my proxy applies a transformation that neither changes the Content-Type nor the Content-Encoding (for the sake of argument, call it the "ee cummings" transform: on text types only, lowercase and remove punctuation). Even for the example used in the spec, if one were to change the resolution of a medical image, it would be a potentially bad thing.

I agree, the normative wording seems to be insufficient, and not consistent with the rationale given just before it. In section 14.9.5 (No-Transform Directive), the rationale says:

    Serious operational problems occur, however, when these transformations are applied to entity bodies intended for certain kinds of applications. For example, applications for medical imaging, scientific data analysis and those using end-to-end authentication, all depend on receiving an entity body that is bit for bit identical to the original entity-body.

but the normative wording is:

    Therefore, if a message includes the no-transform directive, an intermediate cache or proxy MUST NOT change those headers that are listed in section 13.5.2 as being subject to the no-transform directive. This implies that the cache or proxy MUST NOT change any aspect of the entity-body that is specified by these headers.

with the 13.5.2 list of headers as:

    . Content-Encoding
    . Content-Range
    . Content-Type

For example, this means that a response that looks like this

    HTTP/1.1 200 OK
    Content-Type: image/gif
    cache-control: no-transform

could be modified so that the image is still a GIF file, but has had 99% of its bits removed. Not really consistent with the "bit for bit identical" criterion in the rationale!
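To make the gap concrete, here is a minimal Python sketch of the two readings. The `Response` class and both function names are hypothetical and appear nowhere in the spec; the header list is the 13.5.2 list quoted above, and truncating the body is only a stand-in for a lossy image transformation.

```python
from dataclasses import dataclass, field

# Headers that section 13.5.2 lists as subject to no-transform (as quoted above).
PROTECTED_HEADERS = ("content-encoding", "content-range", "content-type")


@dataclass
class Response:
    """Hypothetical, minimal stand-in for an HTTP response."""
    headers: dict = field(default_factory=dict)  # header names stored in lowercase
    body: bytes = b""


def literal_no_transform_ok(original: Response, forwarded: Response) -> bool:
    """Literal reading of the normative text: only the listed headers must be unchanged."""
    return all(
        original.headers.get(name) == forwarded.headers.get(name)
        for name in PROTECTED_HEADERS
    )


def bitwise_no_transform_ok(original: Response, forwarded: Response) -> bool:
    """Reading suggested by the rationale: the body must also be bit-for-bit identical."""
    return literal_no_transform_ok(original, forwarded) and original.body == forwarded.body


original = Response(
    headers={"content-type": "image/gif", "cache-control": "no-transform"},
    body=b"GIF89a" + bytes(1000),
)
# Stand-in for a degraded image: same headers, most of the body discarded.
# A real proxy would re-encode the image rather than truncate it.
degraded = Response(headers=dict(original.headers), body=original.body[:10])

print(literal_no_transform_ok(original, degraded))  # True
print(bitwise_no_transform_ok(original, degraded))  # False
```

Under the literal reading the degraded response passes, because none of the three listed headers changed; only the bit-for-bit comparison catches it.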
My recollection is that we fully intended "no-transform" to prevent such transformations, as well as the ones currently specified. Someone (maybe me) must have screwed up when it came to writing the normative wording.

    1. Content-MD5, for example, is an end-to-end header (which, according to 13.5.1, MUST be stored and forwarded, and according to 14.15 MUST NOT be generated by proxies). If I leave it in, it will be wrong. If I remove it, I run the risk of breaking an application. Gaaack.

Content-MD5 can't be changed or generated by proxies. But there is no requirement that a response with a Content-MD5 header be forwarded without transformation. I.e., the spec allows a proxy to transform a message with a Content-MD5 header in such a way that the Content-MD5 value no longer matches the contents.

One might argue that this is OK. I.e., if the Content-MD5 is wrong, then the recipient can assume that the message has been transformed (making a Warning superfluous).

One might instead argue that a response with both "no-transform" and a Content-MD5 shouldn't be transformed in such a way as to change the MD5 hash of the body. We could add a normative MUST NOT along these lines (say, to the definition of "no-transform", section 14.9.5):

    If a message contains the no-transform directive and also includes a Content-MD5 header, an intermediate cache or proxy MUST NOT change the value of the message body.

Alternatively, one could make a minor modification to the existing text:

    Therefore, if a message includes the no-transform directive, an intermediate cache or proxy MUST NOT change those headers that are listed in section 13.5.2 as being subject to the no-transform directive. This implies that the cache or proxy MUST NOT change any aspect of the entity-body that is specified by these headers, [or by the Content-MD5 header field, if present].

This means that one way to prevent a transformation would be to attach a Content-MD5 header field to the message.
However, the computational cost of doing this, while not excessive, is not negligible, and so it's kind of a kludge.

Another alternative would be to say:

    Therefore, if a message includes the no-transform directive, an intermediate cache or proxy MUST NOT change those headers that are listed in section 13.5.2 as being subject to the no-transform directive. This implies that the cache or proxy MUST NOT change any aspect of the entity-body that is specified by these headers, including the value of the entity-body itself.

which is what intuition suggests that "no-transform" ought to mean, and probably what we really meant to write.

Note that the utility of Content-MD5 is perhaps suspect, since it applies to the message and not to the underlying thing. So it totally breaks down when trying to reassemble something at an intermediate proxy cache out of byte-range responses, for example, or from the proposed delta-encoded responses. Which is why some of us have proposed adding headers for "instance digests":

    http://search.ietf.org/internet-drafts/draft-mogul-http-digest-00.txt

In order for HTTP/1.1 proxies to "do the right thing" if this kind of extension is adopted later, the meaning of "no-transform" ought to be independent of whether or not the proxy knows that a set of header names is somehow special.

Based on that, I would say that we should adopt the final correction that I proposed; not only is it the most intuitive interpretation of "no-transform", but it's also the most extensible.

    2. This would lead me to think that I should add a "Warning: 214" even if I do not change the Content-Type or Content-Encoding (and that perhaps the spec should be changed to require this).

Right.
I think the specification here should be changed from

    214 Transformation applied
    MUST be added by an intermediate cache or proxy if it applies any transformation changing the content-coding (as specified in the Content-Encoding header) or media-type (as specified in the Content-Type header) of the response, unless this Warning code already appears in the response.

to

    214 Transformation applied
    MUST be added by an intermediate cache or proxy if it applies any transformation changing the content-coding (as specified in the Content-Encoding header) or media-type (as specified in the Content-Type header), or the entity-body of the response, unless this Warning code already appears in the response.

since otherwise there is no way to know if a proxy has, for example, removed 99% of the bits from a GIF file.

-Jeff
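The proposed 214 rule could be sketched in Python roughly as follows. This is an illustration under stated assumptions, not text from the spec: the function names and the `proxy.example` agent are hypothetical, headers are modelled as a plain dict with lowercase names, and the check for an existing 214 code splits the Warning value on commas, which ignores commas inside quoted warn-text.

```python
def needs_warning_214(orig_headers, orig_body, new_headers, new_body):
    """Proposed rule: content-coding, media type, or entity-body changed."""
    return (
        orig_headers.get("content-encoding") != new_headers.get("content-encoding")
        or orig_headers.get("content-type") != new_headers.get("content-type")
        or orig_body != new_body
    )


def add_warning_214(orig_headers, orig_body, new_headers, new_body,
                    agent="proxy.example"):  # hypothetical warn-agent
    """Return a copy of new_headers, adding a 214 warning when the rule requires it."""
    result = dict(new_headers)
    existing = result.get("warning", "")
    # Simplification: a real parser must handle commas inside quoted strings.
    already_present = any(
        value.strip().startswith("214 ") for value in existing.split(",")
    )
    if needs_warning_214(orig_headers, orig_body, new_headers, new_body) \
            and not already_present:
        value = f'214 {agent} "Transformation applied"'
        result["warning"] = f"{existing}, {value}" if existing else value
    return result


headers = {"content-type": "image/gif"}
full_body = b"GIF89a" + bytes(1000)
# Same media type and content-coding, but most of the body removed.
out = add_warning_214(headers, full_body, headers, full_body[:10])
print(out["warning"])  # 214 proxy.example "Transformation applied"
```

Under the current wording, `needs_warning_214` would compare only the two headers, so the truncated GIF in the usage example would be forwarded with no warning at all.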
Received on Wednesday, 12 August 1998 16:39:50 UTC