Re: ISSUE: transformations

From: Jeffrey Mogul <mogul@pa.dec.com>
Date: Wed, 12 Aug 98 16:37:56 MDT
Message-Id: <9808122337.AA11651@acetes.pa.dec.com>
To: HTTP Working Group <http-wg@hplb.hpl.hp.com>

    The wordage around application of transformations to entities [by
    proxies] seems to deal only with what headers can be changed
    (presumably interoperability issues are thought to only be possible
    in that case).

    However, imagine for a moment that my proxy applies a
    transformation that neither changes the Content-Type nor the
    Content-Encoding (for the sake of argument, call it the "ee
    cummings" transform: on text types only, lowercase and remove
    punctuation).  Even for the example used in the spec, if one were
    to change the resolution of a medical image, it would be a
    potentially bad thing.
    
I agree, the normative wording seems to be insufficient, and
not consistent with the rationale given just before it.  In
section 14.9.5 (No-Transform Directive), the rationale says:

  Serious operational problems occur, however, when these
  transformations are applied to entity bodies intended for certain
  kinds of applications. For example, applications for medical imaging,
  scientific data analysis and those using end-to-end authentication,
  all depend on receiving an entity body that is bit for bit identical
  to the original entity-body.

but the normative wording is:

  Therefore, if a message includes the no-transform directive, an
  intermediate cache or proxy MUST NOT change those headers that are
  listed in section 13.5.2 as being subject to the no-transform
  directive. This implies that the cache or proxy MUST NOT change any
  aspect of the entity-body that is specified by these headers.

with the 13.5.2 list of headers as:
  .  Content-Encoding
  .  Content-Range
  .  Content-Type

For example, this means that a response that looks like this

	HTTP/1.1 200 OK
	Content-Type: image/gif
	cache-control: no-transform

could be modified so that the image is still a GIF file, but
has had 99% of its bits removed.  Not really consistent with
the "bit for bit identical" criterion in the rationale!
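
To make the gap concrete, here is a minimal Python sketch. The function names, the lowercase header dictionary, and the stand-in image bytes are all hypothetical illustrations, not anything taken from the spec. It checks a transformed response against the current header-only rule and against the "bit for bit identical" criterion.

```python
# Headers that section 13.5.2 lists as subject to no-transform.
PROTECTED_HEADERS = ("content-encoding", "content-range", "content-type")


def violates_header_rule(orig_headers, new_headers):
    """True if a header listed in 13.5.2 differs (the current normative test)."""
    return any(orig_headers.get(h) != new_headers.get(h) for h in PROTECTED_HEADERS)


def violates_bit_for_bit(orig_body, new_body):
    """True if the body is not bit-for-bit identical (the rationale's test)."""
    return orig_body != new_body


# Assumption: header names are stored lowercase in a plain dict.
headers = {"content-type": "image/gif", "cache-control": "no-transform"}

# Stand-in bytes only; this is not a valid GIF image.
original_body = b"GIF89a" + bytes(1000)
# Hypothetical lossy transform that drops about 99% of the bytes.
degraded_body = original_body[:10]

header_rule_broken = violates_header_rule(headers, dict(headers))
body_rule_broken = violates_bit_for_bit(original_body, degraded_body)
```

With the headers left untouched, the first check passes while the second fails, which is exactly the inconsistency described above.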

My recollection is that we fully intended "no-transform"
to prevent such transformations, as well as the ones
currently specified.  Someone (maybe me) must have screwed
up when it came to writing the normative wording.

    1. Content-MD5, for example, is an end-to-end header (which,
    according to 13.5.1, MUST be stored and forwarded, and according to
    14.15 MUST NOT be generated by proxies).  If I leave it in, it will
    be wrong.  If I remove it, I run the risk of breaking an
    application.  Gaaack.

Content-MD5 can't be changed or generated by proxies.  But there is no
requirement that a response with a Content-MD5 header be forwarded
without transformation.  I.e., the spec allows a proxy to transform a
message with a Content-MD5 header in such a way that the Content-MD5
value no longer matches the contents.

One might argue that this is OK.  I.e., if the Content-MD5 is wrong,
then the recipient can assume that the message has been transformed
(making a Warning superfluous).
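
A recipient-side check along these lines might look like the following Python sketch. The helper names are hypothetical. The one fact it relies on is that a Content-MD5 value is the base64 encoding of the 128-bit MD5 digest of the body.

```python
import base64
import hashlib


def content_md5(body: bytes) -> str:
    """Base64 of the 128-bit MD5 digest, the form a Content-MD5 value takes."""
    return base64.b64encode(hashlib.md5(body).digest()).decode("ascii")


def body_matches_content_md5(header_value: str, body: bytes) -> bool:
    """True if the received body still matches the Content-MD5 header value."""
    return header_value.strip() == content_md5(body)


# Value computed by the origin server over the original body.
original = b"Hello, World."
header_value = content_md5(original)

# Hypothetical proxy transform that leaves the header in place.
transformed = original.lower().replace(b",", b"").replace(b".", b"")

unchanged_ok = body_matches_content_md5(header_value, original)
transformed_ok = body_matches_content_md5(header_value, transformed)
```

A mismatch tells the recipient only that something changed the body. It cannot distinguish a deliberate transformation from corruption in transit.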

One might argue that a response with both "no-transform" and
a Content-MD5 shouldn't be transformed in such a way as to
change the MD5 hash of the body.  We could add a normative
MUST NOT along these lines (say, to the definition of
"no-transform", section 14.9.5):

    If a message contains the no-transform directive and also
    includes a Content-MD5 header, an intermediate cache or
    proxy MUST NOT change the value of the message body.

Alternatively, one could make a minor modification to the
existing text:

  Therefore, if a message includes the no-transform directive, an
  intermediate cache or proxy MUST NOT change those headers that are
  listed in section 13.5.2 as being subject to the no-transform
  directive. This implies that the cache or proxy MUST NOT change any
  aspect of the entity-body that is specified by these headers,
  [or by the Content-MD5 header field, if present].

This means that one way to prevent a transformation would be
to attach a Content-MD5 header field to the message.  However,
the computational cost of doing this, while not excessive, is
not negligible, and so it's kind of a kludge.

Another alternative would be to say:
	
  Therefore, if a message includes the no-transform directive, an
  intermediate cache or proxy MUST NOT change those headers that are
  listed in section 13.5.2 as being subject to the no-transform
  directive. This implies that the cache or proxy MUST NOT change any
  aspect of the entity-body that is specified by these headers,
  including the value of the entity-body itself.

which is what intuition suggests that "no-transform" ought to mean,
and probably what we really meant to write.

Note that the utility of Content-MD5 is perhaps suspect, since
it applies to the message and not to the underlying thing.  So
it totally breaks down when trying to reassemble something at
an intermediate proxy cache out of byte-range responses, for
example, or from the proposed delta-encoded responses.  Which
is why some of us have proposed adding headers for "instance digests":
   http://search.ietf.org/internet-drafts/draft-mogul-http-digest-00.txt
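
The byte-range problem can be illustrated with a short Python sketch. The names and the 100-byte stand-in instance are hypothetical, and it assumes each partial response carries a digest computed over its own body only.

```python
import base64
import hashlib


def md5_b64(data: bytes) -> str:
    """Base64-encoded MD5 digest, as used in a Content-MD5 value."""
    return base64.b64encode(hashlib.md5(data).digest()).decode("ascii")


# Stand-in for the full instance held by the origin server.
instance = bytes(range(100))

# Two byte-range responses, each carrying a digest of its own message body.
first_part, second_part = instance[:50], instance[50:]
first_digest = md5_b64(first_part)
second_digest = md5_b64(second_part)

# A proxy cache that reassembles the parts has no per-message digest that
# covers the result; only a digest of the whole instance would do that.
reassembled = first_part + second_part
instance_digest = md5_b64(instance)
```

Neither per-message digest equals the digest of the reassembled instance, so the cache has nothing to verify the whole against unless a digest of the instance itself is supplied.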

In order for HTTP/1.1 proxies to "do the right thing" if this
kind of extension is adopted later, the meaning of "no-transform"
ought to be independent of whether or not the proxy knows that
a set of header names is somehow special.  Based on that, I would
say that we should adopt the final correction that I proposed;
not only is it the most intuitive interpretation of "no-transform",
but it's also the most extensible.
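
Under that final wording, the proxy's decision needs no knowledge of any particular header name other than Cache-Control. A minimal Python sketch, with hypothetical function names and a simplified directive parser that assumes no quoted strings containing commas:

```python
def has_no_transform(headers: dict) -> bool:
    """True if the Cache-Control value contains the no-transform directive.

    Simplification: splits on commas and ignores quoted-string edge cases.
    Assumes header names are stored lowercase.
    """
    value = headers.get("cache-control", "")
    directives = [d.strip().lower() for d in value.split(",")]
    return "no-transform" in directives


def forward_body(headers: dict, body: bytes, transform) -> bytes:
    """Return the body to forward, skipping the transform under no-transform."""
    if has_no_transform(headers):
        # The body passes through untouched, whatever other headers are present.
        return body
    return transform(body)


def shrink(body: bytes) -> bytes:
    """Hypothetical lossy transform, for illustration only."""
    return body[: len(body) // 2]


protected = forward_body({"cache-control": "max-age=60, no-transform"}, b"abcdefgh", shrink)
unprotected = forward_body({"cache-control": "max-age=60"}, b"abcdefgh", shrink)
```

Because the check never consults Content-MD5 or any later digest header, a proxy written this way keeps doing the right thing if such extensions are adopted.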

    2. This would lead me to think that I should add a "Warning: 214"
    even if I do not change the Content-Type or Content-Encoding (and
    that perhaps the spec should be changed to require this).
    
Right.  I think the specification here should be changed from

    214 Transformation applied
      MUST be added by an intermediate cache or proxy if it applies any
      transformation changing the content-coding (as specified in the
      Content-Encoding header) or media-type (as specified in the Content-
      Type header) of the response, unless this Warning code already
      appears in the response.

to

    214 Transformation applied
      MUST be added by an intermediate cache or proxy if it applies any
      transformation changing the content-coding (as specified in the
      Content-Encoding header) or media-type (as specified in the
      Content-Type header), or the entity-body of the response, unless
      this Warning code already appears in the response.

since otherwise there is no way to know if a proxy has, for
example, removed 99% of the bits from a GIF file.
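
The revised rule could be sketched in Python as follows. The function name, the lowercase header dictionary, and the agent name "proxy.example" are hypothetical, and splitting the existing Warning value on commas is a simplification that ignores commas inside quoted warn-text.

```python
def add_warning_214(orig_headers, orig_body, new_headers, new_body,
                    agent="proxy.example"):
    """Return a copy of new_headers with Warning 214 added when required.

    Required when the content-coding, the media-type, or (per the proposed
    wording) the entity-body changed, unless a 214 warning is already there.
    """
    transformed = (
        orig_headers.get("content-encoding") != new_headers.get("content-encoding")
        or orig_headers.get("content-type") != new_headers.get("content-type")
        or orig_body != new_body
    )
    result = dict(new_headers)
    existing = result.get("warning", "")
    already_present = any(
        item.strip().startswith("214 ") for item in existing.split(",")
    )
    if transformed and not already_present:
        warning = f'214 {agent} "Transformation applied"'
        result["warning"] = f"{existing}, {warning}" if existing else warning
    return result


headers = {"content-type": "image/gif"}
# Body changed, headers untouched: only the proposed wording catches this.
after_shrink = add_warning_214(headers, b"abcdef", headers, b"abc")
# A second proxy sees the existing 214 and does not add another.
after_second = add_warning_214(headers, b"abcdef", after_shrink, b"abc")
# No change at all: no warning is added.
untouched = add_warning_214(headers, b"abc", headers, b"abc")
```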

-Jeff
Received on Wednesday, 12 August 1998 16:39:50 EDT