CT Guidelines and Cache-Control: no-transform

In the current draft of the CT Guidelines, we have the following
paragraphs at the end of section 4.4:

------
If the response includes a Cache-Control: no-transform directive then
the response *must* remain unaltered other than to comply with
transparent HTTP behavior and other than as noted below.

If the proxy determines that the resource as currently represented is
likely to cause serious mis-operation of the user agent then it *may*,
with the users explicit prior consent, warn the user and provide links
to both transformed and unaltered versions of the resource.
------
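
For concreteness, here is roughly how I imagine a proxy detecting the directive. This is only a Python sketch under my own assumptions: the function name and the list-of-pairs header representation are made up for illustration and are not from the Guidelines. It also splits on commas naively, so a quoted directive value that itself contains a comma would confuse it.

```python
def has_no_transform(headers):
    """Return True if any Cache-Control header carries no-transform.

    `headers` is assumed to be a list of (name, value) pairs, since a
    response may contain more than one Cache-Control header.
    """
    for name, value in headers:
        if name.lower() != "cache-control":
            continue
        for directive in value.split(","):
            # Directive names are case-insensitive; drop any "=value" part.
            token = directive.split("=", 1)[0].strip().lower()
            if token == "no-transform":
                return True
    return False
```

A proxy following the draft text above would leave the body alone whenever this returns True.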

I wonder if this might be too strict.  Here are a few reasons:

1) A web page that is too large for the memory of the device could cause
it to crash or hang.  Segmentation into smaller pages would be a useful
transformation in this case.  This may already be covered by the second
paragraph; if so, an example there would make that clearer.

2) CT proxy operators frequently like to add headers and/or footers to
web pages that contain useful tools such as a link to the home page, a
link to bookmarks, a "Go To URL" box, etc.  If some pages had headers
and/or footers and some pages did not, this could be confusing and
frustrating to users.

3) With a CT proxy that does URL rewriting (as opposed to a CT proxy
that is set up as an HTTP proxy), once a user goes to a page that is not
transformed (i.e., does not have its URLs rewritten), it may not be
obvious to the user how to return to using the CT proxy.  The links on
the page no longer point to the CT proxy.

4) There could also be a billing issue with a URL-rewriting CT proxy
that does not transform a page.  Frequently, operators of CT proxies
bill differently for transactions that go through the CT proxy and
transactions that do not (billing is based on domain).  An example would
be a flat monthly fee for transactions that go through the CT proxy and
billing by the KB for transactions that do not go through the CT proxy.
Once a user receives a page that doesn't have rewritten URLs, he or she
could incur larger data charges for Web usage without realizing it.

5) One could also envision some security advantages to allowing URLs to
be rewritten even for "no-transform" pages.  For example, phishing pages
or viruses could be detected by the CT proxy and blocked.

6) Sometimes images or other media are marked no-transform.  If the
client can't handle the image format delivered by the content server
(the content server would have to ignore the Accept header in this
case), it may make more sense to transform the image into an understood
format instead of delivering the unsupported image to the client.
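
To illustrate point 6, the check a proxy would need before deciding that an image is unusable could look something like the sketch below. The function name is hypothetical, and the matching is simplified: it honours q-values and prefers the most specific matching media range, but it ignores media type parameters other than q.

```python
def client_accepts(accept_header, content_type):
    """Return True if content_type is acceptable under the Accept header."""
    wanted = content_type.split(";", 1)[0].strip().lower()
    main_type = wanted.split("/", 1)[0]
    if not accept_header.strip():
        return True  # no Accept header: any media type is acceptable

    best_rank, best_q = 0, 0.0
    for item in accept_header.split(","):
        parts = [p.strip().lower() for p in item.split(";")]
        media_range = parts[0]
        q = 1.0
        for param in parts[1:]:
            if param.startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0  # treat an unparsable q-value as "not acceptable"
        # Rank the match by specificity: exact type, then type/*, then */*.
        if media_range == wanted:
            rank = 3
        elif media_range == main_type + "/*":
            rank = 2
        elif media_range == "*/*":
            rank = 1
        else:
            continue
        if rank > best_rank:
            best_rank, best_q = rank, q
    return best_q > 0
```

If this returns False for the Content-Type the server actually sent, the proxy is in exactly the situation described in point 6.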

Obviously, if it were decided that a no-transform page should be
transformed, the goal would be to keep the formatting and visible
content of the page identical to the original page (between the headers
and/or footers).  You'd also probably want to get the user's permission
to do this.
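
As a rough idea of what "rewrite the URLs but leave everything else alone" could mean in practice, here is a Python sketch. The proxy prefix is a made-up example address, and the regular expression only handles double-quoted absolute http/https links in anchor tags, so a real proxy would want a proper HTML parser.

```python
import re
from urllib.parse import quote

# Hypothetical address of the URL-rewriting CT proxy.
PROXY_PREFIX = "http://ct-proxy.example/fetch?url="

# Matches <a ... href="http(s)://..."> and captures the URL separately.
_HREF = re.compile(r'(<a\b[^>]*?\bhref=")(https?://[^"]+)(")', re.IGNORECASE)


def rewrite_links(html):
    """Point absolute links at the proxy; leave all other markup untouched."""
    def repl(match):
        target = quote(match.group(2), safe="")
        return match.group(1) + PROXY_PREFIX + target + match.group(3)
    return _HREF.sub(repl, html)
```

Text, images and layout are not touched here; only the link targets change, so the visible content stays the same as in the original page.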


Sean

Received on Friday, 23 May 2008 21:07:24 UTC