Caching Problem with 1.0 and a Proxy Server from Tim Coates on 2000-11-08 (ietf-http-wg@w3.org from October to December 2000)

From: Tim Coates <tcoates@dynamics.net>
Date: Wed, 8 Nov 2000 11:30:20 +1100
To: http-wg@cuckoo.hpl.hp.com
Message-ID: <NEBBIHMBBKBFHCINLBBGCEBJCCAA.tcoates@dynamics.net>

A small problem to overcome...

We already know the caching problems with 1.0 but....

Firstly, assume that no proxy server is used. A browser sends a request
using the HTTP 1.1 protocol. The response it receives from the web server is
also HTTP 1.1. This can be confirmed by viewing the logs produced by a packet
sniffer. The document received by the browser contains 1.1 headers for cache
control. The cache control instructions basically force the browser to do a
GET each time the user requests the page.

Now... add a proxy server. I had to modify the Windows registry so that the
browser would issue a 1.1 request to the web server (via the proxy). The
browser sends a 1.1 request, but receives a 1.0 response - this is from the
same web server, and I am using the same browser.

What appears to happen is that the proxy server downgrades the protocol
identifier (at least in the response back to the web browser) and forwards
the response containing the 1.1 cache control headers to the browser. The
browser now interprets the response using the 1.0 instructions and the end
result is that the page is cached. If I ask for the same page again (and not
using a forced GET) the page is retrieved from the cache, as no GET request
is captured by the packet sniffer.
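
To make the suspected proxy behaviour concrete, here is a minimal sketch in
Python. It only models what the packet sniffer appears to show: the version
token in the status line is rewritten and the headers are forwarded
untouched. The function name downgrade_status_line and the sample
"Cache-Control: no-cache" header are assumptions made for illustration; the
actual headers sent by the web server and the internals of the proxy are not
known.

```python
def downgrade_status_line(raw_response: bytes) -> bytes:
    """Rewrite only the version token of the status line (hypothetical model).

    Everything after the status line, including any 1.1 cache control
    headers, is passed through unchanged.
    """
    status_line, sep, rest = raw_response.partition(b"\r\n")
    version, space, remainder = status_line.partition(b" ")
    if version == b"HTTP/1.1":
        version = b"HTTP/1.0"
    return version + space + remainder + sep + rest


# Assumed example of what the web server sends; the real headers may differ.
origin_response = (
    b"HTTP/1.1 200 OK\r\n"
    b"Cache-Control: no-cache\r\n"
    b"Content-Length: 0\r\n"
    b"\r\n"
)

proxied = downgrade_status_line(origin_response)
print(proxied.decode("ascii"))
```

The printed response carries an HTTP/1.0 status line together with a 1.1
cache control header, which is the combination the browser then has to
interpret.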

I have a different browser that, upon receipt of a response that indicates
1.0 but contains 1.1 headers, seems to disregard the protocol
identifier in the response and processes each header instruction received.
The end result is that even though the protocol identifier says it is 1.0,
when a GET is issued, the page is requested from the web server.
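
The difference between the two browsers can be sketched as two policies for
the same response. This is only a model of the observed behaviour, not a
description of how either browser is implemented. The names
parse_response_head and must_refetch_each_time, the trust_version flag, and
the sample "Cache-Control: no-cache" header are all assumptions introduced
for the example.

```python
def parse_response_head(head: str):
    """Split a response head into its version token and a header dict."""
    lines = head.split("\r\n")
    version = lines[0].split(" ", 1)[0]
    headers = {}
    for line in lines[1:]:
        if not line:
            break  # blank line ends the header section
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, headers


def must_refetch_each_time(head: str, trust_version: bool) -> bool:
    """Return True if the client would issue a GET on every request.

    trust_version=True models the first browser: a 1.0 status line makes it
    ignore the 1.1 Cache-Control header (assumed behaviour).
    trust_version=False models the second browser: headers are processed
    regardless of the version in the status line (assumed behaviour).
    """
    version, headers = parse_response_head(head)
    directives = {
        d.strip().lower()
        for d in headers.get("cache-control", "").split(",")
        if d.strip()
    }
    forbids_reuse = bool(directives & {"no-cache", "no-store"})
    if trust_version and version == "HTTP/1.0":
        return False
    return forbids_reuse


# Assumed response as seen behind the proxy: 1.0 status line, 1.1 header.
downgraded = "HTTP/1.0 200 OK\r\nCache-Control: no-cache\r\n\r\n"

first_browser = must_refetch_each_time(downgraded, trust_version=True)
second_browser = must_refetch_each_time(downgraded, trust_version=False)
```

Under these assumptions the first policy serves the page from cache and the
second one goes back to the web server, which matches what the packet
sniffer shows for the two browsers.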

Note that in the examples given above the same web page was requested from
the same web server. We know that a proxy server downgrades the response,
but that also affects the way the response is handled by the browser.

Questions:

1. What meaning (other than the obvious) is attached to the protocol
identifier (e.g. HTTP/1.0) that is contained in a browser request or
response?

2. How strictly should a web browser use the protocol identifier in
processing a response from a web server?

Thanks,
Tim C.

Received on Tuesday, 7 November 2000 17:35:10 UTC