Re: Non-persistent Cookie proposal

koen@win.tue.nl (Koen Holtman) wrote:
  > [... concerned about caching issues with State-Info (Session-ID) ...]
  > 
  > Non-persistent Cookie proposal
  > ==============================
  > 
  >                                         Koen Holtman, koen@win.tue.nl
  > 
  > This document can be seen as a personal summary of some parts of the
  > session-id discussion on www-talk in July 1995.  [2] contains a
  > summary of privacy issues discussed.  
  > 
  > I do not want to argue that the proposal below is the only way to go
  > when putting support for stateful dialogs in http.
  > 
  > Simpler mechanisms are possible, for example [3].  The `solution
  > space' for stateful dialogs has a number of dimensions:
  > 
  >  - simplicity of implementation
  >  - time of general availability when standardized
  >  - downward compatibility
  >  - simplicity of use
  >  - reliability
  >  - amount of privacy protection
  >  - maximum complexity of stateful dialogs supported
  >  - amount of cache control possible
  >  - risks when used with non-conforming caches
  >  - amount of confusion on www-talk generated
This is a great list!  Mind if I steal it for my proposal?
  > 
  > The proposal below occupies one point in this solution space.
  > 
  > The main goal of the text below is to make all dimensions visible, not
  > to give a personal opinion on what the best point in the space is.  In
  > fact, my personal opinion on this has changed a lot over the last
  > month, and I have not reached a final conclusion yet.
  > 
  > 
  > ** Background
  > 
  > This document assumes that you have read [1] and [2], and that you are
  > familiar with [4].
  > 
  > [1] the NetScape Cookie proposal
  > <URL:http://home.mcom.com/newsref/std/cookie_spec.html>.
  > 
  > [2] the proposals for gathering consumer demographics
  > <URL:http://www.w3.org/hypertext/WWW/Protocols/demographics.html> by
  > Daniel W. Connolly.
  > 
  > [3] Session-ID (State-Info) proposal by Dave Kristol
  > <URL:http://www.research.att.com/~dmk/session.html>
  > 
  > [4] The draft HTTP/1.0 specification
  > <URL:http://www.ics.uci.edu/pub/ietf/http/>
  > 
  > 
  > ** Terminology
  > 
  > Most terminology in this document is as in [4].  New terminology:
  > 
  > Browser: user agent (used for interactive web sessions).
  > 
  > Stateful dialog: An information exchange between a web user and a web
  > server that extends beyond the submission of one form.  In a
  > stateful dialog, the server changes its behavior as a result of
  > previous actions by the user.  Conceptually, the state is a property
  > of the dialog (or session), not of the browser or server.  On the
  > implementation side, either the browser, the server, or both hold the
  > state.
You failed to use the [4] terminology above and below.  In particular I
think you mean "origin server" where you say "server" and "web server".
  > 
  > Cookie: a string to be sent by a browser to a server in requests,
  > representing the state in a stateful dialog.  The string is kept at
  > the browser side, but supplied and updated by the server using
  > set-cookie response headers.  Only the server needs to be able to
  > decode the cookie.  See [1] for an example cookie definition.  A
Actually, Netscape's cookies are interpreted by the browser to the extent
that it returns them selectively to the origin server, based on request
URL.  It also makes them expire.  Those two points are different from my
proposal, where the browser does not peek inside.
  > cookie can represent the entire dialog state directly, or be a key in
  > a server-side dialog state database.
  > 
  > Request-id: a unique value sent by a browser to a server in requests
  > to allow the server to generate better clicktrail statistics. 
  > 
  > Session-id: a unique identifier sent by a browser to a server in
  > requests to give the server a way of keeping it apart from other
  > browsers for the purpose of engaging in stateful dialogs.  Unlike a
  > cookie, the value of a session-id can never change during a stateful 
  > dialog.
If you're referring to my State-Info (formerly Session-ID) proposal, this
is a misunderstanding.  The server is free to change the State-Info in
responses.
  > 
  > Persistent information: information that is remembered by the browser
  > for future sessions
  > 
  > Non-persistent information: information that is lost when the browser
  > exits
Perhaps you need to add words about what happens when one window of a
multi-window browser exits.
  > 
  > 
  > 
  > Proposal:  Non-persistent cookies
  > =================================
  > 
  > ** 1. The set-cookie response header
  > 
  > A server may choose to send a set-cookie response header in a response 
  > to a browser.  The header looks like
  > 
  >    Set-Cookie: <string>
  > 
  > where <string> does not contain whitespace. (This allows for future 
  > extensions).  Examples:
  > 
  >    Set-Cookie: color_screen;no_jpgs;small_graphics
  >    Set-Cookie: Joe%20User;joe@foo.com
  >    Set-Cookie: id324254
  >    Set-Cookie: joe:4234982343
  > 
  > If the browser receives a Set-Cookie response header, it can either
  > honor it or ignore it.
  > 
  > ** 2. Honoring a Set-Cookie header
  > 
  > To honor a set-cookie header received from a server, a browser will 
  > start including a header
  > 
  >    Cookie: <string>
  > 
  > in _direct_ requests to that server, and to that server only.  For the
  > purpose of this definition, individual servers are identified by the
  > hostname+portnumber pair in the request URL.  Directness of requests
  > will be defined later.
  > 
  > The cookie <string> is taken from the Set-cookie header, and may be
  > changed by the server by sending a new Set-cookie header with a new
  > string.  A Set-cookie header with an empty string can be taken as a
  > request to stop sending Cookie headers.
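A minimal sketch of the browser-side bookkeeping that section 2 describes,
for illustration only.  The names CookieJar, server_key and headers_for are
hypothetical, as is the choice of port 80 when the URL names no port; the
proposal itself only says that a server is identified by the
hostname+portnumber pair and that an empty string stops the Cookie header.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}  # assumption: fall back to scheme default


def server_key(url):
    """Identify a server by its (hostname, port) pair, as in section 2."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme, 80)
    return (parts.hostname, port)


class CookieJar:
    """Holds at most one non-persistent cookie string per server."""

    def __init__(self):
        self._cookies = {}  # (hostname, port) -> cookie string

    def on_response(self, url, headers):
        """Record, replace, or drop the cookie named by a Set-Cookie header."""
        if "Set-Cookie" not in headers:
            return
        value = headers["Set-Cookie"].strip()
        key = server_key(url)
        if value:
            self._cookies[key] = value      # new or updated cookie
        else:
            self._cookies.pop(key, None)    # empty string: stop sending

    def headers_for(self, url, direct=True):
        """Return the Cookie header for a request, for direct requests only."""
        value = self._cookies.get(server_key(url))
        if direct and value is not None:
            return {"Cookie": value}
        return {}
```

Because the jar is keyed on the pair rather than on the hostname alone, a
cookie set by shop.example:8080 is not returned to shop.example on port 80.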
  > 
  > 
  > ** 3. The decision to honor a Set-Cookie header
  > 
  > If a Set-Cookie response header is honored, this means that a server
  > can get more accurate statistics about the behavior of the browser
But the purpose of cookies is much more than just allowing an origin
server to gather statistics.
  > user if the browser user is behind a firewall or proxy cache.  Thus,
  > the user should have the option of deciding not to honor Set-cookie
  > headers for privacy reasons.
  > 
  > It is suggested that browsers provide something like the following
  > preferences box:
  > 
  >    +-----------------------------------------------------------------+
  >     Honor set-cookie requests:
  >         ( ) Always honor request
  >         ( ) Start honoring requests if one is done in a response to
  >             a form submission (POST request).
  >         (*) Ask once for every site, use reply in later sessions
  >         ( ) Never honor requests
  >    +-----------------------------------------------------------------+
  > 
  > where the (*) is the default setting.  In real browsers, the
  > terminology used in this box should probably be adapted to make more
  > sense to the average user.
  > 
  > As a matter of etiquette, services should only send Set-Cookie headers
  > if honoring them would bring some direct advantage to the user, like
  > being able to use a stateful dialog.  Sending Set-Cookie headers only
  > to get better clicktrail statistics should be considered a breach of
  > etiquette.  To get clicktrail statistics, the Request-id header (see
So cookies are not for statistics!?
  > the appendix below or [2]) should be used.
  > 
  > If a browser decides to honor one Set-Cookie header from a server, the
  > service author can expect that Set-Cookie headers sent in the near
  > future will also be honored.
  > 
  > A browser can contain a timeout mechanism to stop honoring Set-cookie
  > headers when, say, 30 minutes have passed since the last contact with
  > the server.
I think you mean that the browser will flush remembered cookies for an
origin server at some chosen time period after the last contact with
it.  I assume the browser would honor new Set-cookie's if it contacted
the origin server again after that period.
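That reading of the timeout can be sketched as follows.  This is only an
illustration under assumptions: the class name ExpiringCookieStore, the
injectable clock, and the rule that every lookup counts as a contact are
hypothetical and appear in neither proposal; the 30 minute figure is the
example value from the quoted text.

```python
import time


class ExpiringCookieStore:
    """Forgets a server's cookie once the server has been idle too long."""

    def __init__(self, idle_limit=30 * 60, clock=time.monotonic):
        self._idle_limit = idle_limit   # seconds of allowed idleness
        self._clock = clock             # injectable so the logic can be tested
        self._entries = {}              # server -> (cookie, time of last contact)

    def remember(self, server, cookie):
        """Honor a Set-Cookie header from `server`."""
        self._entries[server] = (cookie, self._clock())

    def cookie_for(self, server):
        """Return the cookie to send, or None if it was flushed or never set."""
        entry = self._entries.get(server)
        if entry is None:
            return None
        cookie, last_contact = entry
        now = self._clock()
        if now - last_contact > self._idle_limit:
            del self._entries[server]   # flush after the idle period
            return None
        self._entries[server] = (cookie, now)  # this request is a new contact
        return cookie
```

Nothing here blocks a later remember() for the same server, so a new
Set-cookie received after the idle period is honored again.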
  > 
  > 
  > ** 4. Direct and Indirect Requests
  > 
  > A request is indirect if it
  >  1) fetches the contents of an inline picture or other inlined object
  >  2) resolves a 3xx (redirection) response
  > 
  > All other requests are direct.
  > 
  > If a browser is to honor a Set-Cookie header, Cookie headers must be
  > sent in _direct_ requests to the server.  It is preferred that Cookie
  > headers are never sent in indirect requests to a server.
  > 
  > The distinction between direct and indirect requests is made for two
  > reasons:
  > 
  >  a) this allows for better caching of stateful dialogs, as discussed
  >     below
  >  b) this allows for better privacy: it makes impossible some
  >     `stealthy' cookie matching strategies that could be adopted by
  >     cooperating web service providers to allow matching clicktrails.
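The direct/indirect rule of section 4 can be sketched as a small
classification step.  The Cause labels and the function names below are
hypothetical; the proposal defines only the two indirect cases and states
that all other requests are direct.

```python
from enum import Enum


class Cause(Enum):
    """Why the browser is issuing a request (illustrative labels only)."""
    LINK_FOLLOWED = "link"
    FORM_SUBMITTED = "form"
    INLINE_OBJECT = "inline"    # case 1: inline picture or other inlined object
    REDIRECT = "redirect"       # case 2: resolving a 3xx response


INDIRECT_CAUSES = {Cause.INLINE_OBJECT, Cause.REDIRECT}


def is_direct(cause):
    """All requests are direct except the two indirect cases."""
    return cause not in INDIRECT_CAUSES


def should_send_cookie(cause, honoring):
    """Send a Cookie header only when honoring and the request is direct."""
    return honoring and is_direct(cause)
```

This sketch takes the stricter reading of the text, in which Cookie headers
are never sent in indirect requests; the proposal only states that as a
preference.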
  > 
  > 
  > ** 5. Cookie headers and caching: the default
  > 
  > By default, a cache, no matter whether it is in a browser or in a
  > proxy, must never cache responses to requests with Cookie headers in
  > them.
  > 
  > Responses which contain Set-Cookie headers must also never be cached,
  > no matter whether the request itself contained a Cookie header.
  > 
  > There are three reasons for this default:
  > 
  >  1) Responses in stateful dialogs are dynamic by nature.  No big
  >     payoff can be expected from caching them.  In fact, in a scheme
  >     where they can be cached, there is a danger of cache memories
  >     dropping useful `static' responses (like inline pictures from
  >     normal sites) to store relatively useless dynamic responses.

This seems to be unduly restrictive.  In recent messages on this list I
have described how a shopping application might consist of pages that
describe a product, and that have a link to "Show me my current
shopping basket." The product description could benefit from caching,
even though the request might have contained a Cookie (State-Info in my
proposal) header.
  > 
  >  2) It is vital that responses in stateful dialogs are never cached,
  >     else services using stateful dialogs would become unreliable.
  > 
  >     An Expires: <yesterday> header in a response can be used to
  >     disable caching (in both browsers and proxy caches), but some
  >     system operators may be tempted to `tune' proxy or browser caches
  >     under their control to keep expired responses around for, say, 5
  >     minutes, even though this makes the cache non-conformant to the
  >     http spec.
  > 
  >     Such `tuning' is relatively harmless for normal, non-interactive
  >     services, but disastrous for stateful dialog reliability.  As
  >     stateful dialogs are still relatively uncommon, it is a valid
  >     assumption that many system operators are not aware of the
  >     stateful dialog risks involved with `Expires tuning', so
  >     non-conforming caches may remain with us for some time to come.
  > 
  >     By making responses in stateful dialogs using the Cookie headers
  >     a special case in the cache algorithm, independent of `Expires
  >     tuning', this particular non-conformance risk to stateful dialogs
  >     is eliminated.

It seems to me it will take at least as long to deploy caching proxies
that understand (and don't cache) requests/responses that use
Cookie/Set-cookie as it will to fix badly tuned caching proxies.
  > 
  >     (Note: the Session-ID (State-Info) proposal by Dave Kristol [3]
  >     assumes that system operators will never do such `tuning': this
  >     allows proposal [3] to be very much simpler than this cookie
  >     proposal.)
  > 
  >  3) The developer of a stateful dialog service will usually not have a
  >     cache between his browser and the CGI scripts under development.
  >     If caching were enabled by default on requests with Cookie
  >     headers, the author would have to remember to put Expires:
  >     <yesterday> headers in the responses generated by the scripts; if
  >     a forgetful or badly informed author of a stateful dialog service
  >     would not do this, the resulting unreliability would not show up
  >     on tests within the local environment.

One hopes such a developer would also beta test the application in a more
realistic environment.
  > 
  > 
  > This default behavior of _not_ caching responses to requests with
  > Cookie headers is the main reason why indirect requests, which do not
  > get Cookie headers, were introduced.  An inline picture request is
  > indirect, so inline pictures on stateful dialog pages will get cached
  > by default.
  > 
  > Of course, if caching of an inline picture is not desirable, the
  > service author can always put an Expires header in the inline picture
  > response.
  > 
  > If a page contains pictures that depend on the dialog state, the
  > service author can implement these state dependent pictures by making
  > the generation of the URLs in the <IMG SRC=...> tags depend on the
  > state.  Similar state-dependent URL generation can be done for
  > redirection (3xx) codes.
  > 
  > 
  > ** 6. Cookie headers and caching: overriding the default
  > 
  > If the response to a request on URL U with a Cookie header contains an
  > 
  >   Expires: <date>
  > 
  > header, and no Set-Cookie header, a cache can interpret this to mean
  > two things:
I thought a caching proxy was obliged never to cache a response to a
request that carried a Cookie header (sect. 5).
  > 
  >  1) the entity included may be cached, but not beyond the
  >     <date> given,
Before you didn't trust caching proxy operators to treat Expires
correctly.  Now you do?
  >  2) the entity in the response does *not* depend on the dialog state.
  >     Thus, if the entity is cached and some browser does a request on
  >     URL U, it is OK to serve the cached entity, no matter what the
  >     cookie header value in the request is, and regardless of whether a
  >     cookie header is present in the request at all.
  > 
  > Note that it makes no sense for servers to send both a Set-Cookie
  > header and an Expires header in the same response.  Servers may
  > however choose to do so to get backwards compatibility with old
  > proxy caches.
  > 
  > Also note that it makes no sense to put Expires: <yesterday> headers
  > in responses to requests which contain Cookie headers.  Servers may
  > however choose to do so to get backwards compatibility with old proxy
  > caches.
  > 
  > This way of defining Expires: semantics ensures that caches need never
  > consider the cookie header when accessing their cache memory.  The
  > Cookie header is never part of the cache key, the header is only
  > important when making the decision whether to cache or not.
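The cache rules of sections 5 and 6 taken together can be sketched as
below.  This is an illustration under assumptions, not a definitive
implementation: the class CookieAwareCache and its methods are hypothetical,
headers are plain dicts with exact-case names, and responses to requests
without a Cookie header are simply stored, where a real cache would apply
the ordinary rules of [4].

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


class CookieAwareCache:
    """Cookie headers affect whether to store, never how to look up."""

    def __init__(self):
        self._entries = {}  # URL -> (expiry datetime or None, entity)

    def store(self, url, request_headers, response_headers, entity):
        """Store the entity if sections 5 and 6 allow it; report the decision."""
        if "Set-Cookie" in response_headers:
            return False                      # section 5: never cached
        expires = response_headers.get("Expires")
        if "Cookie" in request_headers and expires is None:
            return False                      # section 5 default
        # Section 6: an Expires header marks the entity as state-independent.
        expires_at = parsedate_to_datetime(expires) if expires else None
        self._entries[url] = (expires_at, entity)
        return True

    def lookup(self, url, now=None):
        """Look up by URL only; the request's Cookie header is not consulted."""
        now = now or datetime.now(timezone.utc)
        entry = self._entries.get(url)
        if entry is None:
            return None
        expires_at, entity = entry
        if expires_at is not None and now >= expires_at:
            del self._entries[url]            # not served beyond <date>
            return None
        return entity
```

An Expires: <yesterday> header on a response to a request with a Cookie
header leads to an entry that is already expired at the first lookup, so the
backwards-compatible usage mentioned in section 6 is never served from this
cache.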

  > [ Request-ID proposal omitted ]

Dave Kristol

Received on Friday, 11 August 1995 13:40:31 UTC