An updated Request-id proposal

To write the text below, I took proposal I. from
<URL:http://www.w3.org/hypertext/WWW/Protocols/demographics.html>, and
changed things according to issues (mainly connected to privacy
and caching) discussed in the Session-id threads on www-talk.

The text below can be seen as a personal summary of the parts of
these threads that pertain to Request-IDs and privacy.

------snip----

The Request-ID: header field.

   Adapted from the proposal in
   <URL:http://www.w3.org/hypertext/WWW/Protocols/demographics.html>.

   An HTTP request may include a header field of the form:

        Request-ID: $session $request++

   e.g.

        Request-ID: 342%33a4d443 12
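
   As an illustration only, a receiving server might split the field
   value into its two parts as sketched below.  The function name
   parse_request_id is hypothetical, and the sketch assumes that the
   two tokens are separated by whitespace and that the request number
   is written in decimal; the proposal does not spell out either
   detail.

```python
import re

# Assumed shape: one non-blank session token, whitespace, a decimal number.
_REQUEST_ID = re.compile(r"(\S+)\s+([0-9]+)")


def parse_request_id(value):
    """Return (session, request_number), or None if the value is malformed."""
    match = _REQUEST_ID.fullmatch(value.strip())
    if match is None:
        return None
    return match.group(1), int(match.group(2))
```

   A log analyser could then group log entries by the first element
   and order them by the second.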

   The HTTP client chooses a random string as a "session identifier",
   and each request in that session is identified by a number that
   increases monotonically with time.

   It is suggested that clients use a different random $session string
   for each server they talk to.  This will make it more difficult for
   cooperating web service providers to match clicktrails in their
   logfiles, thereby getting user profiling information that is much
   more accurate than the user would want to give them without some
   form of compensation.  Note that it is illegal to match logfiles
   under the privacy laws in some countries.  The suggestion to use
   different $session strings can be seen as supporting these laws by
   making the crime of matching logfiles pay off less.
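
   A minimal client-side sketch of the per-server suggestion follows.
   The class name RequestIdGenerator, the 16-hex-digit session length,
   and the choice of one counter per server are all assumptions made
   for the example; the proposal leaves them to the client.

```python
import secrets


class RequestIdGenerator:
    """Keeps a separate random session string and counter for each server."""

    def __init__(self):
        # host -> [session string, number of the last request sent]
        self._state = {}

    def header_for(self, host):
        if host not in self._state:
            # A fresh random string per server, so that logfiles from
            # different servers cannot be matched on the session string.
            self._state[host] = [secrets.token_hex(8), 0]
        state = self._state[host]
        state[1] += 1
        return f"Request-ID: {state[0]} {state[1]}"
```

   Two requests to the same host share a session string and carry
   increasing numbers, while requests to another host use an
   unrelated string.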

   A "session" is not formally defined (other than "a set of requests
   with the same $session id"), though I suggest that browsers begin a
   new session when they are invoked and whenever they have been idle
   for 30 minutes or more, and offer some user interface to say "start
   a new session" (i.e. "choose a new random session ID").
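
   The suggested session lifetime could be sketched as follows.  The
   class name Session, the injectable clock parameter, and the reset
   of the request counter at the start of each session are assumptions
   of the example, not part of the proposal.

```python
import secrets
import time

IDLE_LIMIT_SECONDS = 30 * 60  # the suggested 30 minutes


class Session:
    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self.new_session()  # a session begins when the browser is invoked

    def new_session(self):
        """Also the hook for a "start a new session" user interface."""
        self.session_id = secrets.token_hex(8)
        self.request_number = 0
        self._last_used = self._clock()

    def next_request_id(self):
        now = self._clock()
        if now - self._last_used >= IDLE_LIMIT_SECONDS:
            self.new_session()  # idle for too long: pick a new random ID
        self._last_used = now
        self.request_number += 1
        return f"{self.session_id} {self.request_number}"
```

   A monotonic clock is used here so that changes to the wall-clock
   time do not start or prolong a session by accident.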

   Each user agent must provide a mechanism to turn the generation of
   Request-IDs off, especially for site security administrators who
   prohibit its use.

   If no Request-ID headers are present, this should be interpreted by
   web service providers as a statement that the user does not wish to
   reveal his or her exact clicktrail for privacy reasons.  An attempt
   by service providers to silently obtain the clicktrail by some
   other means (for example by using a session-id, cookie, or
   anonymous authentication mechanism that could be part of future
   versions of HTTP), should be considered to violate the privacy
   wishes of the user.

   Whether HTTP clients use a global $request counter, or one counter
   for each server talked to, is up to the clients.  HTTP clients
   which are not traditional user agents (e.g. multi-threaded robots)
   may use several sessions in parallel.

   A proxy must pass the Request-ID: header through unmodified. One might
   consider some sort of Proxy-Request-ID, though I doubt it would be
   valuable.

   An HTTP cache can assume that the response to an HTTP request does
   _not_ vary as a function of the Request-ID.  That is, an HTTP proxy
   need not include the Request-ID in its "cache key."  If the
   response to a request can vary, an Expires header should be used in
   the response to reflect this dynamism.
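
   The cache-key rule can be illustrated with a toy cache.  TinyCache
   and its fetch callback are hypothetical names, and the sketch
   leaves out expiry handling altogether; it only shows that the
   Request-ID plays no part in the lookup.

```python
class TinyCache:
    """Toy proxy cache; fetch(method, url, headers) contacts the origin."""

    def __init__(self, fetch):
        self._fetch = fetch
        self._store = {}

    def get(self, method, url, headers):
        # The key deliberately leaves out the Request-ID header.
        key = (method.upper(), url)
        if key not in self._store:
            # On a miss the headers, Request-ID included, are passed
            # through to the origin unmodified.
            self._store[key] = self._fetch(method, url, headers)
        return self._store[key]
```

   Two requests for the same URL that differ only in their Request-ID
   are served from the same cache entry, so the origin server sees
   only the first of them.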

   It is preferred that the Request-ID header is _not_ used to
   implement stateful dialogs, in which the content of pages is
   different for different sessions.  For stateful dialog support,
   other mechanisms (for example a session-id, cookie, or anonymous
   authentication mechanism that could be part of future versions of
   HTTP) should be used.


Alternative proposal:

  Instead of introducing a new Request-ID: header, include the 

     $session $request++

  information in the From: header.  Examples:

   From: (#342%33a4d443 12)

   From: "Roy T. Fielding" <fielding@beach.w3.org> (#342%33a4d443 12)
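
  A sketch of how a server could recover the identifier from such a
  From: value is given below.  The function name and the assumption
  that the "(#session request)" comment is always the last thing in
  the field are mine; the examples above suggest that layout but do
  not require it.

```python
import re

# Assumed layout: the field value ends with "(#<session> <number>)".
_ID_COMMENT = re.compile(r"\(#(\S+)\s+([0-9]+)\)\s*$")


def request_id_from_from_header(value):
    """Return (session, request_number), or None if no such comment is found."""
    match = _ID_COMMENT.search(value)
    if match is None:
        return None
    return match.group(1), int(match.group(2))
```

  Both example values above yield the session string 342%33a4d443
  and the request number 12; a From: value carrying only an address
  yields None.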


---snip---

Koen.

Received on Friday, 28 July 1995 17:46:02 UTC