- From: Koen Holtman <koen@win.tue.nl>
- Date: Fri, 28 Jul 1995 23:45:46 +0200 (MET DST)
- To: www-talk@www10.w3.org, connolly@w3.org
- Cc: koen@win.tue.nl (Koen Holtman)
To write the text below, I took proposal I. from <URL:http://www.w3.org/hypertext/WWW/Protocols/demographics.html>, and changed things according to issues (mainly connected to privacy and caching) discussed in the Session-id threads on www-talk. The text below can be seen as a personal summary of the parts of these threads that pertain to Request-IDs and privacy.

------snip----

The Request-ID: header field.

Adapted from the proposal in <URL:http://www.w3.org/hypertext/WWW/Protocols/demographics.html>.

An HTTP request may include a header field of the form:

    Request-ID: $session $request++

e.g.

    Request-ID: 342%33a4d443 12

The HTTP client chooses a random string as a "session identifier", and each request in that session is identified by a number that increases monotonically with time.

It is suggested that clients use a different random $session string for each server they talk to. This will make it more difficult for cooperating web service providers to match clicktrails in their logfiles, thereby getting user profiling information that is much more accurate than the user would want to give them without some form of compensation.

Note that it is illegal to match logfiles under the privacy laws in some countries. The suggestion to use different $session strings can be seen as supporting these laws by making the crime of matching logfiles pay off less.

A "session" is not formally defined (other than "a set of requests with the same $session id"), though I suggest that browsers begin a new session when they are invoked and after they have been idle for 30 minutes or more, and provide some user interface to say "start a new session" (i.e. "choose a new random session ID").

Each user agent must provide a mechanism to turn the generation of Request-IDs off, especially for site security administrators who prohibit its use.
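
As an illustration only, the client-side suggestions above could be sketched roughly as follows. The class name, the hex format of the $session string, and starting $request at 1 are assumptions made for this sketch; the text above does not fix any of them.

```python
import secrets
import time

IDLE_TIMEOUT_SECONDS = 30 * 60  # the suggested 30-minute idle limit


class RequestIdGenerator:
    """Hypothetical helper: one random $session and one counter per server."""

    def __init__(self, enabled=True, clock=time.monotonic):
        self.enabled = enabled       # the user must be able to turn this off
        self._clock = clock
        self._sessions = {}          # host -> [session string, next $request]
        self._last_activity = None   # time of the most recent request

    def new_session(self):
        """User action 'start a new session': forget all session strings."""
        self._sessions.clear()

    def header_for(self, host):
        """Return a Request-ID header line for host, or None if disabled."""
        if not self.enabled:
            return None
        now = self._clock()
        # After 30 minutes or more of idleness, begin new sessions.
        if (self._last_activity is not None
                and now - self._last_activity >= IDLE_TIMEOUT_SECONDS):
            self.new_session()
        self._last_activity = now
        entry = self._sessions.get(host)
        if entry is None:
            # Assumption: 16 random hex digits, counter starting at 1.
            entry = [secrets.token_hex(8), 1]
            self._sessions[host] = entry
        session, request = entry
        entry[1] = request + 1       # $request++
        return f"Request-ID: {session} {request}"
```

A browser would call header_for() with the server's host name once per outgoing request and simply omit the header when it returns None. This sketch keeps one counter per server; a single global counter would be equally acceptable.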

If no Request-ID headers are present, this should be interpreted by web service providers as a statement that the user does not wish to reveal his or her exact clicktrail for privacy reasons. An attempt by service providers to silently obtain the clicktrail by some other means (for example by using a session-id, cookie, or anonymous authentication mechanism that could be part of future versions of HTTP) should be considered to violate the privacy wishes of the user.

Whether HTTP clients use a global $request counter, or one counter for each server talked to, is up to the clients.

HTTP clients that are not traditional user agents (e.g. multi-threaded robots) may use several sessions in parallel.

A proxy must pass the Request-ID: header through unmodified. One might consider some sort of Proxy-Request-ID, though I doubt it would be valuable.

An HTTP cache can assume that the response to an HTTP request does _not_ vary as a function of the Request-ID. That is, an HTTP proxy need not include the Request-ID in its "cache key." If the response to a request can vary, an Expires header should be used in the response to reflect this dynamism.

It is preferred that the Request-ID header is _not_ used to implement stateful dialogs, in which the content of pages is different for different sessions. For stateful dialog support, other mechanisms (for example a session-id, cookie, or anonymous authentication mechanism that could be part of future versions of HTTP) should be used.

Alternative proposal:

Instead of introducing a new Request-ID: header, include the $session $request++ information in the From: header. Examples:

    From: (#342%33a4d443 12)
    From: "Roy T. Fielding" <fielding@beach.w3.org> (#342%33a4d443 12)

---snip---

Koen.
Received on Friday, 28 July 1995 17:46:02 UTC