- From: Bob Wyman <bobwyman@medio.com>
- Date: Mon, 07 Aug 95 23:58:10 -0800
- To: "dmk@allegra.att.com" <dmk@allegra.att.com>, "http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com" <http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com>, "www-talk@www10.w3.org" <www-talk@www10.w3.org>
-- [ From: Bob Wyman * EMC.Ver #2.5.02 ] --

re: Dave Kristol's Session-ID proposal

WHEN DOES A SESSION END?

Your draft says that one of the "key points in the session paradigm" is that "The session has a beginning and an end." However, in your draft, it isn't apparent in all cases that both parties (or all parties, if proxies are involved) can determine when a session ends.

You insist that "when the client terminates execution, it discards all Session-ID information." This, I assume, "ends" the session from the point of view of the client. However, the server will never discover that this has happened.

Although you don't specify it clearly, the requirement that a client "must return [the Session-ID] to the server for the next transaction by any method ..." might be read to imply that the client can end a session by making another request without sending the Session-ID of the last response (if non-conformance is an implicit "end"). This would require the server to maintain a list of Session-IDs and the client hosts to which they were assigned, and then to do a lookup in this list on *every* request to determine whether the most recent request should have used a Session-ID. But that won't work for clients that are hidden behind proxies or that come from multi-tasking machines, since many clients may appear to be using the same host name.

On the server side, you say that the server may send "...a different, or no Session-ID response header" in response to a request which includes a Session-ID. Although your draft doesn't explicitly say this, I assume you intend that when the client receives a Session-ID response which is different from that transmitted in the corresponding request, the client should assume that the old session has ended and a new session may be starting. However, this method is useful only when the session's end coincides with an opportunity to send a response.
The fear, of course, is that since there are a number of scenarios in which the server can't discover that a session has ended, the server will be forced to build an ever-growing list of active sessions. This is not good. One solution to this problem would be to provide for a session expiration date (either absolute or relative to the last response), which would give the server a mechanism for purging its Session-ID tables. (Note: Netscape's "cookie" proposal does something similar, although expiring a "cookie" is somewhat different from expiring a Session-ID, since one would assume that a Session-ID has some level of uniqueness, which your draft does not address, while there is no such requirement for a cookie.)

WHAT IS A CLIENT?

As far as I know, there is no formal reference model for the Web; thus, it is necessary from time to time to ask what people mean when talking about specific architectural elements. Seeing your requirement that the client "discards all Session-ID information" when it terminates, and that the "client" must send the Session-ID on the next request, I'm worried about imprecision in defining the client. For instance, if I have a Web browser that allows me to have two open windows (e.g. Netscape), and I get a Session-ID as the result of activity in window "A", am I required to send that Session-ID with requests generated in window "B"? If I close window "A", do I keep the Session-ID or delete it?

WHAT IS A SERVER?

Many of the demands for Session-IDs or session state have been intended to allow CGI scripts to distinguish between clients and to maintain client-specific state on the server side. This leads to the question (another reference model problem): What is the server? Is the session with the HTTP server or is it with the CGI script? Your draft indicates that you think the server is identified by "server name (IP address) and port combination."
It seems to me that this means we're going to have to implement some potentially complex method for letting CGI scripts know the Session-IDs for the clients they speak to. Choices seem to be:

1) A configuration parameter that tells the server to always generate Session-IDs when a particular CGI script is run. The Session-ID would then be given to the CGI in an environment variable or by some similar means.

2) An API by which the CGI can ask the server to provide a Session-ID and tell the CGI script what it is.

Additionally, given that you suggest that Session-IDs can be changed or eliminated by the server, we'll need a mechanism for CGI scripts and servers to negotiate and/or inform each other of these changes.

Whatever the method of telling the CGI what the Session-ID is, unless the specification states that Session-IDs have some sort of uniqueness to them, they won't be useful for many of the purposes that CGI scripts would want to use them for. My personal preference would be for the CGI to be able to generate its own Session-ID that meets whatever it considers its operational requirements to be. Thus, I would argue that the CGI script is the "server", not the HTTP daemon.

ARE WE DEFINING PROTOCOL OR USER INTERFACE?

Your requirement that the client discard information when it "terminates execution" might be a good recommendation; however, it is inappropriate as a *requirement* of the HTTP protocol. The protocol should place no constraints on program execution models, only on what data flows over the wire.

PROBLEMS WITH CACHING

You state that when a caching proxy gets a Session-ID response header, "it must not cache that header as part of its cache state." However, you don't prohibit the caching proxy from caching the body of the response. Thus, it would appear that a second client could make a request and have that request satisfied from cache without ever discovering that Session-IDs were available for the document.
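
To make the scenario concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: ToyOrigin, ToyCachingProxy, the "/catalog" URL, and the "sid-N" identifier format are hypothetical and come from neither your draft nor any real proxy. It models only the rule quoted above (drop the Session-ID header from cache state, keep everything else).

```python
class ToyOrigin:
    """Hypothetical origin server that attaches a fresh Session-ID to every response."""

    def __init__(self):
        self.counter = 0  # number of requests that actually reached the origin

    def get(self, url):
        self.counter += 1
        headers = {"Session-ID": f"sid-{self.counter}", "Content-Type": "text/html"}
        return headers, f"<body of {url}>"


class ToyCachingProxy:
    """Hypothetical proxy: never caches the Session-ID header, but does cache the body."""

    def __init__(self, origin):
        self.origin = origin
        self.cache = {}  # url -> (headers without Session-ID, body)

    def get(self, url):
        if url in self.cache:
            cached_headers, body = self.cache[url]
            # Served from cache: the origin is not contacted at all.
            return dict(cached_headers), body
        headers, body = self.origin.get(url)
        # Apply the quoted rule: strip Session-ID before storing cache state.
        stripped = {k: v for k, v in headers.items() if k != "Session-ID"}
        self.cache[url] = (stripped, body)
        # The client whose request caused the fetch still sees the full response.
        return headers, body


origin = ToyOrigin()
proxy = ToyCachingProxy(origin)

first_headers, _ = proxy.get("/catalog")   # first client: forwarded to the origin
second_headers, _ = proxy.get("/catalog")  # second client: answered from cache

print("first client got a Session-ID: ", "Session-ID" in first_headers)
print("second client got a Session-ID:", "Session-ID" in second_headers)
print("requests seen by the origin:   ", origin.counter)
```

In this sketch the second request is answered entirely from cache, so the second client never learns that a Session-ID was on offer, and the origin never learns that the second client exists.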
This would be a particular problem when a response came back with both a Session-ID and an Expires: header, since the cache might decide not to do a HEAD, GET, or conditional GET until after or close to the Expires time. It would seem that the cache should remember that the Session-ID response header was there (whether or not it caches the actual Session-ID) and then always do a conditional GET for the document even if the Expires: time hasn't passed. (NOTE: Of course, you could insist that the semantics of Expires be changed when it appears together with a Session-ID in a response. The question is whether the "Expires" header is globally meaningful or meaningful only within the context of the session.)

WHAT PROBLEMS ARE BEING SOLVED HERE?

The draft doesn't really give any information about the specific uses that are expected for the protocol features it defines. This makes it hard to evaluate. For instance, I can see that what you have defined would be useful in tracking "clickstreams." However, the problems mentioned above, and others not mentioned, make it hard to see how this proposal will help with "shopping carts" and a variety of other applications identified in the recent www-talk discussions on this subject.

bob wyman
Received on Tuesday, 8 August 1995 03:06:52 UTC