Re: comments on draft-ietf-http-state-man-mec-04.txt from Dave Kristol on 1997-11-14 (ietf-http-wg@w3.org from October to December 1997)

From: Dave Kristol <dmk@research.bell-labs.com>
Date: Fri, 14 Nov 97 17:20:35 EST
To: http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com
Cc: fielding@kiwi.ics.uci.edu
Message-Id: <9711142220.AA02181@aleatory.tempo.bell-labs.com>
"Roy T. Fielding" <fielding@kiwi.ics.uci.edu> wrote on Tue, 11 Nov 1997 16:19:03 -0800:

  > Larry asked me to do a detailed review of the draft, which is probably
  > a dangerous thing.  I do not consider myself to be a Cookie expert;
  > in fact, I avoid them like the plague.  My comments are on general
(Good judgement!)
  > draftyness issues.

Thanks for the extensive comments.  (Turn about is fair play? :-)

  > 
  > >                 HTTP State Management Mechanism (Rev1)
  > 
  > I'd prefer not to have " (Rev1)" in the title.

I thought it was desirable to distinguish this from RFC 2109.

  > 
  > >2.  TERMINOLOGY
  > >
  > >The terms user agent, client, server, proxy, and origin server have the
  > >same meaning as in the HTTP/1.1 specification [RFC 2068].
  > >
  > >Host name (HN) means either the host domain name (HDN) or the numeric
  > >Internet Protocol (IP) address of a host.  The fully qualified domain
  > >name is preferred; use of numeric IP addresses is strongly discouraged.
  > >...
  > 
  > A comment should be included regarding support (or non-support) of IPv6
  > addresses [we got pinged on this for the URI draft].

Can you suggest wording?  I wanted to avoid specifying syntax for either,
and I chose the neutral term "numeric IP addresses" to (try to) imply both.
(I also seem to remember that the proposed text representation of IPv6
addresses conflicts with URI syntax....)

  > 
  > In general, I find the terminology section more confusing than useful,
  > even though I do know what is intended.  I find it unlikely that anyone
  > not involved in the mailing list discussions would have a clue as to
  > what is being described.  It would be better to postpone the description
  > of hostname dot comparison until it can be described as part of the
  > matching algorithm.

YMMV.  I prefer to put the terms in one place so someone can find them
easily if they need to.  It's a bit like the HTTP specification -- you
can't really understand this section until you understand the whole thing.

  > 
  > >4.  OUTLINE
  > 
  > Why is the entire protocol specification under the section "OUTLINE"?

Yeah, I guess that's pretty silly.  I'll think of a better heading.

  > 
  > >We outline here a way for an origin server to send state information to
  > >the user agent, and for the user agent to return the state information
  > >to the origin server.  The goal is to have a minimal impact on HTTP and
  > >user agents.  Only origin servers that need to maintain sessions would
  > >suffer any significant impact, and that impact can largely be confined
  > >to Common Gateway Interface (CGI) programs, unless the server provides
  > >more sophisticated state management support.  (See Implementation
  > >Considerations, below.)
  > 
  > CGI has nothing to do with it.  "particular resource namespaces" would
  > be more accurate.  ", unless the server provides more sophisticated
  > state management support" says nothing.

When I first wrote this section, about 20 months ago, I was trying to
anticipate concerns that implementers might have about how much of a
burden cookies would be to running a server.  I was trying to make a
point that the support was relatively localized.  In "the old days",
cookies got handled by CGIs, and only those CGIs that needed cookies
needed to support them.  Newer servers may provide built-in support
("more sophisticated...") for cookies, but that support isn't essential
to their support.

Your description is correct from the HTTP point of view, of course.

  > 
  > [...]
  > >A user agent returns a Cookie request header (see below) to the origin
  > >server if it chooses to continue a session.  The origin server may
  > >ignore it or use it to determine the current state of the session.  It
  > >may send back to the client a Set-Cookie2 response header with the same
  > >or different information, or it may send no Set-Cookie2 header at all.
  > >The origin server effectively ends a session by sending the client a
  > >Set-Cookie2 header with Max-Age=0.
  > 
  > A user agent never returns a Cookie request header field --
  > it *sends* a Cookie request header field containing a field value
  > matching that of a previously received Set-Cookie response header field.
  > Likewise, a server never "sends back" a Set-Cookie2 response header field;
  > it either sends a Set-Cookie2 or it doesn't.

You're thinking of it from the HTTP perspective, and I'm describing it
from the application perspective.  That's why I was (or tried to be)
careful about using the terms "request header" and "response header" to
identify the HTTP role, as distinct from the application role.

  > 
  > Much of this discussion is unnecessarily confusing.  I think it would be
  > better to define the cookie exchange protocol as a state machine,

Perhaps.  (The thought of doing ASCII art for both .txt and .ps
intimidates me a bit.)

  > rather than enumerate in prose all of the optional behavior of each party
  > at any given time in the exchange.  For example, I find the TCP specification
  > to be less confusing even though it describes a much more complex state
  > machine than is implicit in the cookies protocol.

Perhaps you noticed I tried to present the behavior in a
component-centric way.  That means the server-specific behaviors are in
4.2.1, and the user-agent-specific behaviors are in 4.3.  I don't give
a consolidated overview of the cookie protocol, which I think is what
you would prefer.

  > 
  > >4.2.2  Set-Cookie2 Syntax  The syntax for the Set-Cookie2 response
  > >header is
  > >
  > >set-cookie      =       "Set-Cookie2:" cookies
  > >cookies         =       1#cookie
  > >cookie          =       NAME "=" VALUE *(";" set-cookie-av)
  > >NAME            =       attr
  > >VALUE           =       value
  > >set-cookie-av   =       "Comment" "=" value
  > >                |       "CommentURL" "=" <"> http_URL <">
  > >                |       "Discard"
  > >                |       "Domain" "=" value
  > >                |       "Max-Age" "=" value
  > >                |       "Path" "=" value
  > >                |       "Port" [ "=" <"> 1#portnum <"> ]
  > >                |       "Secure"
  > >                |       "Version" "=" 1*DIGIT
  > >portnum =       1*DIGIT
  > 
  > Unless you wish to define the precedence of "|" in your BNF, each
  > of the set-cookie-av need to be grouped in parentheses.

Obviously the typography dictates the precedence. :-)  Seriously, do
you think there's anything confusing in the current format?  I fear that
adding ()'s will make it harder to understand, not easier.  Also,
traditionally, alternation is lowest precedence after '=', no?
  > 
  > >The origin server should send the following additional HTTP/1.1 response
  > >headers, depending on circumstances:
  > >
  > >   * To suppress caching of the Set-Cookie2 header: Cache-control: no-
  > >     cache="set-cookie2".
  > 
  > This type of example should not be allowed to break across a line.
  > There are many other areas where the text translation needs to be
  > fixed for a final draft.

I agree.  It's strange -- I've got hyphenation disabled in the (nroff/troff)
source.  I guess that only controls intra-word hyphenation.  (I wonder if I
can even persuade nroff/troff not to break around '-'.)

  > 
  > >HTTP/1.1 servers must send Expires: old-date (where old-date is a date
  > >long in the past) on responses containing Set-Cookie2 response headers
  > >unless they know for certain (by out of band means) that there are no
  > >upstream HTTP/1.0 proxies.  HTTP/1.1 servers may send other Cache-
  >  ^^^^^^^^
  > Actually, that is downstream.  It is better to call it "proxies in
  > the response chain".

:-)  I think I had "downstream" originally, then changed it to "upstream"
in response to a comment from Yaron Goland.  I like your wording better:
I don't have to know up from down. :-)

  > 
  > >6.  IMPLEMENTATION CONSIDERATIONS
  > >
  > >Here we speculate on likely or desirable details for an origin server
  > >that implements state management.
  > 
  > Kinda weak statement.  My inclination is to delete things that don't
  > say anything useful.

Okay.

  > 
  > >6.1  Set-Cookie2 Content
  > >
  > >An origin server's content should probably be divided into disjoint
  > >application areas, some of which require the use of state information.
  > >The application areas can be distinguished by their request URLs.  The
  > >Set-Cookie2 header can incorporate information about the application
  > >areas by setting the Path attribute for each one.
  > >
  > >The session information can obviously be clear or encoded text that
  > >describes state.  However, if it grows too large, it can become
  > >unwieldy.  Therefore, an implementor might choose for the session
  > >information to be a key to a server-side resource.  Of course, using a
  > >database creates some problems that this state management specification
  > >was meant to avoid, namely:
  > >
  > >  1.  keeping real state on the server side;
  > >
  > >  2.  how and when to garbage-collect the database entry, in case the
  > >      user agent terminates the session by, for example, exiting.
  > 
  > Ugh! Can we just delete section 6.1?  If not, then you need to be succinct
  > in describing the scope of design decisions and their relative trade-offs.
  > I.e., "Cookie Length", "Namespace Allocation", and "Direct vs Indirect State
  > Identification" are all design considerations with separable trade-offs,
  > and thus should be discussed individually (or not at all).  Likewise,
  > a discussion of when cookies should be avoided in favor of one of the
  > other alternatives would belong in the introduction.

I was just trying to point out some of the design considerations.
Because the various tradeoffs are obviously implementation-dependent, I
didn't want to (nor did I think I should) go into them in detail.  The
question is whether the draft would be better off without the section,
or with the admittedly meager comments that are there.  I thought it was
at least worthwhile to point out the issues.

Dave Kristol
Received on Friday, 14 November 1997 14:23:24 UTC