Re: Caching data returned from POST, and conditional POST from Shel Kaphan on 1995-12-31 (ietf-http-wg@w3.org from October to December 1995)

From: Shel Kaphan <sjk@amazon.com>
Date: Sun, 31 Dec 1995 14:39:26 -0800
To: http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com, Brian Gaines <gaines@cpsc.ucalgary.ca>
Message-Id: <199512312239.OAA04193@bert.amazon.com>
Brian Gaines writes
 Shel Kaphan writes
 > Brian Gaines writes
 > > I noticed that Netscape (2.0b4) sends a conditional POST when cached
 > > data that was returned from a POST expires.
 > >
 >
 >What does that mean?  A POST with if-modified-since?  I don't believe
 >this has been defined, and it doesn't really make sense unless the
 >if-modified-since only applies to the returned data, not to the action
 >of the POST, since POST can have side effects at the server, so doing
 >it twice is not the same as doing it once.
 >

 I think it is clear what it means. The server should be able to identify
 whether the data it sent back from a POST has changed, in the same way as
 it does for data from a GET. It is up to the server to keep track of its
 state changes in doing this.

So far, the only conditional allowed in HTTP is GET with
if-modified-since.  Perhaps you're referring to the fact that in
response to BACK and FORWARD commands, Netscape allows the user to OK
a new send of a POST when it no longer has the data (or when it has
expired, which is against the rules in the spec, by the way).  Please
take a look at the description of the Expires header in the spec.
There's a short disclaimer about interactions with browser history
functions.
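
To make the GET-only rule concrete, here is a minimal sketch in Python.
The function name `respond` and its arguments are hypothetical, invented
for illustration; this is not taken from the spec or from any server.
It assumes the server knows a last-modified time for the resource, and
it honors if-modified-since on GET only, so a POST is always processed.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def respond(method, headers, last_modified):
    """Return an illustrative status code for a request (hypothetical helper).

    Only GET honors If-Modified-Since; any other method, POST included,
    is processed normally because it may have side effects at the server.
    """
    if method == "GET" and "If-Modified-Since" in headers:
        try:
            since = parsedate_to_datetime(headers["If-Modified-Since"])
        except (TypeError, ValueError):
            since = None  # unparseable date: ignore the condition
        if since is not None:
            if since.tzinfo is None:
                # Assumption: treat a date without a zone as GMT.
                since = since.replace(tzinfo=timezone.utc)
            if last_modified <= since:
                return 304  # Not Modified: the client's copy is still good
    return 200  # send the full response


last = datetime(1995, 12, 31, 12, 0, 0, tzinfo=timezone.utc)
stamp = format_datetime(last, usegmt=True)
print(respond("GET", {"If-Modified-Since": stamp}, last))
print(respond("POST", {"If-Modified-Since": stamp}, last))
```

In this sketch the header simply has no effect on a POST, which is one
way to read "not defined"; a real server might choose to reject it.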


 As a general comment on your responses, I think it inappropriate for
 the HTTP protocol itself to attempt to take account of server state.

I agree, except insofar as there are already necessary conventions
which are somewhat implicit and have to do with caching, as I already
mentioned (GET & HEAD are "servable from a cache"; other methods are
not).

 It should provide the means for the server to control the browser so
 that it can manage user interaction properly, but should not attempt
 to enforce arbitrary models of what is a general client-server system.

Did I seem to suggest otherwise?

 There are many different ways of handling state and contention in
 client-server systems and the appropriate mechanism varies widely
 according to the application. The HTTP specification should allow
 the system designer to design the most effective system. It should give
 control over caching, not say it should not occur when it clearly
 should and clearly does.

Yes, however you might want to take a look at the various state-management
proposals floating around.  For instance, the Netscape "cookie"
proposal, and Dave Kristol's "state-info" proposal.  There is likely
to be some convergence between these two approaches, and it is likely
that the protocol *will* address state handling, but of course will
not enforce that you use that mechanism if you want to do it some
other way.

 > > An end user sees no difference between them and expects
 > > the "Back" and "Forward" buttons to act in the normal manner in retrieving
 > > cached data. That is, a browser MUST cache the results of a POST if it is
 > > to behave correctly.
 > >
 >
 >No, you're confusing history functions of the browser with caching.
 >BACK and FORWARD can show you documents that are stored by your
 >browser, but are not necessarily the same set of documents as are in
 >your cache.  For instance, your history list may contain several
 >versions of the same document, e.g. a stock quote report you've asked
 >for every 15 minutes for the past several hours.  In addition,
 >although these documents are in your history list, some or all of them
 >may be "stale" (marked as "expired"), but BACK will show you them
 >anyway -- or should!  The cache comes into play when you make a fresh
 >request, by entering a URL, clicking on a hyperlink, or submitting a
 >form.  In this case the cache decides if it can simply present the
 >document it already has at hand, or if it must fetch a new copy.  POST
 >requests always have to go through to the origin server, because of
 >the possibility of server side effects.  Of the existing methods in
 >HTTP, only GET and HEAD can be served from a cache without the
 >necessity of contacting the origin server.  OTOH, the headers in the
 >resource returned by GET or POST are sufficient to control whether
 >that returned resource can be "cached" -- stored into a cache so that
 >later GETs can get that document from the cache.  (Again, this last
 >statement is apparently not fully agreed upon).

 What you describe would be very bad practice.

Which thing I describe?  Except for the very last point, this is what
the current spec calls for.  The problem, I feel, is that there are
multiple incompatible interpretations floating around of the WWW
"computing model" as enshrined in both browsers and protocols.  
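
The distinction between history and cache in the quoted description can
be sketched in Python as follows. All names here (`Document`, `Browser`,
`request`, `back`, and the `origin` callable) are hypothetical and
invented for illustration; this is a toy model of my reading of the
current rules, not a description of how any browser is implemented.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Document:
    url: str
    body: str
    expires: float  # time after which this copy is stale


@dataclass
class Browser:
    origin: Callable[[str, str], Document]  # hypothetical origin(method, url)
    cache: dict = field(default_factory=dict)
    history: list = field(default_factory=list)
    position: int = -1

    def request(self, method, url, now):
        """A fresh request: typing a URL, following a link, submitting a form."""
        cached = self.cache.get(url)
        if method in ("GET", "HEAD") and cached and now < cached.expires:
            doc = cached  # only GET and HEAD may be served from the cache
        else:
            doc = self.origin(method, url)  # POST always reaches the origin
            if now < doc.expires:
                # Storing a POST response for later GETs is the disputed point.
                self.cache[url] = doc
        del self.history[self.position + 1:]  # drop any forward entries
        self.history.append(doc)
        self.position = len(self.history) - 1
        return doc

    def back(self):
        """BACK shows the stored document, stale or not, with no new request."""
        if self.position > 0:
            self.position -= 1
        return self.history[self.position]
```

Note that `back` never consults `expires` and never calls `origin`, and
that the history may hold several versions of the same URL while the
cache holds at most one.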

 I want the browser to
 check with the server when the user uses "Back" and "Forward" so that
 the server can ensure that the data is up to date. Netscape does precisely
 this and enables the user interface to be kept in sync with the state
 of the server.

Again, please look at the description of Expires in the spec.  You
might also want to look at Koen Holtman's and my report from some
months ago in http://www.amazon.com/expires-report.{html,txt} which
discusses some of these issues.

 >Don't get me started on hidden fields.  They're about the worst way to
 >handle state.  First off, not all browsers hide them, and some of
 >those that don't make them editable, guaranteeing that some loser will
 >edit them.  Secondly, precisely because using BACK loses this state,
 >it tends to be confusing to users.  If hidden fields must be used at
 >all for state handling, they should only be used extremely locally -- from one
 >page to the next.
 >

 We have major interactive applications that use hidden fields to carry
 state data effectively. I detailed the advantages of this in the
 "Porting" tutorial at WWW4 <http://ksi.cpsc.ucalgary.ca/articles/WWW/PortWeb>.

I'm familiar with the approach -- to the extent I have tried to use
it, I have had to back out for the reasons mentioned.  On the other
hand, I'm not trying to stop you from using it if you want to, and
there's nothing in the spec that talks about it, nor should there be.

 The major browsers all deal correctly with hidden fields and minor ones
 that do not will presumably be debugged to do so.

There are enough users of minor ones that the commercial services I have
worked with have had to eliminate all uses of hidden fields due to data
corruption caused by inadvertent user editing, because the mistakes
are too expensive to clean up after the fact.

 Storing state in hidden fields makes it possible for "Back" to act as "Undo"
 which is what users expect.  When the "Undo" is impossible or ambiguous
 the server can query the user in the normal way for a transaction-processing
 system.

Sorry, but most users do *not* equate the BACK button with "undo".  In
fact, most naive users don't know there's a difference between a
link that says "go back" and the browser's BACK button.  To the extent
that users can be spared from having to know how these things work, I
think they should be.

 I guess the generic point is to avoid building specific TP architectures
 into HTTP. It is a general transport and control protocol capable of supporting
 many TP architectures.

I agree.

 b.

 Dr Brian R Gaines               Knowledge Science Institute
                                 University of Calgary
 gaines@cpsc.ucalgary.ca         Calgary, Alberta, Canada T2N 1N4
 403-220-5901  Fax:403-284-4707  http://ksi.cpsc.ucalgary.ca/KSI


cheers, happy new year, 

Shel Kaphan
sjk@amazon.com
Received on Sunday, 31 December 1995 14:46:15 UTC