- From: Jeffrey Mogul <mogul@pa.dec.com>
- Date: Thu, 07 Dec 95 12:31:37 PST
- To: "Roy T. Fielding" <fielding@liege.ICS.UCI.EDU>
- Cc: http working group <http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com>
Since I was unable to attend the IETF meeting (to be honest, I was also unwilling to go to Dallas), I am only vaguely aware of the problem that the two-phase approach is trying to solve. Apparently, the problem is that a server may want to refuse to accept a large entity body from a client, to avoid being "deluged by bytes." Is this the correct (and only) problem that two-phase send is trying to solve? If not, would someone please define the problem? Anyway, I'll assume that I have the right definition.

In other contexts (concurrency control, parallel simulation event processing, etc.), people make a distinction between "optimistic" and "pessimistic" approaches. Optimistic approaches assume that everything will go well; if things don't go well, you pay something to fix the damage. Pessimistic approaches assume that things will go badly, and always pay something to protect against this ever happening. Usually, pessimistic approaches pay less overhead for "good events" than optimistic approaches pay for fixing up after bad events, but if bad events are rare, then the optimistic approach is still a net win.

Roy has taken a pessimistic approach to the byte-deluge problem. In particular, every use of every method with a two-phase requirement must pay either the arbitrary timeout period OR at least one round-trip time. In other words, by taking this approach, we build "extra" delay into every POST (etc.) invocation for millions of users for many years. Hmm.

The optimistic approach would be more along the lines of "if the server decides that the entity body is too large, it can do something fairly expensive to solve the problem." Ideally, the "solution" would include making sure that the client does not continually retry its large POST (or whatever). The optimistic approach makes sense if we expect that, most of the time, servers will accept the POST (or whatever), and so removing the extra RTT is a net win, even if it makes someone's life harder in those few cases where a POST is large *and* the server doesn't want to see it. Since we presumably want to avoid congestive collapse of the server, we ought to look for a solution that forces the costs of this case onto the client.

David Morris suggests (apparently following a suggestion from "Alex"):

> If there needs to be a timeout to protect poor implementations,
> might it not be better to restrict the server from closing the connection
> for some interval after sending the abort. The server can leave the
> data unreceived or otherwise close the receive window but leave the
> connection open.

And Roy responds:

    If someone would test that theory on real systems and see if it
    works, that would be a big help.

One part of the theory, at least, is true by the definition of TCP (and of every other reliable transport protocol that I know about): the receiver can avoid being deluged by data simply by not accepting anything, since the flow-control mechanisms are specifically designed for this purpose. Some possible ways to accomplish this are:

    (1) use a small or zero receive window,
    (2) don't do any read()s from the input stream, or
    (3) close or reset the connection.

So we have a way to avoid the deluge for the current connection. We still need two other things:

    (A) some way to ensure that the client doesn't simply try again
        with a large POST, and
    (B) some way to indicate to the user that the operation isn't
        allowed.
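To make options (1) and (2) concrete, here is a minimal sketch (an illustration, not code from this exchange) of a server that refuses an oversized entity body without ever reading it, then leaves the connection open for an interval as David Morris suggests. The port, the size limit, the status line, and the one-read, case-sensitive header parse are all simplifying assumptions:

    /*
     * Sketch: refuse a large entity body without reading it.  The server
     * reads only the request headers, answers at once, and then does no
     * further read()s -- the kernel's receive buffer fills and the
     * advertised TCP window drops to zero (options (1) and (2) above).
     * It then leaves the connection open for an interval before a clean
     * close, per the Morris suggestion.
     */
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>
    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <sys/socket.h>

    #define MAX_BODY 65536L   /* assumed size limit, for illustration */

    int main(void)
    {
        int srv = socket(AF_INET, SOCK_STREAM, 0);
        struct sockaddr_in addr;
        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_port = htons(8080);             /* arbitrary example port */
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        bind(srv, (struct sockaddr *)&addr, sizeof(addr));
        listen(srv, 5);

        for (;;) {
            int fd = accept(srv, NULL, NULL);
            char buf[4096];
            ssize_t n = read(fd, buf, sizeof(buf) - 1); /* headers; assume one read */
            long len = 0;
            if (n > 0) {
                buf[n] = '\0';
                char *cl = strstr(buf, "Content-Length:"); /* case-sensitive, naive */
                if (cl != NULL)
                    len = atol(cl + 15);
            }
            if (len > MAX_BODY) {
                const char *resp = "HTTP/1.1 413 Request too large\r\n"
                                   "Content-Length: 0\r\n\r\n";
                write(fd, resp, strlen(resp));
                /* Crucially, no read() of the body ever happens here. */
                sleep(10);  /* leave the connection open for an interval */
            }
            close(fd);      /* close (FIN), not reset */
        }
    }

The sleep() before the close embodies precisely the part of the theory that needs testing against real clients.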
Unfortunately, I don't expect that the other part of the theory is correct: that it would work reliably to simply keep the connection open for some arbitrary timeout after sending an abort. This is because I suspect that some clients, which simply write() the entire POST body before bothering to read() any response, will block. No matter how long the server holds the connection open, the client will never read the response. (We could try to have the spec insist that a conforming 1.1 client must always be looking for asynchronous responses, but I doubt this would fly.)

So here's an alternate suggestion, mostly as a way to start people thinking about additional suggestions, although I think it would work. If the server doesn't want to receive the large body, it immediately replies with its 4xx or 5xx response, and immediately closes (not resets) the connection.

If the client manages to read the 4xx or 5xx response, it must honor it and should reflect it to the user. If the client simply sees the transport connection disappear, then the language in section 1.4, "Overall Operation":

    Both clients and servers must be capable of handling cases where
    either party closes the connection prematurely, due to user action,
    automated time-out, or program failure. In any case, the closing of
    the connection by either or both parties always terminates the
    current request, regardless of its status.

applies, with one variation: if the aborted request is a POST, PUT, PATCH, etc. (i.e., any request with a non-empty entity body), then if the client repeats the aborted request, it MUST include the request-header:

    Connection: two-phase

and then follow Roy's two-phase protocol: wait for a response, or for a specified timeout period, before proceeding. A client MAY use "Connection: two-phase" at other times, on its own initiative, but MUST use it when repeating a request because of a prematurely-closed connection.

We might also want to do this: servers rejecting an entity-body for reasons of size must respond with

    413 Request too large

Once a client has received a 413 response from a server, it SHOULD always use "Connection: two-phase" and the two-phase protocol with that server in the future. This is actually not a big cost, because we know that a server which has sent us a 413 response is an HTTP 1.1 server, and so the client should (almost?) never need to wait much longer than one RTT for the two-phase mechanism to complete.

Several people have pointed out that 5 seconds seems like a poor choice for a timeout. It appears to be a compromise between "give the server enough time to respond" and "don't delay too long waiting for HTTP 1.0 servers, which won't send a Continue response." But in the optimistic approach, we only invoke the N-second wait in the (hopefully) rare case of a prematurely closed connection, so N could be somewhat larger than 5 (say, 10 or 20) without impacting the normal case.
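As a concrete reading of the retry rule above, here is a minimal sketch (again an illustration; connect_to_server() is a hypothetical helper, and the go-ahead handling is an assumption) of a client repeating an aborted request under "Connection: two-phase": it sends the headers, waits up to N seconds for an early error response, and only then commits the body:

    /*
     * Sketch: repeat an aborted request under "Connection: two-phase".
     * The headers go out first; the client then waits for a response, or
     * for N seconds, before sending the entity body.
     * connect_to_server() is a hypothetical helper returning a connected
     * socket; error handling and full response parsing are omitted.
     */
    #include <string.h>
    #include <unistd.h>
    #include <sys/select.h>

    #define N_SECONDS 10    /* could be 10 or 20, as argued above */

    extern int connect_to_server(void);   /* hypothetical helper */

    int repeat_with_two_phase(const char *headers, /* must include the line
                                                      "Connection: two-phase" */
                              const char *body, size_t body_len)
    {
        int fd = connect_to_server();
        write(fd, headers, strlen(headers));

        /* Phase one: wait for an early response, or for the timeout. */
        fd_set rd;
        struct timeval tv = { N_SECONDS, 0 };
        FD_ZERO(&rd);
        FD_SET(fd, &rd);
        if (select(fd + 1, &rd, NULL, NULL, &tv) > 0) {
            char buf[512];
            ssize_t n = read(fd, buf, sizeof(buf) - 1);
            if (n > 9 && (buf[9] == '4' || buf[9] == '5')) {
                /* "HTTP/1.x 4xx" or "5xx": the server refuses the body;
                   honor it and reflect it to the user. */
                close(fd);
                return -1;
            }
            /* Anything else (e.g. a Continue-type response) is a
               go-ahead; fall through. */
        }

        /* Phase two: go-ahead received, or the timeout expired with no
           response (e.g. an HTTP 1.0 server): send the body. */
        write(fd, body, body_len);
        return fd;   /* caller reads the final response from fd */
    }

select() provides the "response or timeout" wait in a single call; against a silent HTTP 1.0 server the client simply proceeds after N seconds, so the extra wait is paid only on the retry path.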
-Jeff

Received on Thursday, 7 December 1995 12:42:59 UTC