
Re: [Fwd: I-D ACTION:draft-decroy-http-progress-00.txt]

From: Adrien de Croy <adrien@qbik.com>
Date: Tue, 13 Feb 2007 12:49:51 +1300
Message-ID: <45D0FD1F.2090701@qbik.com>
To: ietf-http-wg@w3.org


hi

Thanks for your comments; I have a few more below.  It's interesting to go
through the issues.

Henrik Nordstrom wrote:
> On Tue, 2007-02-13 at 08:51 +1300, Adrien de Croy wrote:
>
>> The 100 continue is the intended solution to the problem, and whilst I 
>> can see
>> how it could be effective in direct client-server communications, there are
>> issues with it when there are intermediaries or delays.
>
> Not really. 100 Continue is end-to-end, not hop-by-hop. Just as your
> proposed "Defer" status.
That wasn't my intention.  In practice 100 Continue can be hop-by-hop,
since a proxy can send it back to a client (or is this prohibited?).

>
> Your draft adds the new 1xx response not for flow control but for
> terminating flows without terminating the connection. This is soft
> abortions of request entities, not flow control.
Actually it's not intended to abort the request, but to make the client
wait for a 100 before sending the body, so it's a "please wait for 100
before continuing" response rather than an abort.

This is purely to solve the issue of the client not knowing otherwise
how long to wait for a 100 continue.

Once a client receives the "please wait for 100 before continuing", it
knows not to send the request body until it receives a 100 Continue or a
final result (e.g. denied, auth required, etc.).
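As a sketch, the client-side rule being described could look like the
following (the defer code itself is hypothetical, and all names here are
illustrative, not from any RFC):

```python
# Hypothetical client-side rule for the draft's "1xx Defer" idea.
# Status 100 and final (>= 200) codes are real HTTP; the defer code
# itself and these names are illustrative only.

def next_action(status, deferred=False):
    """Map one status line to the client's next step.

    Returns (action, deferred):
      "send-body" - start transmitting the request body now
      "stop"      - final result (denied, auth required, etc.)
      "wait"      - hold the body; a 100 or final status is guaranteed
    """
    if status == 100:
        return "send-body", deferred
    if status >= 200:
        return "stop", deferred
    # any other 1xx is treated as the defer: hold the body and wait
    return "wait", True
```

The point of the `deferred` flag is that once it is set, the client no
longer needs a timeout heuristic: the sender has promised a follow-up.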

>
> The proposed approach in the draft has at least two major flaws making
> it unsuitable:
>
> a) There is no guarantee the client will actually wait for the 100
> Continue response before sending the request body, especially not when
> there are intermediaries involved which may significantly delay the
> request or response, so the server may get a request body even after
> sending the "abort" signal and this may then get misread as a different
> request (i.e. a PUT of an object containing an HTTP request).
Any 1xx signal should be interpreted as an interim result?  So it
wouldn't matter if a client ignored it as part of the initial request, or
ignored it as part of a subsequent request?

The whole problem is that there is no guarantee that a client will wait
for 100 Continue.  If we could guarantee that, I wouldn't have needed to
write this spec.

The purpose of the spec is to force compliant clients to wait for a 100
Continue.  A client receiving the defer response then KNOWS that it WILL
receive a 100 Continue or a final result, instead of having to guess at
how long to wait.

>
> b) And since you are significantly changing the message formatting rules
> based on end-to-end communication and not hop-by-hop you are guaranteed
> to cause problems when there are intermediaries involved. For example
> proxies MUST forward unknown 1xx responses, and this would change the
> message format under the feet of the proxy without it knowing, causing a
> real mess for the proxy.
Actually where I am coming from is proxy development, and the reason I
wrote the spec was to deal with
issues in proxy implementations relating to flow control.

A proxy that doesn't understand a 10x defer response and passes it
through won't notice anything different from any other 1xx response.  It
is the behaviour of the client that this is intended to modify.

So I'm not sure I understand your point here - if the client treats any
1xx response as possibly multiple (which it
already should), and the proxy forwards everything, then I don't see the
problem.

>
> Both the above problems are related, but from different aspects of HTTP
> and at different locations in the request forwarding path.
>
> To guard from this you could in theory add a new "Expect: 1xx Defer"
> condition with the added side effect that the client guarantees that it
> will not transmit the request body until a 100 Continue is seen and
> will instead close the connection and send the request again if 100
> Continue is not seen in a timely fashion.

The point is that whatever sends the 1xx defer must guarantee to either

a. send a 100 continue; or
b. send a completion / auth challenge / denial; or
c. close the connection.

so the client knows it can safely wait until one of those happens; it
never has to guess at a timeout.

As for Expect, I've never seen an "Expect" header in the wild.  I think
no-one uses it because the concept is broken.

Sending an Expect header is like putting your head on the block and
saying "please don't chop my head off".

Because if the server doesn't support it, your request is toast, and you
are going to have to:

a) resubmit the request
b) store information about that server in a database so you don't try to
send an Expect header to that server again, and so on.

There's no mechanism in HTTP by which a client can a priori evaluate the
capabilities of a server.

So any client developer is going to look for alternatives before
resorting to the Expect header, and will only add it once there is
widespread support for it across a majority of deployed servers (years
away).  Even then, use of the Expect header in a browser is likely to
result in only one thing: calls to your tech support saying:

"why does your browser have to make each request twice when connecting
to xxxxx?"
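The round-trip cost behind that complaint can be sketched as follows; the
server is modelled as a plain function, and everything here is
illustrative rather than how any particular client works:

```python
# Illustrative cost model for "Expect: 100-continue" against a server
# that may not support it.  A real client would also need a timeout for
# servers that never answer the expectation at all.

def post_with_expect(server):
    """Return how many requests are needed to deliver one POST body."""
    status = server("Expect: 100-continue")
    if status == 100:
        return 1        # server said go ahead; body follows immediately
    # 417 Expectation Failed (or a timeout): resend without the header
    server(None)
    return 2

supporting = lambda expect: 100 if expect else 200   # honours the Expect
legacy = lambda expect: 417 if expect else 200       # rejects the Expect
```

Against `legacy`, every upload costs two requests - exactly the "make
each request twice" behaviour described above.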

>
> In my eyes, so far the chunked encoding approach looks the most
> promising way to solve the authentication issue in a reasonable manner,
> even if it means losing the information on the request size. And
> losing the information about the request size is frankly the only real
> drawback of the approach (broken implementations set aside).
I think most software developers or support staff would disagree.
Implementation difficulty places a large burden on

a) developers
b) customers (dealing with the inevitable bugs).

On the first day of engineering school they teach the KISS principle.
>> a. issues when a proxy is connecting to an HTTP/1.0 server.  Unless it
>> knows a priori that the server is HTTP/1.1 compatible it can't send
>> chunked resource anyway.
>
> Correct. But not really a big problem. 100 Continue solves the ugly part
> of this and for authentication as it allows the client to reasonably
> probe the server without sending the request body before it knows it
> will get accepted by the server.
>
> This whole thing is only a significant problem for NTLM (and Negotiate)
> authentication as it cannot close the connection on authentication
> challenges and therefore MUST transmit the request body, which will be
> dumped to the bit bucket by the server.
Actually the mechanism that triggers the problem is the delay in
establishing the TCP connection between the proxy and the upstream
server.  Often we see the client sending the message body before the
proxy has even connected.  There's no chance for 100 Continue to do
anything useful there.

If a proxy could send a 1xx defer, however, something useful could be
done, and this would benefit all auth schemes.
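As a sketch of what that could look like inside a proxy (the defer code
and every class and method name here are made up for illustration):

```python
# Toy model of a proxy deferring the client while the upstream TCP
# connect is still pending.  Names and the defer code are illustrative.

class Client:
    def __init__(self, expects_body=True):
        self.expects_body = expects_body
        self.interim = []                    # 1xx responses sent so far

    def send_interim(self, code, reason):
        self.interim.append((code, reason))

class Upstream:
    def __init__(self):
        self.connected = False
        self._on_connect = []

    def on_connected(self, cb):
        self._on_connect.append(cb)

    def finish_connect(self):                # simulate connect completing
        self.connected = True
        for cb in self._on_connect:
            cb()

def on_request_headers(client, upstream):
    """Proxy: request headers parsed, upstream connect still in flight."""
    if not upstream.connected and client.expects_body:
        client.send_interim(102, "Defer")    # hold the body (hypothetical)
        # once connected, relay the origin's 100 Continue (or final status)
        upstream.on_connected(lambda: client.send_interim(100, "Continue"))
```

The client holds its body from the moment the headers are parsed, rather
than racing ahead of a connect that hasn't completed yet.
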
>
>
> On the initial request when the HTTP level of the next-hop is unknown
> the client must use Content-Length, and will from the response learn if
> the path is HTTP/1.1, if not neither approach of short-circuiting the
> request body can be used and the client MUST resend the request body as
> is done today.
Yes; the problem is that maintaining a cache of server behaviour is costly.

>
> To avoid transmitting the request body on the initial request the client
> has to close the connection if seeing an authentication challenge or
> other error.
Yes, which breaks session-based auth.

Some may not care about NTLM, since it is MS and non-standard and all,
but there are several dozen
million HTTP users who do care about it, and they have to bear the brunt
of design decisions made in
this protocol.

>
> In the next attempt in sending the request (possibly with updated
> credentials) the client should connect to the same next-hop and assume
> an HTTP/1.1 capable path if the last response indicated it's an HTTP/1.1
> path. In these conditions it knows for certain that the next-hop is
> HTTP/1.1, and that quite likely the whole request path is.
That depends on the proxy.  If the proxy always responds with the HTTP
version of the request, then the upstream path capability remains unknown
to the client.

Having to make a second attempt is also costly.

>
> As long as it knows the next-hop is HTTP/1.1 it's always safe to send
> chunked encoding. 
I'd argue that, since there are no tried-and-tested real-life browsers
out there doing this, it isn't safe at all.

> In the worst case the request may get aborted with a 4xx
> forcing the client to fall back to Content-Length and resending the
> request body in each round trip. And chunked encoding is a requirement
> for the client to clearly indicate to the next-hop that the body has
> been terminated, avoiding the problems indicated above.
>
> To avoid frequent 411 responses it's probably best for the client to
> first send the request with Content-Length and Expect: 100-continue
> unless it's known the last path used to that server is fully HTTP/1.1
> and that authentication is quite likely needed to finish the request.
> Yes, this costs one TCP connection for the initial probe, but for most
> uses the probe will not be needed as the server and next-hop is both
> already known to the client.
See my comment above about server capabilities being masked by the proxy.
In the end, in many cases the only thing the client is sure about is the
capability of the proxy, which doesn't help.
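For concreteness, the probing strategy quoted above amounts to a small
decision rule (names illustrative); behind a proxy, the first input is
exactly what the client can't reliably learn:

```python
# Decision rule sketched from the quoted probing strategy.  Both inputs
# are things the client must cache or guess; the names are illustrative.

def request_strategy(path_known_http11, auth_likely):
    """Choose how to frame the request body for one attempt."""
    if path_known_http11 and auth_likely:
        # whole path known HTTP/1.1: chunked lets the body be cut short
        return "chunked"
    # otherwise probe first: Content-Length plus Expect: 100-continue
    return "content-length + expect-100"
```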

>
>> Clients are in the same boat, and that's why I think there aren't any 
>> (that I have found) that send chunked data, since they would need to
>> maintain a database of internet webservers to keep track of what
>> server supported chunking or  not - a fairly low-return on investment.
>
> Clients don't use chunked encoding today as there hasn't been much
> benefit for them to do so.
>
> The NTLM authentication mess is a good reason why to use chunked, and to
> care to implement the HTTP/1.1 probing needed to do it safely.
>
>
> Just because the "Defer" status code seemingly looks like it may
> initially be less lines of code to implement in existing products does
> not make a broken approach a good approach. 
I think the broken approach was to try to create flow control with only
half the minimum signals required.  This is an attempt to fix that.  I've
been watching problems like this with HTTP for 12 years now.

Any layman would laugh at you if you proposed installing traffic lights
at intersections which could be either green or not there, and you never
knew whether they would ever turn green.

That's what HTTP's 100 Continue is, though.

> In the end you'll end up
> with about the same amount of code plus much stricter requirements on
> when it may be used as it needs all intermediaries to support the new
> feature for it to be used reliably and it's also plagued by exactly the
> same "next-hop status unknown" issues.
Next-hop unknown is one hop less to worry about, however, since a proxy
that supports this can already provide major benefit to the end user.

I know from implementing RFC extensions to a myriad of protocols that
I'm much more likely to implement something

a) at all
b) reliably

if it is simple.  Requiring a client to produce chunked request data when
the developer knows it's most likely going to encounter a broken proxy or
server will meet some stiff resistance.

>
>> b. loss of information on which to base policy decisions.  Unless you 
>> can set the
>> content-length field as well?
>
> You can't.
>
Thought so - thanks for the confirmation.

>> c. implementation complexity -> compatibility issues with non-compliant 
>> clients
>> servers and intermediaries.  An additional status code for a client to 
>> see is fairly
>> low-impact, compared to servers and proxies suddenly seeing chunked 
>> resource
>> from a client.
>
> True. As chunked encoding is rarely used in requests it hasn't been
> tested much and there quite likely are broken implementations out there.
> But I think it's safe to say that most if not all HTTP/1.1
> implementations will simply abort the request with a 411, at least if
> it's a PUT/POST request.
Sure, which leaves us no worse off than we are now, I guess.

>
>> Protocols have had 2 signals for flow control since year dot.  RS232
>> had RTS/CTS; Xmodem had X-on/X-off.
>
> I don't see how this compares to your proposed extension. The proposed
> "Defer" status is not a flow control, it's an abort condition.
>
No, it's a defer - an X-off (see initial comments).  The connection or
transaction isn't aborted at all; it's deferred.

> X-on/X-off is in HTTP "Expect: 100-continue" and "100 Continue", plus
> all the transport flow control on top of which HTTP runs.
X-on is in there, X-off is not.

Regards

Adrien

>
> Regards
> Henrik
Received on Monday, 12 February 2007 23:50:03 GMT
