Re: Upload negotiation from Henrik Nordstrom on 2008-04-08 (ietf-http-wg@w3.org from April to June 2008)

From: Henrik Nordstrom <hno@squid-cache.org>
Date: Tue, 08 Apr 2008 14:10:19 +0200
To: Adrien de Croy <adrien@qbik.com>
Cc: HTTP Working Group <ietf-http-wg@w3.org>
Message-Id: <1207656619.31831.105.camel@HenrikLaptop>
On Tue, 2008-04-08 at 22:41 +1200, Adrien de Croy wrote:
> right - but then that only works with HTTP Basic auth and Digest.  NTLM 
> users are out in the cold because their auth method requires a 
> persistent connection.  It's a lot easier to write "ignoring connection 
> oriented" than it is to ignore all those customers.

Using the chunked method works just fine for NTLM.

> It doesn't help either that many browsers don't support receiving any 
> data whilst trying to send resource, so won't stop sending even if you 
> do send back a 4xx.

That's a bug/misfeature. Not something we can help by adding more
features to the protocol.

> I guess contravening your spec reference below.  I
> understand the market-leading browser does this.

So someone should beat them with a big stick then...

> I guess that's what stands HTTP apart from pretty much all other 
> transfer protocols (SMTP, FTP, NNTP etc).  By optimising the protocol so 
> much to minimise round-trips, we sacrificed certain protocol 
> pleasantries that dealt with these sorts of situations.  When we then 
> come up against the issues, we have to specify heuristics to get around 
> it, and to deal with backward compatibility.  It makes me wonder what 
> HTTP will look like in 30 years.

The day we can forget HTTP/1.0 things will stabilize considerably.

But in 30 years I expect that HTTP has been pretty much replaced by
something new, more targeted for interactive transfer of large amounts
of data.

> I like the sending the package at the post office analogy, but try it 
> with 2000kg of sand, and you live on an island and see what the 
> logistics are like.  Esp when the small local post office wants to see 
> some id, which you have to go back home for, and then front up with 
> another 2000kg of sand when you come back again with your ID, then they 
> say they don't accept packages that big.

Only if you refuse to do what the specs say you should do.

The specs say you should contact the post office, say "I want to send
2000kg of sand", and then wait for confirmation. The twist is that if
neither a confirmation nor a rejection arrives within a reasonable time
frame, and you don't know whether the post office or your communication
channel to the post office supports confirmation, then you should not
wait forever; instead, assume the package is acceptable.
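
A minimal sketch of that sequence from the client side, assuming a
connected stream socket and a request head that already carries
"Expect: 100-continue". The function name, the 2-second default, and the
simplistic status parsing are illustrative assumptions, not taken from
any spec text or implementation.

```python
import select
import socket


def upload_with_expect(sock, request_head, body, wait_seconds=2.0):
    """Send the headers first; send the body only if not rejected.

    Returns the rejecting status code (an int) if the server answered
    with anything other than 100 before the body was sent. Returns None
    if the body was sent, either after a 100 Continue or because nothing
    arrived within wait_seconds.
    """
    sock.sendall(request_head)

    # "Phone the post office" and wait a bounded time for an answer.
    readable, _, _ = select.select([sock], [], [], wait_seconds)
    if readable:
        # Assumption for brevity: the whole status line arrives in one recv.
        status_line = sock.recv(4096).split(b"\r\n", 1)[0]
        status = int(status_line.split()[1])
        if status != 100:
            return status  # e.g. 401, 413, 417: do not send the body

    # Either 100 Continue was seen or nobody answered in time.
    sock.sendall(body)
    return None
```

A real client would also keep reading for the final response while it
sends the body, so that a late rejection is still noticed.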

The specs absolutely do not say that you SHOULD drive to the post office
with all that 2000kg of sand and blindly assume they will handle it for
you by default.

> Or worse still then they refer you to the central post office on the 
> mainland, which is only accessible by a small rope swing bridge which 
> you have to carry this 2000kg over in small bags.  Then they send you 
> home for ID as well.

Stop assuming that you have to carry those 2000kg; it's a false
assumption. You do not have to. You can choose to, but it's your own
choice.

> And even with Digest auth and 2 entities wanting auth, you are looking 
> to transfer 4 times or drop connections.  I guess that's the beauty of 
> digest - you don't need to maintain the connection.

Just follow the specs and you will be fine.

With NTLM it's harder to follow the specs, but that's not the specs'
fault. NTLM should have been implemented at the message level, not the
connection level. Digest is one source of inspiration for how such a
session-oriented authentication scheme may look without tying it to the
transport.

> But this really doesn't help so many people who are stuck with NTLM for 
> any number of reasons.

Until something better can replace NTLM/Negotiate.

> And 100 continue is just like turning up to the post office with this 
> sand on the truck with the removal guys you are paying $40/hr for, and 
> if someone doesn't come out of the post office within 2 seconds, you 
> start unloading.

Now you are just too pessimistic about 100 Continue. Most web servers
out there are HTTP/1.1 and do send 100 Continue (even old ones just
implementing RFC2068). And the major commercial proxy servers are
HTTP/1.1 as well. Squid is not, but that's another story.

> And once you start unloading, you don't stop until 
> it's unloaded (what many browsers do).  100 Continue + Expects is the 
> same except you toot the horn outside before waiting for 2 seconds then 
> starting unloading.

I don't view it that way. I would view it that you either first send the
request in a separate car or phone it in. Then at a suitable time send
the truck with all the sand.

> A normal person would ring the post office first :)

And that is what Expect: 100-continue does. You should view the heuristics in
the specs as what you do if the post office doesn't care to answer the
phone.

> So what comes out of all of this?
> 
> a) agents should use digest instead of NTLM

Yes, or a replacement along the same lines (session oriented auth at the
message level, not connection oriented)

> b) agents should notice rejections whilst they are sending (already 
> specified but not universally observed)

Yes.

> c) on rejection during an upload, agents should disconnect and retry 
> with credentials etc.  I can't really endorse using chunked uploads 
> until there is a method to signal abortive end to a transfer, and also 
> to signal size in advance if known for policy purposes.

Well, I don't consider the intermediary issues you push on that
critical. Intermediaries should follow 'b' so you know that any data
seen after an error response is to be considered garbage. It shouldn't
even be forwarded, so you should not take any actions based on that
data.

> which leaves us in the following predicament:
> 
> i) what to do about the zillions of people reliant on NTLM - seems like 
> the proposal is to basically ignore them?  Or worse still try and 
> educate them about using Digest :)

Find a better replacement for NTLM that actually works with the HTTP
specifications, then get vendors to accept it. Not technically very
hard, and takes about the same time to roll out as any other noticeable
change.

> ii) what to do about chunked uploads to fill in the gaps (reporting size 
> and signalling abortive end without closing). 

?

Signalling the abort differently than just "end of request" is purely
optional. Whoever you are talking to SHOULD have already realised the
request has failed.
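
A minimal sketch of ending a chunked upload early, assuming the sender
polls a hypothetical rejected() callback that becomes true once an error
response has been read. The only signal written is the ordinary
zero-length last chunk, i.e. plain "end of request"; nothing here is a
dedicated abort marker.

```python
def chunked_body(pieces, rejected):
    """Yield chunked transfer-coding frames for an upload.

    pieces   -- iterable of bytes objects making up the entity
    rejected -- hypothetical callback; returns True once an error
                response has been seen and sending should stop
    """
    for piece in pieces:
        if rejected():
            break  # stop feeding data; the request has already failed
        if piece:  # an empty chunk would be read as the terminator
            yield b"%x\r\n" % len(piece) + piece + b"\r\n"
    # Last chunk: ends the request body without closing the connection.
    yield b"0\r\n\r\n"
```

Because the connection stays open, the sender can retry on the same
connection, which is what a connection-oriented scheme such as NTLM needs.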

> The size thing is a big problem for many sites if we move to chunked 
> uploads.  Pretty much every site has a specification for max size POST 
> data.  Having to receive it all in order to decide it's too big is 
> incredibly wasteful. 

True. And to address this one can introduce an advisory header carrying
the expected size of the transmitted entity. It has to be different from
Content-Length for protocol reasons.
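
A minimal sketch of how a server or proxy might act on such an advisory
header before reading any body data. The header name
"Expected-Entity-Length" is purely hypothetical and is not a registered
header; the function name and the use of 413 as the rejection status are
likewise assumptions made for illustration.

```python
# Hypothetical header name, used only for this illustration.
ADVISORY_HEADER = "expected-entity-length"


def early_size_check(headers, max_bytes):
    """Return 413 if the announced size exceeds max_bytes, else None.

    Content-Length is used when present; otherwise the advisory header
    is consulted. The advisory value is only a hint, so the receiver
    must still count the bytes it actually receives.
    """
    normalized = {name.lower(): value.strip() for name, value in headers.items()}
    announced = normalized.get("content-length") or normalized.get(ADVISORY_HEADER)
    if announced is not None and announced.isdigit() and int(announced) > max_bytes:
        return 413
    return None
```

A request with "Transfer-Encoding: chunked" and an advisory size above the
site limit can then be refused as soon as its headers have been parsed.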

> Many people around the world still have to pay for data per MB, and/or 
> are on slow connections.

All the more important for them, then, that their clients and any
proxies they use actually follow the specs.

> iii) praying that as a proxy we never get large chunked requests to 
> process for upstream HTTP/1.0 agents.

Proxies should respond with 411 in such a case, making communication
revert to "HTTP/1.0 compatible" with all the problems that has.

> In many cases there is no viable 
> solution to this problem, since the proxy has to spool the entire 
> request before submitting anything to the server

No, it should not. Proxies should use the same rules on when chunked is
acceptable as any other client, and refuse to forward the message if
they think it won't work out. That's what 411 and 417 are about:
downgrading the protocol when needed. Remember that proxies act as both
servers and clients and have to fulfill both sets of requirements.
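
A minimal sketch of that decision in a proxy, assuming the proxy already
knows the HTTP version of the next hop. The function name and the
(major, minor) tuple representation are illustrative assumptions.

```python
def proxy_precheck(headers, next_hop_version):
    """Decide whether a request can be forwarded as-is.

    Returns a status code to send back to the client instead of
    forwarding, or None if the request may be forwarded.
    next_hop_version is a (major, minor) tuple such as (1, 0).
    """
    normalized = {name.lower(): value.lower() for name, value in headers.items()}
    if next_hop_version >= (1, 1):
        return None  # the next hop understands chunked and Expect
    if "100-continue" in normalized.get("expect", ""):
        return 417  # Expectation Failed: the next hop cannot honour it
    if "chunked" in normalized.get("transfer-encoding", ""):
        return 411  # Length Required: client should retry with Content-Length
    return None
```

On a 411 the client can retry with a Content-Length, which avoids the
proxy having to spool the whole request itself.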

Regards
Henrik
Received on Tuesday, 8 April 2008 12:13:00 UTC