- From: Adrien de Croy <adrien@qbik.com>
- Date: Tue, 08 Apr 2008 22:41:06 +1200
- To: Henrik Nordstrom <henrik@henriknordstrom.net>
- CC: HTTP Working Group <ietf-http-wg@w3.org>
Henrik Nordstrom wrote:
> Tue 2008-04-08 at 13:44 +1200, Adrien de Croy wrote:
>
>
>> Seems to me that getting around message-length / connection-maintenance
>> issues by using chunking, so you can send a premature final chunk
>> instead of terminating the connection, is asking for a whole lot of
>> pain. It's an ugly hack.
>>
>
> I disagree.
>
>
>> For starters, there's then no way to tell the recipient that the
>> completion was an abortive one.
>>
>
> It's the recipient who asked for it to be aborted by sending an error
> instead of 100 continue.
>
I guess that's my point. If you look at many proxies or servers, where
you have installable filter modules etc., then a lot of processes can
exist inside this one "recipient who asked for it", and they may not
necessarily know that a module upstream of them rejected the message.
I think it's necessary, if we are to try and keep connections alive, to
be able to signal an abortive end to a message-body transfer without
closing the connection. A chunk extension for this, as you propose
below, sounds like a good way to do it.
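Just to make that concrete - a rough sketch in Python, entirely my own
naming and not a proposal for exact syntax, of a sender that uses the
"0; aborted" style last-chunk you suggest below instead of dropping the
connection:

    # Sketch only: emit a chunked body, and on abort send a final chunk
    # carrying a hypothetical "aborted" chunk extension rather than
    # closing the connection. The extension name is illustrative.

    def send_chunk(sock, data: bytes) -> None:
        # each chunk: <hex length> CRLF <data> CRLF
        sock.sendall(b"%x\r\n" % len(data) + data + b"\r\n")

    def finish(sock, aborted: bool = False) -> None:
        if aborted:
            sock.sendall(b"0; aborted\r\n\r\n")   # abortive last-chunk
        else:
            sock.sendall(b"0\r\n\r\n")            # normal last-chunk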
>> If there are several processes in an
>> intermediary or end server that the data goes through before getting to
>> the module that caused the client to abort, then you've got all manner
>> of things that may happen to that data, which appears complete.
>>
>
>
>> That's where it would have been useful to support notifying an abortive
>> end (e.g. the previously discussed negative-1 chunk length) to a
>> transfer. Abort without closing.
>>
>
> Then propose a chunk extension for that purpose.. I.e. something like
>
> 0; aborted
>
>
>> All in all I think using chunked uploads is bad for many reasons, apart
>> from the fact that it's probably poorly supported even in existing
>> HTTP/1.1 infrastructure. Using it so you can stop sending a body is
>> even worse.
>>
>
> I obviously don't share that opinion.
>
>
evidently :)
Actually I think there are some places where chunked uploads are
appropriate, but my personal opinion is that, if at all possible, the
length should be specified in Content-Length. Unless a client is
streaming on-the-fly generated content in an upload (content that
couldn't be spooled first), it should always send a length - just as I
think in SMTP all sending agents should indicate a SIZE in the MAIL
FROM command. It's simply the most efficient way to implement
size-based policy.
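(To illustrate what I mean by size-based policy - a minimal sketch,
with a made-up limit and function name, assuming the size is declared
up front in Content-Length or in SMTP's SIZE parameter:)

    # With a declared size the policy decision costs nothing; with a
    # chunked body of unknown total length it can only be made after
    # the bytes have already been transferred.

    MAX_POST_BYTES = 10 * 1024 * 1024        # example site limit

    def upload_allowed(declared_length) -> bool:
        if declared_length is None:
            return True                      # unknown length: can't enforce up front
        return int(declared_length) <= MAX_POST_BYTES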
>> As I said, it's possible ("a client can") to have a connection followed
>> by initiation of a large transfer without any notification of acceptance
>> by the thing that will have to swallow the data or reject and disconnect.
>>
>
> Yes.
>
> Same is also true for request headers, or URL lengths..
>
Sure, but not many servers would accept request headers + URL over
about 32 KB; I think the default limit for IIS is 16 KB. I'd be
surprised if any accept even 1 MB. It doesn't compare to a large
upload.
>> If you want proof of the problem, try uploading a 100MB file across a
>> slow WAN through a proxy that requires auth to a poorly-connected server
>> that requires auth (a fairly common scenario actually). Basically
>> impossible because depending on the auth method, that 100MB may have to
>> be sent up to 6 times.
>>
>
> Well, ignoring connection oriented auth the client SHOULD either use
> chunked encoding or close the connection, aborting the request when
> seeing the challenge.
Right - but that only works with HTTP Basic auth and Digest. NTLM
users are out in the cold, because their auth method requires a
persistent connection. It's a lot easier to write "ignoring connection
oriented" than it is to ignore all those customers.
It doesn't help either that many browsers don't support receiving any
data whilst trying to send a resource, so they won't stop sending even
if you do send back a 4xx - which, I guess, contravenes the spec
reference you give below. I understand the market-leading browser does
this.
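(As an aside, the behaviour the spec asks for isn't hard to sketch -
something along these lines, rough Python over a plain blocking socket,
purely illustrative:)

    import select

    # Outline only: send the request body in pieces, but check between
    # pieces whether the server has already answered (e.g. 401/407/413)
    # and stop sending if it has - the "Monitoring Connections for Error
    # Status Messages" behaviour that many clients skip.

    def send_body(sock, body: bytes, piece: int = 8192) -> bool:
        for offset in range(0, len(body), piece):
            readable, _, _ = select.select([sock], [], [], 0)
            if readable:
                return False    # an early response arrived; stop uploading
            sock.sendall(body[offset:offset + piece])
        return True             # whole body sent with no early response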
> Clients SHOULD NOT send the whole body after
> receiving an error. This is already spelled out verbatim in the specs
> under "Monitoring Connections for Error Status Messages".
>
> This got screwed up the day someone decided that connection oriented
> authentication is a good thing in a message oriented protocol where each
> message is supposed to be self-contained... I won't even comment on what
> disaster that is to the process.
I guess that's what sets HTTP apart from pretty much all other
transfer protocols (SMTP, FTP, NNTP etc). By optimising the protocol so
heavily to minimise round-trips, we sacrificed certain protocol
pleasantries that dealt with these sorts of situations. When we then
come up against the issues, we have to specify heuristics to get around
them and to deal with backward compatibility. It makes me wonder what
HTTP will look like in 30 years.
> The use of chunked encoding helps
> avoid some of the disaster, thankfully.
>
>
>> Try an analogy - e.g. a fertiliser delivery company.
>>
>> the HTTP way:
>>
>> * The truck turns up and starts dumping fertiliser in your driveway
>> * you come out and scream at it.
>> * it stops
>> * you clean up
>> * maybe it comes back and dumps some more fertiliser in your driveway
>> soon after
>>
>
> I see where you are coming from, but you have the roles the wrong way
> around for it to match HTTP. A more proper analogy would be a person
> trying to place an order, or someone wanting to send a package at the
> post office.
>
> Example based on the order. Note that there is a twist here in that if
> you hand over the order form and there is a problem you can't get it
> back and have to fill out a new copy.
>
>
I guess I was trying to convey the work involved and the resources
expended by all parties (and therefore wasted) in even fronting up
with the request.
I like the sending-a-package-at-the-post-office analogy, but try it
with 2000 kg of sand when you live on an island, and see what the
logistics are like. Especially when the small local post office wants
to see some ID, which you have to go back home for, and you then front
up with another 2000 kg of sand when you come back with your ID, only
to be told they don't accept packages that big.
Or worse still, they refer you to the central post office on the
mainland, which is only accessible by a small rope swing bridge over
which you have to carry the 2000 kg in small bags. Then they send you
home for ID as well.
And even with Digest auth and two entities wanting auth, you are
looking at transferring the body 4 times, or dropping connections. I
guess that's the beauty of Digest - you don't need to maintain the
connection.
But that really doesn't help the many people who are stuck with NTLM
for any number of reasons.
And 100-continue is just like turning up at the post office with the
sand on the truck, with the removal guys you are paying $40/hr, and if
someone doesn't come out of the post office within 2 seconds you start
unloading. And once you start unloading, you don't stop until it's all
unloaded (which is what many browsers do). 100 Continue + Expect is the
same, except that you toot the horn outside before waiting 2 seconds
and then starting to unload.
A normal person would ring the post office first :)
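(In protocol terms the horn-tooting looks roughly like this - a sketch
only, raw sockets, host/path/body are placeholders, and the 2-second
wait is the analogy's number rather than anything the spec mandates:)

    import socket

    def expect_continue_post(host: str, path: str, body: bytes, wait: float = 2.0) -> None:
        sock = socket.create_connection((host, 80))
        sock.sendall((
            f"POST {path} HTTP/1.1\r\n"
            f"Host: {host}\r\n"
            f"Content-Length: {len(body)}\r\n"
            f"Expect: 100-continue\r\n\r\n"
        ).encode())
        sock.settimeout(wait)
        try:
            first = sock.recv(4096)          # interim 100, or an early final status
        except socket.timeout:
            first = b""                      # nothing within the wait: unload anyway
        if not first or first.startswith(b"HTTP/1.1 100"):
            sock.sendall(body)               # proceed with the upload
        # ...then read the final response and close
        sock.close()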
So what comes out of all of this?
a) agents should use Digest instead of NTLM
b) agents should notice rejections whilst they are sending (already
specified, but not universally observed)
c) on rejection during an upload, agents should disconnect and retry
with credentials etc. I can't really endorse using chunked uploads
until there is a method to signal an abortive end to a transfer, and
also to signal the size in advance, if known, for policy purposes.
Which leaves us in the following predicament:
i) what to do about the zillions of people reliant on NTLM - it seems
the proposal is basically to ignore them? Or, worse still, to try and
educate them about using Digest :)
ii) what to do about chunked uploads to fill in the gaps (reporting
size and signalling an abortive end without closing).
The size issue is a big problem for many sites if we move to chunked
uploads. Pretty much every site has a limit on the maximum size of POST
data, and having to receive it all in order to decide it's too big is
incredibly wasteful.
Many people around the world still have to pay for data per MB, and/or
are on slow connections.
iii) praying that, as a proxy, we never get large chunked requests to
process for upstream HTTP/1.0 agents. In many cases there is no viable
solution to this problem, since the proxy has to spool the entire
request before submitting anything to the server (see the sketch below)
- it can't send anything through without a Content-Length. Actually,
for this reason and ii) above, I'd be a keen fan of using
Content-Length AND chunking for client message bodies (even though it's
utterly prohibited in the spec). At least then the proxy could stream
stuff through to the HTTP/1.0 next hop as it received it (after of
course first establishing that the next hop actually IS HTTP/1.0 by
having its first request bounced).
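(To illustrate the spooling problem - a minimal sketch in Python, my
own code rather than anything WinGate actually does:)

    import io

    # The proxy must de-chunk and buffer the whole body just to learn the
    # Content-Length it has to put on the forwarded request, so nothing
    # can be streamed through as it arrives.

    def spool_chunked_body(rfile) -> bytes:
        """rfile: binary file-like object positioned at the chunked body."""
        spool = io.BytesIO()
        while True:
            size_line = rfile.readline()      # "<hex-size>[; extensions]\r\n"
            size = int(size_line.split(b";", 1)[0].strip(), 16)
            if size == 0:
                break                         # last-chunk; trailers follow
            spool.write(rfile.read(size))
            rfile.read(2)                     # CRLF after chunk-data
        while rfile.readline() not in (b"\r\n", b""):
            pass                              # skip any trailer headers
        return spool.getvalue()               # len() of this -> Content-Length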
Regards
Adrien
> - Person fills out his order form and goes to the counter where the
> order is supposed to be placed.
> - If the customer is polite he asks if he may place the order before
> handing over the order form, or if in a hurry he hands over the order
> form immediately hoping for the best.
> - Gets told that he needs to have some proof of his customer number in
> order to place the order.
> - Goes away to fetch his customer card proving his identity in the
> process.
> - Back at the counter again repeating the process
>
> Or another version where everything is in order
>
> - Person fills out his order form and goes to the counter where the
> order is supposed to be placed.
> - The customer is polite and asks if he may place the order before
> handing over the order form
> - The processing agent is currently busy processing another order and
> doesn't answer immediately.
> - The person hands over the order form when the processing agent says
> it's ready, or if he suspects the processing agent or the communication
> with it is old-fashioned and outdated and doesn't indicate readiness.
>
> Regards
> Henrik
>
>
>
--
Adrien de Croy - WinGate Proxy Server - http://www.wingate.com