
Re: Rechartering HTTPbis

From: Adrien de Croy <adrien@qbik.com>
Date: Fri, 27 Jan 2012 09:18:07 +1300
Message-ID: <4F21B4FF.4010608@qbik.com>
To: Willy Tarreau <w@1wt.eu>
CC: Poul-Henning Kamp <phk@phk.freebsd.dk>, Amos Jeffries <squid3@treenet.co.nz>, ietf-http-wg@w3.org


On 27/01/2012 5:06 a.m., Willy Tarreau wrote:
> On Fri, Jan 27, 2012 at 03:28:31AM +1300, Adrien de Croy wrote:
>>
>> On 27/01/2012 3:11 a.m., Willy Tarreau wrote:
>>> On Thu, Jan 26, 2012 at 02:05:56PM +0000, Poul-Henning Kamp wrote:
>>>> In message<20120126140301.GG8887@1wt.eu>, Willy Tarreau writes:
>>>>
>>>>> OK, but in the context I said that, we were talking about chunking.
>>>>> And this is still true. A chunked-encoded transfer that does not end
>>>>> with the last 0-byte chunk *is* always an indication of a truncate.
>>>> Yes, absolutely.  I just wish there were a way to end the transmission
>>>> and say: "Sorry, that went awry, but we can still use this connection.
>>> The only solution I can think of would be to send padding up to the
>>> current chunk size
>> Only if you're mid-chunk when you decide you no longer want to send
>> the chunk you started.
>>
>> Maybe you shouldn't have decided to send it if you weren't ready.
> That's not what I'm saying. Again, any intermediary has many reasons
> to close anywhere. Many do not even know what HTTP is nor what a
> chunk is. Chunks are not atomic. And even intermediaries which talk
> HTTP cannot all buffer all chunks. When you have a 16kB buffer per
> connection, a chunk rarely fits there so you have to transfer as you
> get them.

Sure, and these types of intermediaries can continue to do the same 
thing.  If they want to be 2.0 compliant they can at least recognise 
abort signals on 0 chunks, even if they can't generate them.

But I don't see that as a reason not to add such a feature to the 
protocol.  Others can benefit.
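To make the proposal concrete: RFC 2616 (section 3.6.1) already allows
chunk extensions on any chunk-size line, including the terminating 0
chunk, so a receiver only needs to look at the extensions it already has
to skip past.  A minimal sketch of that check follows; the "abort"
extension name and its value are hypothetical, standing in for whatever
token the WG might adopt:

```python
# Hypothetical wire format for an aborted chunked transfer, assuming an
# "abort" chunk-extension is defined (the extension syntax itself is
# standard RFC 2616; only the "abort" name is an assumption here):
#
#   400\r\n
#   ...1024 bytes of data...\r\n
#   0;abort=content-rejected\r\n
#   \r\n

def parse_last_chunk_line(line: str):
    """Split a chunk-size line into (size, extensions dict)."""
    parts = line.strip().split(";")
    size = int(parts[0], 16)                    # chunk-size is hexadecimal
    exts = {}
    for ext in parts[1:]:
        name, _, value = ext.partition("=")
        exts[name.strip()] = value.strip('" ')  # value may be a quoted-string
    return size, exts

size, exts = parse_last_chunk_line("0;abort=content-rejected\r\n")
aborted = size == 0 and "abort" in exts         # abandon the entity, keep the connection
```

A 1.1 recipient that ignores unknown chunk extensions (as the spec
requires) would simply see a normal end-of-body, which is what makes the
check cheap for those who want it and harmless for those who don't.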


>>> and advertise "0" with some extension to indicate
>>> the wish to reuse the connection. But I really think that the cost of
>>> handling all impacts of a failed connection sensibly offsets the small
>>> expected gain for these rare conditions.
>> Sure the number of reset connections is very small, and you can't
>> advertise anything anyway since you can't send any more.
>>
>> This is only useful when you want to abort, and you can do so on any
>> chunk boundary.  Those that wish to use this can tweak their code to
>> make it effective, those that don't or just want to pass stuff through,
>> that's fine.
>>
>> But checking for attributes on the final 0 chunk seems to me to be a
>> cheap way to get the benefit of abort when you want it.
>>
>> I'd suggest the number of aborted sends due to content would outweigh
>> network errors.
> It depends on the environment. In your products since you have valid
> reasons to abort when matching contents, that's certainly true. But I
> know many other infrastructures where no such filtering happens, and
> the primary reason for an abort is a timeout, the second one is the
> usual process crash in the middle of a processing due to an application
> bug.

This comes back to my other point about who is interested (apart from 
all my customers) in scanning at an intermediary.

>
>> But anyway, it's a basic principle, if you make a decision that affects
>> another party, you should communicate it.  If you can't you can't, but
>> you shouldn't say "Because I can't ALWAYS communicate it, I will choose
>> instead to NEVER do it".  That's sociopathic :)
> I agree with this point of view. As I said, what I'm against is making
> it harder to support the normal case just to favor better error recovery
> for the fatal cases.
The abort signal isn't just so you can re-use the connection, although 
that is an added benefit.  In our case the primary benefit is being able 
to signal the receiver that the entity should be abandoned for some 
reason other than a transient network failure, e.g. "don't try again".


> For instance, re-opening a connection is cheap. OK
> it's a round-trip, but if it happens less than 1/10000 times it's probably
> better than padding megabytes of chunks or making parsers more complex.

I don't think you'd need to pad chunks.  Any intermediary that is 
scanning will be passing data through some filtering layers.

It's inconceivable that such data would not be de-chunked prior to being 
passed through the filters (else all filters would need to handle chunks).

Therefore it's likely the data would need to be re-chunked at the other 
end of the filter chain.

Adrien

>
> Willy
>

-- 
Adrien de Croy - WinGate Proxy Server - http://www.wingate.com
Received on Thursday, 26 January 2012 20:20:04 GMT
