Re: Making Implicit C-E work. from Amos Jeffries on 2014-04-30 (ietf-http-wg@w3.org from April to June 2014)

From: Amos Jeffries <squid3@treenet.co.nz>
Date: Thu, 01 May 2014 01:45:53 +1200
To: ietf-http-wg@w3.org
Message-ID: <5360FE91.8020704@treenet.co.nz>
On 30/04/2014 10:26 p.m., Roberto Peon wrote:
> On Wed, Apr 30, 2014 at 2:35 AM, Roland Zink <roland@zinks.de> wrote:
> 
>>  On 30.04.2014 08:44, Roberto Peon wrote:
>>
>>>>> As I described it, its use by the originator of an entity is not
>> mandated, instead behaviors are mandated of recipients when it IS used.
>>
>>>  >>>
>>>>> 
>>>>> Yeah, mandating it. Which I'm not happy about.
>>>>
>>>> Mandates support, not use.
>>>>
>>>
>>> Kind of the same thing, from the client's POV. Server's choice.
>>>
>>
>>    For C-E this means for example the server decides if the client can do
>> a seek. Some interactive clients would prefer to do seeks over getting the
>> content compressed. Whereas downloaders would prefer the content to be
>> compressed. T-E would allow to have both seek and compression.
>>
> 
> The server *always* decides what to send, whether c-e or t-e is used. The
> fact that the server *may* use gzip does not *require* it to use gzip, and
> with my proposal, the server knows if the client requested it explicitly or
> not, and it certainly can see if there is a range request and make the
> appropriate response.
> 
> T-E is theoretically wonderful if one ignores real deployments in today's
> world where the majority of HTTP/1.X servers don't actually do
> transfer-encoding: gzip, and thus HTTP2 gateways would have to do c-e to
> t-e translation (which might be rather error prone in its own way) or have
> to bear the expense of doing the compression themselves-- something which
> is untenable. This ignores the security issue of knowing when t-e is
> safe, which I'll address again below.
> 
> 
> 
>>
>>   And today it is often neither the server nor the client's choice, which
>> is what is causing the pain. The client expresses that it wants gzip. The
>> intermediary doesn't do it because it makes numbers better, increases
>> throughput, or because they're too lazy to implement it, all at the cost
>> of the decreased user experience.
>>
>>
>>> <snip>
>>>>
>>>> The combination of intermediaries stripping a-e plus the competitive
>>> driver to deliver good experience/latency is causing interop failures today
>>> where servers will send gzip'd data whether or not the client declares
>>> support in a-e.
>>>>
>>>
>>> Wait, you're saying the whole motivator here is that servers don't comply
>>> with the protocol? So you're changing the protocol to accommodate them?
>>> That does not feel right to me, at all; it's not just blessing a potential
>>> misuse of C-E, it's wallpapering over a flat out abuse.
>>>
>> Partially.
>> I'm saying that intermediaries are doing things which are incenting
>> implementors to break compatibility with the spec, and that implementors
>> are doing so because it makes the users happy.
>> In the end, making the users happy is what matters, both commercially and
>> privately. The users really don't care about purity, and will migrate to
>> implementations that give them good/better user experience.
>>
>>  But even so, why do you have to fix it in HTTP/2? And why does it hurt
>>> h2 to *not* fix it?
>>>
>>
>>  Compression is an important part of making latency decrease/performance
>> increase, and, frankly, there is little practical motivation to deploy
>> HTTP/2 if it doesn't succeed in reducing latency/increasing performance.
>> Success isn't (or shouldn't be) defined as completing a protocol spec, but
>> rather, getting an interoperable protocol deployed. If it doesn't get
>> deployed, the effort is wasted. If it doesn't solve real problems, the
>> effort is wasted.
>>
>>  In any case, I cannot reliably deploy a T-e based compression solution.
>> T-e based compression costs too much CPU, especially as compared with c-e
>> where one simply compresses any static entity once and decompresses (which
>> is cheap) as necessary at the gateway.
>>
>> If it is really T-E you can do the same compression of static entities
>> when the whole file is delivered, it would be different for range requests
>> or the frame based approach.
>>
> 
> Not how we've thus far spec'd it.
> 
> 
>>   T-e based compression isn't as performant in terms of
>> compression/deflation ratios.
>>
>> Don't think this is true, the same bytes can be sent as either T-E or C-E.
>> For the frame based approach some numbers were given.
>>
> 
> The same bytes can't be sent in both, unless we're willing to suffer
> vastly increased DoS surface area and memory usage OR we do the frame-based
> approach, which will have marginally worse compression.
> 
>>   Many deployed clients/servers wouldn't correctly support it.
>>
>> There are no deployed HTTP2 clients or servers, or are there some?
>>
> 
> There are, but I'm not talking about those. My problem is dealing with the
> rest of the world, which is mostly HTTP/1.X and is unlikely to rapidly
> change.
> In other words, I'm concerned mainly with HTTP/1.X clients and especially
> servers.
> 
>>   T-e would require that any gateway acting as a loadbalancer/reverse
>> proxy would either need to know which resources it could compress,  or
>> forces us to not use compression.
>>
>> The gateway can forward the compressed content unmodified. The gateway is
>> only forced to do something if either the server or the client doesn't
>> support compression.
>>
> 
> The gateway cannot know which resources it is safe to compress without
> something outside the protocol. Compression via t-e without knowing whether
> it is safe or not allows attackers to discern ostensibly secret
> information. This is NOT acceptable.
> 
>>   Knowing what resources to compress either requires an oracle, or
>> requires content authors to change how they author content (*really* not
>> likely to happen),
>>
>>    Not sure that authors want to know about compression. If it is
>> automatic then this would be fine. Currently there is server configuration,
>> for example zlib.output_compression in php.ini, and the possibility to do
>> this in the content, for example in PHP something like
>> ob_start('ob_gzhandler'). I guess there is a lot more authors are not aware
>> of.
>>
> 
> We definitely don't want to cause content authors to lose what little
> control (and understanding) they have today, especially over matters
> touching security like compression.
> In general, if a resource wasn't compressed on output from an endpoint, it
> shouldn't be when received by any other endpoint.

This seems wrong. The general case is that a resource which was not
compressed when received by an endpoint should not be compressed when
leaving that *same* endpoint.
Which I understand is what the proposals about C-E:gzip are saying
gateways should do:
  Accept implicit gzip within HTTP/2 so servers can emit it and
decompress for identity-only representations as soon as they get to any
HTTP/1 hop. The 1.1->2.0 transitions should obey the HTTP/1 sender's use
of T-E:gzip (if they attempt it) or retain identity on the new HTTP/2 hop.

Essentially, resources start out compressed but anyone can decompress
and it stays uncompressed for the remainder of the journey.


I see no problem with a clause in the HTTP/2 spec regarding *T-E*
mandating that T-E:gzip can only be removed, never added.
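
A minimal sketch of that "remove only, never add" rule, assuming just two
transfer codings ("gzip" and "identity"). The function names and the
per-hop boolean are hypothetical and not taken from any draft:

```python
from typing import List


def next_hop_coding(incoming: str, next_hop_accepts_gzip: bool) -> str:
    """Pick the transfer coding for the outgoing hop.

    Assumed rule: gzip may be kept or removed, but never added.
    """
    if incoming == "gzip" and next_hop_accepts_gzip:
        return "gzip"      # forward the compressed bytes unmodified
    return "identity"      # decompress if needed; never compress here


def codings_along_path(origin: str, hops_accept_gzip: List[bool]) -> List[str]:
    """Return the coding seen on each hop, starting with the origin's."""
    codings = [origin]
    for accepts in hops_accept_gzip:
        codings.append(next_hop_coding(codings[-1], accepts))
    return codings
```

With an origin emitting gzip and hops that accept, refuse, then accept
gzip again, the coding goes gzip, gzip, identity, identity: once it has
been removed it is not re-added further along the path.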

This whole implicit C-E smells like an attempt to rename HTTP/1 T-E:gzip
as HTTP/2 C-E:gzip and lump all the resulting deployment problems on the
gateway implementers' shoulders.


> Given the necessity of interfacing with HTTP/1 servers, which rarely
> support T-E: gzip, this ends up being a problem for HTTP/2 and T-e: gzip.
> 

No problem there. The HTTP/2 gateway already has mandatory
(de)compression support in all of these proposals and the existing spec
text.

Sending traffic received with T-E:gzip into HTTP/1 is a simple
decompression.
Receiving traffic from HTTP/1 does not require any compression unless
the HTTP/1 endpoint *does* support T-E:gzip, in which case it is
optional to do anything for HTTP/2.
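
The two directions can be sketched roughly as follows, assuming the body
is fully buffered and that the only codings in play are "gzip" and
"identity". The helper names are hypothetical; only Python's standard
gzip module is used:

```python
import gzip
from typing import Tuple


def to_http1(body: bytes, coding: str) -> Tuple[bytes, str]:
    """HTTP/2 -> HTTP/1: the gateway only ever needs to decompress."""
    if coding == "gzip":
        return gzip.decompress(body), "identity"
    return body, "identity"


def from_http1(body: bytes, coding: str) -> Tuple[bytes, str]:
    """HTTP/1 -> HTTP/2: forward whatever the HTTP/1 sender chose.

    The gateway does no compression of its own in this sketch.
    """
    return body, coding
```

Neither function ever calls a compressor, which is the point: the
gateway's CPU cost is limited to decompression.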

==> I would like to point out again that the security worries for T-E
*do not exist* unless the HTTP/2 hop is *adding* compression on its own.


Amos
Received on Wednesday, 30 April 2014 13:46:24 UTC