Re: Making Implicit C-E work. from Roland Zink on 2014-04-30 (ietf-http-wg@w3.org from April to June 2014)

From: Roland Zink <roland@zinks.de>
Date: Wed, 30 Apr 2014 13:45:42 +0200
To: ietf-http-wg@w3.org
Message-ID: <5360E266.4010500@zinks.de>
On 30.04.2014 12:26, Roberto Peon wrote:
>
>
>
> On Wed, Apr 30, 2014 at 2:35 AM, Roland Zink <roland@zinks.de> wrote:
>
>     On 30.04.2014 08:44, Roberto Peon wrote:
>>     >>> As I described it, its use by the originator of an entity is
>>     not mandated, instead behaviors are mandated of recipients when
>>     it IS used.
>>
>>         >>>
>>         >>
>>         >> Yeah, mandating it. Which I'm not happy about.
>>         >
>>         > Mandates support, not use.
>>         >
>>
>>         Kind of the same thing, from the client's POV. Server's choice.
>>
>>
>     For C-E this means for example the server decides if the client
>     can do a seek. Some interactive clients would prefer to do seeks
>     over getting the content compressed. Whereas downloaders would
>     prefer the content to be compressed. T-E would allow both seek and
>     compression.
>
>
> The server *always* decides what to send, whether c-e or t-e is used. 
> The fact that the server *may* use gzip does not *require* it to use 
> gzip, and with my proposal, the server knows if the client requested 
> it explicitly or not, and it certainly can see if there is a range 
> request and make the appropriate response.
>
So when the server receives a range request then it decides based on the 
a-e if the range is for the compressed or the uncompressed content? If 
the a-e includes gzip it uses part of the compressed content and 
otherwise it returns some part of the uncompressed content and doesn't 
compress the answer?
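
To make the question concrete, here is a rough sketch of the behaviour I am asking about. It is only my reading of the proposal, not anything from a draft: the function name serve_range is made up, and it assumes the range applies to the gzip representation whenever a-e lists gzip and to the uncompressed one otherwise.

```python
import gzip


def serve_range(body: bytes, accept_encoding: str, start: int, end: int):
    """Return (headers, payload) for an inclusive byte range.

    Assumption: the range is interpreted against the gzip representation
    when Accept-Encoding lists gzip, and against the identity one otherwise.
    q-values are ignored to keep the sketch short.
    """
    codings = [c.split(";")[0].strip().lower() for c in accept_encoding.split(",")]
    if "gzip" in codings:
        # mtime=0 keeps the compressed bytes stable across calls
        representation = gzip.compress(body, mtime=0)
        headers = {"Content-Encoding": "gzip"}
    else:
        representation = body
        headers = {}
    end = min(end, len(representation) - 1)
    headers["Content-Range"] = f"bytes {start}-{end}/{len(representation)}"
    return headers, representation[start:end + 1]
```

With the same Range header, a request carrying "gzip" in a-e gets a slice of the compressed bytes, while one without it gets a slice of the original bytes.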

> T-E is theoretically wonderful if one ignores real deployments in 
> today's world where the majority of HTTP/1.X servers don't actually do 
> transfer-encoding: gzip, and thus HTTP2 gateways would have to do c-e 
> to t-e translation (which might be rather error prone in its own way) 
> or have to bear the expense of doing the compression themselves-- 
> something which is untenable. This ignores the security issue of 
> knowing when t-e is safe, which I'll address again below.
>
I agree. To avoid the c-e to t-e translation the gateway should do the 
compression. Aren't HTTP2 clients expected to detect if the server is 
HTTP/2 or HTTP/1.1 and do the appropriate thing? For clients only 
talking through a gateway it should be possible to do the compression at 
the gateway.
>
>
>>     And today it is often neither the server nor the client's choice,
>>     which is what is causing the pain. The client expresses that it
>>     wants gzip. The intermediary doesn't do it because it makes
>>     numbers better, increases throughput, or because they're too lazy
>>     to implement it, all at the cost of the decreased user experience.
>>
>>         <snip>
>>         >
>>         > The combination of intermediaries stripping a-e plus the
>>         competitive driver to deliver good experience/latency is
>>         causing interop failures today where servers will send gzip'd
>>         data whether or not the client declares support in a-e.
>>         >
>>
>>         Wait, you're saying the whole motivator here is that servers
>>         don't comply with the protocol? So you're changing the
>>         protocol to accommodate them? That does not feel right to me,
>>         at all; it's not just blessing a potential misuse of C-E,
>>         it's wallpapering over a flat out abuse.
>>
>>     Partially.
>>     I'm saying that intermediaries are doing things which are
>>     incenting implementors to break compatibility with the spec, and
>>     that implementors are doing so because it makes the users happy.
>>     In the end, making the users happy is what matters, both
>>     commercially and privately. The users really don't care about
>>     purity, and will migrate to implementations that give them
>>     good/better user experience.
>>
>>         But even so, why do you have to fix it in HTTP/2? And why
>>         does it hurt h2 to *not* fix it?
>>
>>
>>     Compression is an important part of making latency
>>     decrease/performance increase, and, frankly, there is little
>>     practical motivation to deploy HTTP/2 if it doesn't succeed in
>>     reducing latency/increasing performance.
>>     Success isn't (or shouldn't be) defined as completing a protocol
>>     spec, but rather, getting an interoperable protocol deployed. If
>>     it doesn't get deployed, the effort is wasted. If it doesn't
>>     solve real problems, the effort is wasted.
>>
>>     In any case, I cannot reliably deploy a T-e based compression
>>     solution.
>>     T-e based compression costs too much CPU, especially as compared
>>     with c-e where one simply compresses any static entity once and
>>     decompresses (which is cheap) as necessary at the gateway.
>     If it is really T-E you can do the same compression of static
>     entities when the whole file is delivered, it would be different
>     for range requests or the frame based approach.
>
>
> Not how we've thus far spec'd it.
Even then the server could pre-compute the compression for the frames it 
wants to send and always send the same ones, although you would probably 
need a new tool to do so.
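
As an illustration of what such a tool might do, here is a minimal sketch. It assumes each frame is compressed on its own with zlib and shares no context with its neighbours. The frame size and the function names are invented for the example and are not taken from any draft.

```python
import zlib
from typing import List

FRAME_SIZE = 16 * 1024  # hypothetical frame payload size


def precompress_frames(data: bytes, frame_size: int = FRAME_SIZE) -> List[bytes]:
    """Compress each frame independently, so no context is shared between frames."""
    return [
        zlib.compress(data[i:i + frame_size], 9)
        for i in range(0, len(data), frame_size)
    ]


def decompress_frames(frames: List[bytes]) -> bytes:
    """Inflate each frame on its own and join the results."""
    return b"".join(zlib.decompress(f) for f in frames)
```

The list returned by precompress_frames could be stored next to the static file and replayed for every request. Comparing the sum of the frame lengths with len(zlib.compress(data, 9)) shows how much compression ratio is given up by not sharing context.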

>>     T-e based compression isn't as performant in terms of
>>     compression/deflation ratios.
>     Don't think this is true, the same bytes can be sent as either T-E
>     or C-E. For the frame based approach some numbers were given.
>
>
> The same bytes can't be sent in both, unless we're willing to 
> suffer vastly increased DoS surface area and memory usage OR we do the 
> frame-based approach, which will have marginally worse compression.
>
>>     Many deployed clients/servers wouldn't correctly support it.
>     There are no deployed HTTP2 clients or servers, or are there some?
>
>
> There are, but I'm not talking about those. My problem is dealing with 
> the rest of the world, which is mostly HTTP/1.X and is unlikely to 
> rapidly change.
> In other words, I'm concerned mainly with HTTP/1.X clients and 
> especially servers.
>
>>     T-e would require that any gateway acting as a
>>     loadbalancer/reverse proxy would either need to know which
>>     resources it could compress, or it forces us to not use compression.
>     The gateway can forward the compressed content unmodified. The
>     gateway is only forced to do something if either the server or the
>     client doesn't support compression.
>
>
> The gateway cannot know which resources it is safe to compress without 
> something outside the protocol. Compression via t-e without knowing 
> whether it is safe or not allows attackers to discern ostensibly 
> secret information. This is NOT acceptable.
Is this still true when there is no compression context shared between 
frames and therefore between different sources? If a proxy is just 
forwarding what it gets I can't see a difference between C-E and T-E.

>>     Knowing what resources to compress either requires an oracle, or
>>     requires content authors to change how they author content
>>     (*really* not likely to happen),
>>
>     Not sure that authors want to know about compression. If it is
>     automatic then this would be fine. Currently there is server
>     configuration, for example zlib.output_compression in php.ini, and
>     the possibility to do this in the content, for example in PHP
>     something like ob_start('ob_gzhandler'). I guess there is a lot
>     more that authors are not aware of.
>
>
> We definitely don't want to cause content authors to lose what little 
> control (and understanding) they have today, especially over matters 
> touching security like compression.
> In general, if a resource wasn't compressed on output from an 
> endpoint, it shouldn't be when received by any other endpoint.
> Given the necessity of interfacing with HTTP/1 servers, which rarely 
> support T-E: gzip, this ends up being a problem for HTTP/2 and T-e: gzip.
>
>>         <snip>
>>
>>         > The proxy, when forwarding the server's response to the
>>         HTTP/1 client, must ensure that the data is uncompressed when
>>         forwarding to the HTTP/1 client since the client didn't ask
>>         for c-e gzip.
>>         >
>>
>>         Cache-Control:no-transform explicitly forbids the proxy from
>>         altering the representation. It's not allowed to decompress it.
>>
>>     In fact what we're doing is offering two representations
>>     simultaneously.
>     Don't think this will really work. Only one can be delivered and
>     translating between them (for example range) seems to be difficult.
>
>
> Why is range a problem? If a server wishes to service a range request, 
> it need not compress the output.
So turn off compression just in case the client will do a range request? 
Range requests are a problem because with C-E gzip the range applies to 
the compressed content, and without it the range applies to the 
uncompressed content. The server would need to know which representation 
the request is for in order to decide what to do.
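
A small experiment shows why the two interpretations are not interchangeable. This is only a sketch with made-up sample data. The helper decodable_alone is hypothetical and simply checks whether zlib accepts a chunk as the start of a gzip stream.

```python
import gzip
import zlib

# Made-up sample resource, large enough that its gzip form exceeds 200 bytes
body = b"".join(b"line %d of a seekable resource\n" % i for i in range(1000))
gz = gzip.compress(body, mtime=0)

identity_slice = body[100:200]  # "bytes=100-199" of the uncompressed content
gzip_slice = gz[100:200]        # "bytes=100-199" of the C-E gzip content


def decodable_alone(chunk: bytes) -> bool:
    """True if zlib accepts the chunk as the beginning of a gzip stream."""
    try:
        zlib.decompressobj(wbits=31).decompress(chunk)
        return True
    except zlib.error:
        return False
```

Here decodable_alone(gz[:100]) succeeds because the chunk starts with the gzip header, whereas a slice taken from the middle, such as gzip_slice, is rejected. A client seeking into C-E gzip content therefore cannot use the bytes it gets without the preceding part of the stream.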

> With the proposal, the server will be able to know if the client 
> requested gzip explicitly or not and make the correct decision w.r.t. 
> what to output. This is the same as it is today with HTTP/1.
You mean when it is not explicitly requested it will not send gzip content?
> -=R
>
Received on Wednesday, 30 April 2014 11:46:08 UTC