Re: New Version Notification for draft-vkrasnov-h2-compression-dictionaries-01.txt

> On 2 Nov 2016, at 19:55, Matthew Kerwin <matthew@kerwin.net.au> wrote:
> 
> Just chiming in without necessarily attaching to a particular thread of discussion: I'm quite probably being thick here, but isn't there a problem (of the abstraction/encapsulation flavour) with making a content-encoding dependent on values sent at the transport layer? I think I'm just reiterating what Martin was saying, but in a more vague and incoherent way.
> 
> If we're discussing compression parameters/algorithms/dictionaries/etc. at the transport layer, shouldn't the entirety of the compression happen at the transport layer? Thus making it like HTTP/2's new version of TE.
> 
> And if so, isn't transport layer compression a Bad Thing™? Because – thanks to the wonder of abstraction – the transport machinery doesn't (necessarily) know the provenance of the bytes it's compressing (thus potentially allowing sensitive and attacker-controlled data to be compressed in the same context – i.e. BREACH.)

Indeed, dumb compression at the transport layer is not my aim here.
A) It is indeed less safe
B) The compression benefits are much smaller when the protocol is unaware of the type of data being transported

Again, I am looking at this from the PoV of the nginx architecture that we use: there is no clear distinction between the layers there, and introspection into the application level is easy to do. And from what I have seen in Apache, it is not that difficult either.

Certainly Server Push is a form of application/protocol level fusion.

Another approach could be client hints. Since we already have client hints in the form of priorities, we cannot deny that HTTP/2 is somewhat connected to the application level too.

> So we bounce it up the stack to the application, which has a much better chance of knowing who authored what bytes. And thus we end up back at content-encoding.

Doing it at the application level is also not as good: because streams can get cancelled and reprioritized, you are in danger of fatal failures (such as deadlocks) if you try to control and optimize the process in the application.

> 
> If it's tied to content-encoding, it should be *entirely* contained in the semantic layer – headers and payload entities. Isn't that what SDCH is?

This is indeed not unlike SDCH and “quasi dictionaries”, only your dictionary is defined by a stream and not a different URL. In fact it can be used with SDCH just as well.
My proposal tries to be as algorithm agnostic as possible.
However brotli compresses much better in that case. In fact, from what I have seen, brotli+“quasi” beats sdch+“quasi”+brotli (but maybe sdch+“quasi”+brotli+“quasi” will do even better?).
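To make the “stream as dictionary” idea concrete, here is a minimal sketch using Python's standard-library zlib and its `zdict` parameter (brotli's custom-dictionary APIs vary across bindings, so zlib stands in for the algorithm-agnostic case). The payloads are made up; the point is only that priming the compressor with an earlier, similar response shrinks the output for the next one.

```python
import zlib

# An earlier response on one stream acts as the "quasi dictionary".
previous_response = b'{"user":"alice","items":["a","b","c"],"status":"ok","ts":1478000000}'
# A later, similar response delivered on another stream.
new_response = b'{"user":"bob","items":["a","b","c"],"status":"ok","ts":1478000100}'

# Baseline: compress the new response with no shared context.
plain = zlib.compress(new_response)

# Dictionary-primed: seed the compressor with the earlier response,
# so long matches can reference bytes the client already holds.
comp = zlib.compressobj(zdict=previous_response)
primed = comp.compress(new_response) + comp.flush()

# The receiver must prime its decompressor with the same dictionary.
decomp = zlib.decompressobj(zdict=previous_response)
assert decomp.decompress(primed) == new_response

print(len(plain), len(primed))  # primed output is smaller
```

The same shape applies to any dictionary-capable coder; only the quality of the matches (and hence the gain) changes between deflate, SDCH-style VCDIFF, and brotli.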

The point is you just can’t get that level of control purely at the application level or the protocol level. We should try and find a middle ground.

> 
> If it's pushed down to the transport layer, isn't it just an even less safe version of draft-kerwin-http2-encoded-data? (I said no shared compression context between different frames, this is about sharing contexts between completely different streams!)

Again, blindly compressing everything at the transport layer is not my suggestion. However, even that is OK for the majority of websites; I for one am not aware of a major effort to mitigate BREACH.
However, the methods used to mitigate BREACH are valid for this proposal as well. In fact the greatest danger here is making BREACH even faster, but isn’t the point moot when you can already execute it in 30 seconds?
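For readers less familiar with the attack being discussed: BREACH works because compressing secret and attacker-controlled bytes in one context turns the compressed length into an oracle. A toy sketch (the secret, page shape, and guesses are all made up, and real attacks probe statistically over many requests):

```python
import zlib

# A secret embedded in the response body.
secret = b"token=s3cr3t"

def response_len(guess: bytes) -> int:
    # The attacker-controlled reflection is compressed in the same
    # context as the secret -- the core mistake BREACH exploits.
    body = b"<html>" + secret + b" echo=" + guess + b"</html>"
    return len(zlib.compress(body))

# A guess sharing a longer prefix with the secret yields longer
# back-references, hence a shorter compressed response.
wrong = response_len(b"token=wxyz")
right = response_len(b"token=s3cr")
print(wrong, right)  # right guess compresses smaller
```

Mitigations (masking secrets per request, isolating compression contexts, length padding) apply unchanged whether the shared context lives in content-encoding or in a cross-stream dictionary.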

> 
> I'm not entirely sure what new thing this particular proposal brings to the table.

Improved compression? 

Cheers,
Vlad

Received on Thursday, 3 November 2016 05:44:33 UTC