- From: James M Snell <jasnell@gmail.com>
- Date: Wed, 3 Oct 2012 14:49:19 -0700
- To: Roberto Peon <grmocg@gmail.com>
- Cc: Amos Jeffries <squid3@treenet.co.nz>, ietf-http-wg@w3.org
- Message-ID: <CABP7RbeGmyYy7rvVhjOymrcUFhns5Nf4D9ahyP0-NqFcZiEDew@mail.gmail.com>
On Wed, Oct 3, 2012 at 2:42 PM, Roberto Peon <grmocg@gmail.com> wrote:

> CRIME works by observing the size of the resultant packet stream.
> Thus, if the plaintext is ever compressed within the same stream context
> as user-controlled plaintext, then they can learn something about what is
> going on, regardless of output salting, encryption, etc.
>

Ok, got it..

> With the compression that I'm proposing, you only learn something when
> you've guessed the entire plaintext for that field, verbatim, at which
> point you're just as well off by sending the data to the server directly.
> I'll be writing it up shortly.
>

Will definitely be looking forward to seeing that. I'd like to explore
whether the new mechanism is going to be efficiently compatible with
bohe-like tokenization to see if it still makes sense to head down that
path.

- James

> -=R
>
> On Wed, Oct 3, 2012 at 1:19 PM, James M Snell <jasnell@gmail.com> wrote:
>
>> On Wed, Oct 3, 2012 at 12:15 AM, Roberto Peon <grmocg@gmail.com> wrote:
>>
>>> [snip]
>>> Yep-- what I've been doing is whole-key or whole-value delta-encoding
>>> with static huffman coding, with an LRU of key-value pairs. A set of
>>> headers is thus simply a set of references to the items in the LRU.
>>> The set of operations is:
>>>
>>> - add a new key-value line into the LRU by specifying a new key-value;
>>>   this looks like: {opcode: KVStore, string key, string val}.
>>> - add a new key-value line into the LRU by referencing a previous
>>>   key-value, copying the key from it and adding the specified new value;
>>>   this looks like: {opcode: Mutate, int lru_index, string val}.
>>> - toggle visibility for a particular LRU entry for a particular header
>>>   set; this looks like: {opcode: Toggle, int lru_index}.
>>> - toggle visibility for a contiguous range of LRU entries for a
>>>   particular header set; this looks like:
>>>   {opcode: Toggle, int lru_index_start, int lru_index_end}.
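[Editorial sketch] The operation set quoted above can be made concrete with a minimal Python model. The opcode names follow the message; the wire format, the visibility set, and the eviction-free list standing in for a real LRU are all invented for illustration:

```python
# Hypothetical sketch of the KVStore/Mutate/Toggle operation set; only
# the opcode names come from the thread, everything else is invented.

KVSTORE, MUTATE, TOGGLE, TOGGLE_RANGE = range(4)

class HeaderState:
    """Shared compressor/decompressor state: stored key-value pairs plus
    a visibility set naming which entries form the current header set."""
    def __init__(self):
        self.entries = []     # index -> (key, value); a real LRU would evict
        self.visible = set()  # indices making up the current header set

    def apply(self, op):
        code = op[0]
        if code == KVSTORE:          # (KVSTORE, key, val): store a new pair
            self.entries.append((op[1], op[2]))
            self.visible.add(len(self.entries) - 1)
        elif code == MUTATE:         # (MUTATE, lru_index, val): copy the key
            key = self.entries[op[1]][0]
            self.entries.append((key, op[2]))
            self.visible.add(len(self.entries) - 1)
        elif code == TOGGLE:         # (TOGGLE, lru_index): flip one entry
            self.visible ^= {op[1]}
        elif code == TOGGLE_RANGE:   # (TOGGLE_RANGE, start, end) inclusive
            self.visible ^= set(range(op[1], op[2] + 1))

    def header_set(self):
        return {self.entries[i][0]: self.entries[i][1]
                for i in sorted(self.visible)}

state = HeaderState()
for op in [(KVSTORE, ":method", "GET"),
           (KVSTORE, ":path", "/index.html"),
           (KVSTORE, "user-agent", "demo")]:
    state.apply(op)
# A follow-up request differing only in :path costs two small ops,
# not a full re-send of every header.
state.apply((TOGGLE, 1))                # hide the old :path entry
state.apply((MUTATE, 1, "/style.css"))  # reuse its key with a new value
print(state.header_set())
```

This shows the point of the design: a header set is just references into shared state, so successive requests are encoded as small deltas.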
>>> Note that the actual format of the operations isn't exactly like what
>>> I'm describing above-- I'm just trying to indicate generally what is
>>> involved.
>>>
>>
>> It would definitely be helpful to have a descriptive write-up on this,
>> perhaps submitted as an I-D, that we can review.
>>
>> Putting aside, for a moment, the contentious and controversial history
>> of discussions around websocket... could we not address the CRIME issue
>> by randomly salting and masking individual frames within the stream?
>> Yes, there is an obvious negative impact on deflate encoding, but if we
>> utilize tokenization (a la my bohe draft) then we would achieve a
>> significant level of compression naturally through the encoding. I have
>> not yet fully tested it, but the combination of that, the randomized
>> salting, and the TLS encryption should not be subject to CRIME-type
>> attacks. Just a thought.
>>
>> - James
>>
>>> The resulting compression is a bit worse than gzip (with large window
>>> size) on my current test corpus, but compares pretty well with gzip in
>>> the Chrome implementation of SPDY.
>>> It has CPU advantages in that the huffman encoding is static, thus for
>>> proxies there is no re-encoding necessary. Additionally, much or all of
>>> the decompressor state can be shared with a compressor (if proxying,
>>> for instance).
>>> Finally, I expect (though I've yet to prove it, as I'm still doing the
>>> C++ implementation) that the compression is more CPU efficient than
>>> gzip. Decompression should be similar... but much of the time you need
>>> not reconstitute an entire set of headers-- instead, since we're
>>> sending deltas anyway, you simply amend your state based on what
>>> changed and thus become more efficient there as well.
>>>
>>> If clients/servers were a bit more naive in terms of when they
>>> added/removed headers, the delta-coding would be more efficient and
>>> it'd approach or exceed gzip compression..
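[Editorial sketch] The size side channel Roberto describes at the top of the thread is easy to reproduce with a toy deflate context. This illustrates the general CRIME mechanism, not any specific SPDY framing; the header strings are made up:

```python
# Toy model of the CRIME length oracle: when attacker-controlled text is
# deflate-compressed in the same context as a secret, the compressed
# output is measurably smaller when the guess matches the secret.
import zlib

SECRET = "Cookie: session=s3cr3t"

def observed_len(attacker_text):
    # Attacker text and the secret share one deflate context, as they
    # would in a shared gzip/deflate header-compression stream.
    return len(zlib.compress((attacker_text + "\n" + SECRET).encode()))

right = observed_len("Cookie: session=s3cr3t")  # verbatim guess
wrong = observed_len("Cookie: session=zzzzzz")  # wrong guess
print(right, wrong)  # the verbatim guess yields the smaller frame
```

Random per-frame padding (the salting James suggests) blurs exactly this size signal, at the cost of extra bytes on the wire.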
>>> at least I think so :)
>>> As far as I (or thus far anyone with whom I've spoken) can tell, the
>>> approach here does not allow probing of the compression context, and is
>>> thus robust in the face of known attacks.
>>>
>>> Anyway, that is what I've been working on.
>>> -=R
>>>
>>>>> Following that, I suspect it'll be most useful to work on the upgrade
>>>>> mechanism (which will also help with #1 above). Patrick sent out what
>>>>> I think most people agree is a good starting point for that
>>>>> discussion here:
>>>>> <http://www.w3.org/mid/1345470312.2877.55.camel@ds9>.
>>>>>
>>>>> We'll start these discussions soon, using the Atlanta meeting as a
>>>>> checkpoint for the work. If it's going well by then (i.e., we have a
>>>>> good set of issues and some healthy discussion, ideally with some
>>>>> data starting to emerge), I'd expect us to schedule an interim
>>>>> meeting sometime early next year, to have more substantial
>>>>> discussion.
>>>>>
>>>>> More details to follow. Thanks to everybody for helping get us this
>>>>> far, as well as to Martin, Alexey and Julian for volunteering their
>>>>> time.
>>>>>
>>>>> Regards,
>>>>>
>>>>> --
>>>>> Mark Nottingham
>>>>> http://www.mnot.net/
>>>>
>>>> AYJ
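[Editorial sketch] Roberto's no-probing claim follows from the whole-key/whole-value granularity: an entry is either a verbatim LRU hit (a tiny index reference) or a full literal, with no partial-match encoding in between. A toy cost model, with byte costs invented purely for illustration, shows why a near-miss guess is indistinguishable from any other miss:

```python
# Hypothetical cost model: only a verbatim (key, value) hit changes the
# encoded size, so partial guesses leak nothing. Byte costs are made up.
def encoded_size(lru, key, value):
    if (key, value) in lru:
        return 2                       # opcode + small LRU index
    return 3 + len(key) + len(value)   # opcode + length-prefixed literals

lru = {("cookie", "session=s3cr3t")}
near_miss = encoded_size(lru, "cookie", "session=s3cr3X")  # 13 chars right
far_miss = encoded_size(lru, "cookie", "XXXXXXXXXXXXXX")   # nothing right
print(near_miss, far_miss)  # both misses cost the same on the wire
```

Contrast this with the deflate oracle above: deflate rewards every extra matching byte, while whole-value referencing only rewards a guess that is already the complete plaintext.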
Received on Wednesday, 3 October 2012 21:50:08 UTC