- From: James M Snell <jasnell@gmail.com>
- Date: Tue, 19 Mar 2013 17:27:56 -0700
- To: "ietf-http-wg@w3.org" <ietf-http-wg@w3.org>, Roberto Peon <grmocg@gmail.com>
- Message-ID: <CABP7Rbfs9wRqs3ebyPvnpiyW5-jsDxJjyS-D1UOixeRpAPEvtQ@mail.gmail.com>
Just finally getting a chance to read through the updated draft... this is definitely looking good so far. I'll likely have additional feedback later on, but two items definitely caught my eye.

First, within the compression algorithm you show the call to ComputeTrangsFromRawToggls, but it does not appear that you describe what exactly is going on in that step beyond a single comment. You should likely expand that to say that the function reduces multiple numerically adjacent toggls into trangs in an *attempt* to reduce the actual number of bits serialized on the wire and to simplify processing. I say "attempt" because there is a notable (albeit minor, in the big picture) exception and optimization that is also worth mentioning: suppose you have N adjacent toggls with a total serialized length of len(N) bytes, and that those same toggls can be represented by M trang operations with a serialized length of len(M). If len(N) <= len(M), the ComputeTrangsFromRawToggls procedure can and should be skipped. A quick pseudocode comment ought to be enough to call this out.

The second item has to do with the reindexing of the LRU at the end of the ParseAndExecuteHeaderBlock procedure. It would be good to explain in more detail what exactly is happening there and why, especially given that it has a fairly significant impact on overall performance. As we've discussed offline, I've been experimenting with a couple of alternative ways of handling the reindexing. I have yet to settle on any one particular method, but the one I've played with the most uses two internal tables: one holding the key-value pairs themselves, and a second that maps external IDs to the internal storage keys. That is:

  Pair<K,V>[] store;
  Integer[] index;

Where each value i in index is a pointer to a location in store.
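[Editor's note: the two-table arrangement described here might be sketched roughly as below. The Entry type, the per-entry frequency counter, and the threshold parameter are illustrative assumptions for discussion, not anything defined in the draft.]

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Sketch of the store/index split: store holds each key-value pair
// exactly once; index maps external IDs (positions within index) to
// slots in store, and only index is rewritten during reindexing.
final class IndexedStore {
    static final class Entry {
        final String key;
        final String value;
        long frequency; // uses over the lifetime of the store (assumed)
        Entry(String key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    final List<Entry> store = new ArrayList<>();   // Pair<K,V>[] store
    final List<Integer> index = new ArrayList<>(); // Integer[] index

    // Add a new pair and return its external ID (its position in index).
    int put(String key, String value) {
        store.add(new Entry(key, value));
        index.add(store.size() - 1);
        return index.size() - 1;
    }

    // Resolve an external ID through index into store, counting the use.
    Entry get(int externalId) {
        Entry e = store.get(index.get(externalId));
        e.frequency++;
        return e;
    }

    // Recalculate index: drop IDs whose entry's frequency has fallen
    // below the (assumed) threshold, then sort the surviving IDs by
    // descending frequency. The key-value pairs in store never move.
    void reindex(long threshold) {
        index.removeIf(slot -> store.get(slot).frequency < threshold);
        index.sort(Comparator.comparingLong(
                (Integer slot) -> store.get(slot).frequency).reversed());
    }
}
```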
After each compression/decompression, the order of index is recalculated by sorting the keys of store by frequency of use over the lifetime of the store, dropping all items whose frequency falls below a given threshold. In my current implementation, the sorting is done with a simple quicksort. Since most of the index items will already be sorted, this runs very quickly and does not require us to touch the actual key-value pairs at all. With proper optimization, we ought to be able to implement it so that len(index) is significantly less than len(store). When reindexing is complete, the changed items are noted and the header group is updated accordingly.

Obviously there's a tradeoff here in performance vs. used storage space. In your current scheme, it will be common for key-value pairs to be duplicated within the store -- that is, once ParseAndExecuteHeaderBlock is run, all of the current key-values that are used are re-added to the store, regardless of whether they already exist. Any already-existing items do not disappear until they happen to drop off the end. In my revised scheme, stored items in the main LRU cache are not duplicated; they just get reindexed with a new external ID.

Now, I'm a long way off from settling on any particular scheme, and I'll readily admit that there are issues with the method I describe here, so for now just take this all as fodder for discussion.

Anyway, that's it for now. More feedback likely next week once I free up some time to continue working on the Java implementation.

- James
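[Editor's note: the skip heuristic from the first item above can be sketched roughly as follows. The per-operation byte costs are illustrative placeholders; a real encoder would count the variable-length integers it actually emits on the wire.]

```java
import java.util.ArrayList;
import java.util.List;

// Rough sketch of the len(N) vs. len(M) comparison, not the draft's
// actual encoder. TOGGL_COST and TRANG_COST are assumed costs.
final class TrangHeuristic {
    static final int TOGGL_COST = 1; // assumed serialized bytes per toggl
    static final int TRANG_COST = 2; // assumed serialized bytes per trang

    // Coalesce sorted toggl indices into [start, end] runs of
    // numerically adjacent IDs (the trang operations).
    static List<int[]> coalesce(List<Integer> toggls) {
        List<int[]> runs = new ArrayList<>();
        for (int id : toggls) {
            if (!runs.isEmpty() && runs.get(runs.size() - 1)[1] == id - 1) {
                runs.get(runs.size() - 1)[1] = id; // extend the current run
            } else {
                runs.add(new int[] { id, id });    // start a new run
            }
        }
        return runs;
    }

    // If len(N) <= len(M), ComputeTrangsFromRawToggls can and should
    // be skipped and the raw toggls emitted as-is.
    static boolean shouldSkipTrangs(List<Integer> toggls) {
        int lenN = toggls.size() * TOGGL_COST;           // raw toggls
        int lenM = coalesce(toggls).size() * TRANG_COST; // trang ops
        return lenN <= lenM;
    }
}
```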
Received on Wednesday, 20 March 2013 00:28:47 UTC