- From: Mike Bishop <Michael.Bishop@microsoft.com>
- Date: Tue, 24 Sep 2013 06:40:01 +0000
- To: Gábor Molnár <gabor.molnar@sch.bme.hu>, Roberto Peon <grmocg@gmail.com>
- CC: Tatsuhiro Tsujikawa <tatsuhiro.t@gmail.com>, "ietf-http-wg@w3.org" <ietf-http-wg@w3.org>
- Message-ID: <dbada1375ed6408389b90c87f061acb6@BY2PR03MB025.namprd03.prod.outlook.com>
I actually had written a compressor with foreknowledge to run against the data set that was sent to the list before the last meeting, specifically for the purpose of giving an "ideal" against which to measure real-world algorithms. Today I did some refactoring and added a switch that blocks it from ever doing a substitution, always sending it down the path where it didn't find a suitable entry to replace.

The results suggest that the gain from being able to substitute is infinitesimal. Given perfect knowledge of future headers, here's what I see from the first SQLite file:

* Substituting compressor: 36.036% of plaintext size
* Append-only compressor: 36.045% of plaintext size

When we're talking about ~0.01% gain in efficiency, I think there's a strong argument to be made in favor of simplicity. (Incidentally, a quick performance profile of my code shows it spends the majority of its time deciding what entry to replace!)

Sent from Windows Mail

From: Gábor Molnár <gabor.molnar@sch.bme.hu>
Sent: Saturday, September 21, 2013 4:23 PM
To: Roberto Peon <grmocg@gmail.com>
Cc: Tatsuhiro Tsujikawa <tatsuhiro.t@gmail.com>, ietf-http-wg@w3.org

It would be interesting to test if a substitution strategy is better than incremental if it knows all the upcoming headers *in advance*. This would simulate the performance of the best possible heuristic algorithm.

2013/9/21 Roberto Peon <grmocg@gmail.com>

When I was doing a similar comparative analysis, I found that incremental indexing did better than substitution indexing as well. I suspect that substitution indexing benefits strongly from heuristics, e.g. compute the probability that a header is reused and use that to determine if you replace or not.

That being said, I'm still unsure if the complexity of substitution indexing is worth the potential benefit.
-=R

On Sat, Sep 21, 2013 at 5:12 AM, Tatsuhiro Tsujikawa <tatsuhiro.t@gmail.com> wrote:

I made a simple benchmark test for substitution indexing vs incremental indexing only and share its results here. The detailed results can be found at https://github.com/tatsuhiro-t/nghttp2/wiki/hpackSubst

"""
HPACK draft offers 2 kind of indexing methods: incremental and substitution. In nghttp2, we always use incremental indexing. This is because we do not have good strategy to use substitution indexing efficiently. We suspect that it is in the draft because it has some use cases, but we don't see them yet. So we did some tests comparing our incremental only strategy and the experimental strategy utilizing substitution indexing.

Our incremental only strategy goes as follows:

1. If the name/value pair is in the header table, use indexed representation.
2. Else, if name is in the header table, use incremental indexing with indexed name.
3. Else, use incremental indexing with new name.

The experimental strategy utilizing substitution indexing changes step 2 as follows:

2. Else, if name is in the header table, substitute that entry; use substitution indexing with indexed name.

We use data set in https://github.com/http2/http_samples. The detailed results are listed in the following sections.

The end result is that, in the overall, incremental only strategy is more efficient than the strategy with substitution. But the difference is not so large. On some data set, the substitution performed well, so depending on the data set, the winner may change. Also there may be better strategy for substitution.
"""

It turns out that the experimental strategy is used in node-http2 and firefox.
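
For readers following the thread, the two strategies in the quoted nghttp2 wiki text can be sketched roughly as below. This is only an illustration under stated assumptions, not code from nghttp2, node-http2 or Firefox: the header table is modelled as an unbounded Python list with 0-based indices and no size-based eviction, and the function name `encode_header`, the `substitute` flag and the representation labels are all hypothetical.

```python
from typing import List, Tuple

Header = Tuple[str, str]


def encode_header(table: List[Header], name: str, value: str,
                  substitute: bool = False) -> Tuple[str, int]:
    """Pick a representation for one header and update `table` in place.

    Assumptions (not from the draft): the table is an unbounded list,
    indices are 0-based, and the returned labels are descriptive strings
    rather than wire-format opcodes.
    """
    # Step 1: the exact name/value pair is already in the table.
    for i, entry in enumerate(table):
        if entry == (name, value):
            return ("indexed", i)

    # Step 2: only the name is in the table.
    for i, (entry_name, _) in enumerate(table):
        if entry_name == name:
            if substitute:
                # Experimental strategy: overwrite the matching entry.
                table[i] = (name, value)
                return ("substitution-indexed-name", i)
            # Incremental-only strategy: append a new entry.
            table.append((name, value))
            return ("incremental-indexed-name", i)

    # Step 3: neither name nor value is known; append with a new name.
    table.append((name, value))
    return ("incremental-new-name", -1)
```

In this toy model, with `substitute=True` a value that was overwritten (say `cookie: a=1` replaced by `cookie: a=2`) no longer matches in step 1 when it is sent again, whereas the append-only table still holds it. That is one way substitution can lose to incremental indexing, in line with the benchmark direction reported above.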
Received on Tuesday, 24 September 2013 06:40:46 UTC