W3C home > Mailing lists > Public > ietf-http-wg@w3.org > July to September 2013

RE: HPACK benchmark test for substitution indexing vs incremental indexing only

From: RUELLAN Herve <Herve.Ruellan@crf.canon.fr>
Date: Fri, 27 Sep 2013 16:34:56 +0000
To: Mike Bishop <Michael.Bishop@microsoft.com>, Gábor Molnár <gabor.molnar@sch.bme.hu>, Roberto Peon <grmocg@gmail.com>
CC: Tatsuhiro Tsujikawa <tatsuhiro.t@gmail.com>, "ietf-http-wg@w3.org" <ietf-http-wg@w3.org>
Message-ID: <6C71876BDCCD01488E70A2399529D5E52F4F3782@ADELE.crf.canon.fr>
I've ran some tests on the "mnot" test set and I've been able to see some effect from using substitution:
- Substituting compressor:   29.5%
- Append-only compressor: 30.3%

It's true that the effect of substitution is limited on average. However, it allows building an intelligent encoder that can take advantage of knowledge about the headers.

Hervé.

> -----Original Message-----
> From: Mike Bishop [mailto:Michael.Bishop@microsoft.com]
> Sent: mardi 24 septembre 2013 08:40
> To: Gábor Molnár; Roberto Peon
> Cc: Tatsuhiro Tsujikawa; ietf-http-wg@w3.org
> Subject: Re: HPACK benchmark test for substitution indexing vs incremental
> indexing only
> 
> I actually had written a compressor with foreknowledge to run against the
> data set that was sent to the list before the last meeting, specifically for the
> purpose of giving an “ideal” against which to measure real-world algorithms.
> 
> Today I did some refactoring and added a switch that blocks it from ever
> doing a substitution, always sending it down the path where it didn’t find a
> suitable entry to replace.  The results suggest that the gain from being able to
> substitute is infinitesimal.  Given perfect knowledge of future headers,
> here’s what I see from the first SQLite file:
> 
> *
> 	Substituting compressor:  36.036% of plaintext size
> *
> 	Append-only compressor:  36.045% of plaintext size
> 
> 
> When we’re talking about ~0.01% gain in efficiency, I think there’s a strong
> argument to be made in favor of simplicity.  (Incidentally, a quick
> performance profile of my code shows it spends the majority of its time
> deciding what entry to replace!)
> 
> Sent from Windows Mail
> 
> From: Gábor Molnár <mailto:gabor.molnar@sch.bme.hu>
> Sent: ‎Saturday‎, ‎September‎ ‎21‎, ‎2013 ‎4‎:‎23‎ ‎PM
> To: Roberto Peon <mailto:grmocg@gmail.com>
> Cc: Tatsuhiro Tsujikawa <mailto:tatsuhiro.t@gmail.com> , ietf-http-
> wg@w3.org
> 
> It would be interesting to test if a substitution strategy is better than
> incremental if it knows all the upcoming headers *in advance*. This would
> simulate the performance of the best possible heuristic algorithm.
> 
> 
> 
> 2013/9/21 Roberto Peon <grmocg@gmail.com>
> 
> 
> 	When I was doing a similar comparative analysis, I found that
> incremental indexing did better than substitution indexing as well.
> 	I suspect that substitution indexing benefits strongly from heuristics,
> e.g. compute the probability that a header is reused and use that to
> determine if you replace or not.
> 	That being said, I'm still unsure if the complexity of substitution
> indexing is worth the potential benefit.
> 
> 
> 	-=R
> 
> 
> 	On Sat, Sep 21, 2013 at 5:12 AM, Tatsuhiro Tsujikawa
> <tatsuhiro.t@gmail.com> wrote:
> 
> 
> 		I made a simple benchmark test for substitution indexing vs
> 		incremental indexing only and share its results here.
> 
> 		The detailed results can be found at
> 		https://github.com/tatsuhiro-t/nghttp2/wiki/hpackSubst

> 
> 		"""
> 		HPACK draft offers 2 kind of indexing methods: incremental
> and
> 		substitution. In nghttp2, we always use incremental
> 		indexing. This is because we do not have good strategy to
> use
> 		substitution indexing efficiently. We suspect that it is in the
> 		draft because it has some use cases, but we don't see them
> yet.
> 
> 		So we did some tests comparing our incremental only
> strategy and
> 		the experimental strategy utilizing substitution indexing.
> 
> 		Our incremental only strategy goes as follows:
> 
> 		1. If the name/value pair is in the header table, use indexed
> 		   representation.
> 
> 		2. Else, if name is in the header table, use incremental
> indexing
> 		   with indexed name.
> 
> 		3. Else, use incremental indexing with new name.
> 
> 		The experimental strategy utilizing substitution indexing
> changes
> 		step 2 as follows:
> 
> 		2. Else, if name is in the header table, substitute that entry;
> 		   use substitution indexing with indexed name.
> 
> 		We use data set in https://github.com/http2/http_samples

> <https://github.com/http2/http_samples> .
> 
> 		The detailed results are listed in the following sections.
> 
> 		The end result is that, in the overall, incremental only
> strategy
> 		is more efficient than the strategy with substitution. But the
> 		difference is not so large. On some data set, the substitution
> 		performed well, so depending on the data set, the winner
> may
> 		change. Also there may be better strategy for substitution.
> 		"""
> 
> 		It turns out that the experimental strategy is used in node-
> http2
> 		and firefox.
> 
> 
> 

Received on Friday, 27 September 2013 16:35:36 UTC

This archive was generated by hypermail 2.4.0 : Friday, 17 January 2020 17:14:15 UTC