- From: Roberto Peon <grmocg@gmail.com>
- Date: Sat, 24 Aug 2013 03:06:04 -0700
- To: Tatsuhiro Tsujikawa <tatsuhiro.t@gmail.com>
- Cc: "ietf-http-wg@w3.org" <ietf-http-wg@w3.org>
- Message-ID: <CAP+FsNeiwOi8c6_uRLQF_HDo4ROsY13qQ+DMjNeHw5aMTz3arA@mail.gmail.com>
Correct, which is why emitting references first is an easy optimization :)

You should take a look at the pseudo-code here:
http://tools.ietf.org/html/draft-rpeon-httpbis-header-compression-03#section-10
This is for the delta2 encoding scheme, which is a little different, but the
approach to dealing with this issue is in there, and is not so bad.

The question of on-the-wire-size is a fun one. I believe that what is
currently in the draft will result in smaller on-the-wire sized stuff
because most of the time the things you'd reference should not be expired
from the state (given the analysis of distance-to-referenced-index I posted
some time back).

I suspect the trickiest bit will be dealing with any required-ordering for
certain headers which might require it (and which will require doing fun
things with references).

-=R

On Sat, Aug 24, 2013 at 12:07 AM, Tatsuhiro Tsujikawa <tatsuhiro.t@gmail.com> wrote:

> On Sat, Aug 24, 2013 at 6:06 AM, Roberto Peon <grmocg@gmail.com> wrote:
>
>> Any removal from the state set requires that anything that pointed to it
>> be removed (else you'd segv or equivalent).
>> Thus, substitution or expiry always requires the corresponding
>> reference-set entry to be removed.
>
> Thank you for clarifying that. I submitted the issue for this in the
> github.
>
>> Your sentence: "But to
>> handle common header gracefully with eviction, when the entry in
>> the header table is removed from the header table due to the
>> eviction or substitution, if the entry is in the reference set
>> and it is not emitted in the current header processing, emit the
>> entry on the removal."
>>
>> is thus partially correct.
>>
>> The entry should be removed, but not emitted-- the draft currently
>> specifies emitting things only when:
>>
>> - The entry is indexed, and is not present in the reference set
>> - A new entry is added
>> - The entry is in the reference set after all operations have been
>>   processed AND it hasn't been emitted.
>
> The problem here is that the we have to track the common headers removal
> in anyway (either decoder or encoder).
>
> For example, if we have header table like this:
>
> #0 alpha, bravo
> #1 charlie, delta,
> #2 ...
> and so on
>
> And #0 is in the reference set.
>
> Now encoder starts encoding the following header set:
>
> alpha, bravo
> echo, foxtrot
>
> If the name/value pairs in header set is processed this order,
> alpha,bravo is in the reference set, so it is "common header" and nothing
> encoded. Next, encoder somehow decided to encode echo,foxtrot as literal
> and added to the header table but it turned out that removes alpha,bravo
> from the header table.
> As a result, the header block only includes echo,foxtrot as literal block.
> If decoder does not emit the alpha,bravo on the removal, it will only emit
> echo,foxtrot.
> But if emission on the removal is not the intention of the draft, we can
> do the similar thing in the encoder side. Instead of emitting common
> header on the removal on the decoder side, encode common header on removal
> on the encoder side (which brings back to the entry to the header table and
> reference set). The downside is the bytes on the wire will be potentially
> increased because we have to do literal for the value anyway. Also encoding
> of the common header will cause eviction of the another common header.
>
> So for the next interop testing, the which strategy is a way to go?
>
>> Much of the algorithm you define seems reasonable to me (there are a few
>> optimizations, but who cares right now? :) ).
>
> Yep, we all know the premature optimization cause what ;)
>
> Best regards,
> Tatsuhiro Tsujikawa
>
>> Would you like to raise an issue so that we can track any confusion here?
>>
>> -=R
>>
>> On Fri, Aug 23, 2013 at 10:47 AM, Tatsuhiro Tsujikawa
>> <tatsuhiro.t@gmail.com> wrote:
>>
>>> I'm trying to figure out how the HPAC works. HPAC says that it
>>> clarify the eviction and index shadowing, but I'm under the
>>> impression that HPAC is still not clear how the entry in the
>>> reference set is removed from the header table because of
>>> eviction or substitution. This is important because, due to the
>>> differential encoding, the encoder and decoder must agree with
>>> the "common" headers, which may be removed from the header table
>>> because of eviction or substitution.
>>>
>>> After several tries and error, I came up with the following
>>> encoder/decoder procedures, which I hopefully think that
>>> conforming to the HPAC draft (well, I may be completely wrong).
>>>
>>> Encoder
>>> -------
>>>
>>> 1. For each entry in the reference set, check that it is present
>>>    in the current header set. If it is not, encode it as indexed
>>>    representation and remove it from the reference set.
>>>
>>> 2. For each entry in the reference set, check that it is present
>>>    in the current header set. If it is present, mark the entry
>>>    as "common-header" and remove the matching name/value pair
>>>    from current header set (if multiple name/value pairs are
>>>    matched, only one of them is removed from the current header
>>>    set).
>>>
>>> 3. Encode the rest of name/value pair in current header set. For each
>>>    name/value pair:
>>>
>>> 3.1. If name/value pair is present in the header table, and the
>>>      corresponding entry in the header table is NOT in the
>>>      reference set, add the entry to the reference set and encode
>>>      it as indexed representation. Mark the entry "emitted".
>>>
>>> 3.2. If name/value pair is present in the header table, and the
>>>      corresponding entry in the header table is in the reference
>>>      set: If the entry is marked as "common-header", then this is
>>>      the 2nd occurrence of the same indexed representation. To
>>>      encode this name/value pair, we have to encode 4 indexed
>>>      representation. 2 for the 1st one (which was removed in step
>>>      2), and the another 2 for the current name/value pair.
>>>      Unmark the entry "common-header" and mark it "emitted".
>>>
>>>      If the entry is marked as "emitted", then this is also the
>>>      occurrences of the same indexed representation. But this time,
>>>      we just encode 2 indexed representation.
>>>
>>> 3.3. Otherwise, encoder encodes name/value pair as literal
>>>      representation. On eviction or substitution, if the removed
>>>      entry is in the reference set, it is removed from the
>>>      reference set.
>>>
>>> 4. After all current header set is processed, unmark all entries in
>>>    the header table.
>>>
>>> Decoder
>>> -------
>>>
>>> Decoder generally just performs what the encoder emitted. But to
>>> handle common header gracefully with eviction, when the entry in
>>> the header table is removed from the header table due to the
>>> eviction or substitution, if the entry is in the reference set
>>> and it is not emitted in the current header processing, emit the
>>> entry on the removal.
>>>
>>> --
>>>
>>> I implemented the above encoder/decoder procedure and it seems to
>>> work. But I'm not sure it conforms to the current draft,
>>> especially for the Encoder step 2 and Decoder's header emission
>>> on the eviction because they are not described in the draft at
>>> all. There is certainly better, correct way to go, but currently
>>> I failed to see it. How do you read the draft?
>>>
>>> Best regards,
>>>
>>> Tatsuhiro Tsujikawa
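
For anyone following the alpha,bravo / echo,foxtrot example quoted above, here is a minimal Python sketch of the two decoder behaviours under discussion. It is a toy model under stated assumptions, not the draft's algorithm: the names ToyDecoder, example, the op kinds "index" and "literal", and the emit_on_removal flag are all hypothetical; table capacity is counted in entries rather than bytes; and the reference set stores name/value pairs instead of table indices. It applies the three emission rules listed in the reply and drops the reference whenever an entry leaves the table.

```python
class ToyDecoder:
    """Toy model of a header table plus reference set (illustration only)."""

    def __init__(self, capacity, emit_on_removal=False):
        self.capacity = capacity            # assumption: counted in entries
        self.emit_on_removal = emit_on_removal
        self.table = []                     # (name, value) pairs, oldest first
        self.refs = set()                   # reference set, stored as pairs

    def _remove_oldest(self, emitted):
        gone = self.table.pop(0)
        if gone in self.refs:
            # Removal from the table always drops the reference.
            self.refs.discard(gone)
            # Optional variant from the quoted message: emit on removal.
            if self.emit_on_removal and gone not in emitted:
                emitted.append(gone)

    def decode(self, ops):
        emitted = []
        for kind, arg in ops:
            if kind == "index":
                entry = self.table[arg]
                if entry in self.refs:
                    # Indexed and already referenced: toggled off, not emitted.
                    self.refs.discard(entry)
                else:
                    # Rule 1: indexed and not in the reference set.
                    self.refs.add(entry)
                    emitted.append(entry)
            elif kind == "literal":
                # Make room first; this is where a common header can vanish.
                while len(self.table) >= self.capacity:
                    self._remove_oldest(emitted)
                # Rule 2: a new entry is added.
                self.table.append(arg)
                self.refs.add(arg)
                emitted.append(arg)
        # Rule 3: still referenced after all operations and not yet emitted.
        for entry in self.table:
            if entry in self.refs and entry not in emitted:
                emitted.append(entry)
        return emitted


def example(emit_on_removal):
    """Replay the quoted scenario: #0 alpha,bravo is referenced, then evicted."""
    d = ToyDecoder(capacity=2, emit_on_removal=emit_on_removal)
    d.table = [("alpha", "bravo"), ("charlie", "delta")]
    d.refs = {("alpha", "bravo")}
    # The header block carries only the literal, as in the quoted example.
    return d.decode([("literal", ("echo", "foxtrot"))])


if __name__ == "__main__":
    print(example(False))
    print(example(True))
```

In this toy, example(False) yields only echo,foxtrot, which is the loss described in the quoted message, while example(True) also yields alpha,bravo. Neither outcome says which behaviour the draft intends; the sketch only makes the disagreement concrete.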
Received on Saturday, 24 August 2013 10:06:32 UTC