- From: RUELLAN Herve <Herve.Ruellan@crf.canon.fr>
- Date: Thu, 4 Jul 2013 14:06:41 +0000
- To: Michael Sweet <msweet@apple.com>
- CC: James M Snell <jasnell@gmail.com>, Roberto Peon <grmocg@gmail.com>, "ietf-http-wg@w3.org" <ietf-http-wg@w3.org>
I'm not sure I understand correctly your comment. One of the goal of the eviction mechanism is to provide a solution to the possible problem of header table fragmentation over long-lasting connections: if the header table is full of very small entries, then substituting a long entry to one of these small entries will make the header table grow over its maximum allowed size. The eviction mechanism enable to allow for such a substitution while still keeping the header table under its maximum allowed size. Hervé. > -----Original Message----- > From: Michael Sweet [mailto:msweet@apple.com] > Sent: jeudi 4 juillet 2013 15:48 > To: RUELLAN Herve > Cc: James M Snell; Roberto Peon; ietf-http-wg@w3.org > Subject: Re: Header Compression Clarifications > > What about simply not allowing indexing for name/value pairs that exceed > SETTINGS_MAX_BUFFER_SIZE? Both sides could still send "long" literal > name/value pairs but they would not disrupt the header table. When > combined with #2 or #3 that should yield consistent and useful results. > > > On Jul 4, 2013, at 9:09 AM, RUELLAN Herve <Herve.Ruellan@crf.canon.fr> > wrote: > > > Trying to catch on the discussion, I see three proposals for solving the > problem of substitution and eviction. > > > > 1. Size adjustment BEFORE doing the substitution. > > This has several edge case problems as James showed and would lead to > complex implementations. > > > > 2. Size adjustment AFTER doing the substitution. > > The problem is that the new entry may be dropped just after adding it. > > > > 3. Size adjustment BEFORE doing the substitution with pinning of the > replaced entry (Roberto's proposal). > > > > > > I think that 2 and 3 are in fact close together. > > With 2, a bad encoder could just have its new entry dropped from the table > just after adding it. However a good encoder could use an algorithm like the > one propose in 3 to find the right entry to replace and prevent dropping the > new entry. > > On the decoder side, 2 is simpler. > > > > Taking James' example for what a "good" encoder should do. > > With existing table (max size 15) > > 0: FOO = 123 > > 1: BAR = 321 > > 2: BA = Z > > > > The encoder wants to substitute: TESTING = FOO at Index #0 (because > entry #0 is old or whatever). > > It detects that index #0 and #1 need to be dropped, and therefore adjust > the substitution to replace Index #2. > > It then can apply the substitution, and after that adjust the size. > > > > > > So my preference is for the second solution. True, it can lead to under- > optimal usage of the header table. But I'm not in favor of making all > implementations more complex to help optimize bad implementations. > > We should however warn implementers of this problem. > > > > Hervé. > > > > > >> -----Original Message----- > >> From: James M Snell [mailto:jasnell@gmail.com] > >> Sent: mercredi 3 juillet 2013 02:36 > >> To: Roberto Peon > >> Cc: ietf-http-wg@w3.org > >> Subject: Re: Header Compression Clarifications > >> > >> Yes, I was simplifying :-) I think that rule should work.. > >> particularly in that it allows me to avoid having to check for as > >> many of these weird edge cases. > >> > >> On Tue, Jul 2, 2013 at 5:27 PM, Roberto Peon <grmocg@gmail.com> > wrote: > >>> Correct (assuming the overhead per item was assumed to be zero, > >>> which isn't the case, but is good in example-land :) ) > >>> > >>> -=R > >>> > >>> > >>> On Tue, Jul 2, 2013 at 5:18 PM, James M Snell <jasnell@gmail.com> > wrote: > >>>> > >>>> So to make sure I have it right... Given the two examples I gave... > >>>> > >>>> Header Table, Max size = 15 > >>>> > >>>> 1 A = B > >>>> 2 C = D > >>>> 3 E = F > >>>> 4 G = H > >>>> 5 I = J > >>>> > >>>> Substitute #5 with FOOBARBAZ = 123456 > >>>> > >>>> The result would be a Header table with one item "FOOBARBAZ = > 123456" > >>>> > >>>> And... > >>>> > >>>> Header Table, Max size = 20 > >>>> > >>>> 1 A = B > >>>> 2 C = D > >>>> 3 E = F > >>>> 4 G = H > >>>> 5 I = J > >>>> 6 K = L > >>>> 7 M = N > >>>> > >>>> Substitute #3 with FOOBARBAZ = 123456 > >>>> > >>>> The result would be a Header table with three items... > >>>> > >>>> FOOBARBAZ = 123456 > >>>> K = L > >>>> M = N > >>>> > >>>> Is that correct? > >>>> > >>>> On Tue, Jul 2, 2013 at 5:07 PM, Roberto Peon <grmocg@gmail.com> > >> wrote: > >>>>> The biggest reason that I don't like this is that it requires the > >>>>> encoder keep more state. > >>>>> I prefer to make this simple by having an easy-to-follow rule for > >>>>> when it the slot it would have replaced would have been evicted > >>>>> (once all predecessors to that slot have been evicted, then > >>>>> elements following the element-to-be-replaced are removed, leaving > >>>>> the new element at the head of the list). > >>>>> > >>>>> The pseudo code for this is: > >>>>> > >>>>> if not replacement_idx or new_element_size > max_table_size: > >>>>> PROTOCOL_ERROR() > >>>>> if max_table_size ==new_element_size: > >>>>> table.clear() > >>>>> table[0] = new_element > >>>>> return > >>>>> > >>>>> # above is boilerplate true for any algorithm > >>>>> > >>>>> table[replacement_idx].clear() > >>>>> table[replacement_idx].pin() > >>>>> first_non_pinned = 0 > >>>>> while new_element_size + table_byte_size() > max_table_size: > >>>>> if table[first_non_pinned].pinned(): > >>>>> ++first_non_pinned > >>>>> continue > >>>>> table[first_non_pinned].pop() > >>>>> > >>>>> This adds some small complexity here, but it makes encoding > >>>>> significantly easier (you can have a naive encoder which leaps > >>>>> without looking, which is far less complicated than having to look > >>>>> before leap, and may still prove reasonable in terms of compressor > >>>>> efficiency). > >>>>> > >>>>> I admit that I'm attracted to your idea. I just am afraid of what > >>>>> it makes the encoder look like :) -=R > >>>>> > >>>>> > >>>>> On Tue, Jul 2, 2013 at 4:37 PM, James M Snell <jasnell@gmail.com> > >> wrote: > >>>>>> > >>>>>> On Tue, Jul 2, 2013 at 4:00 PM, Roberto Peon <grmocg@gmail.com> > >> wrote: > >>>>>> [snip] > >>>>>>> > >>>>>>> So, an example: > >>>>>>> Imagine that you're replacing entry #10 with something 10 > >>>>>>> characters long. > >>>>>>> The previous entry in that slot was 5 characters long, and the > >>>>>>> table was already at max size. > >>>>>>> This implies that you need to get rid of 5 characters before > >>>>>>> replacing. > >>>>>>> Assuming that items 1 and 2 are the oldest items and item 1 is 3 > >>>>>>> chars, and item 2 is 3 chars, you need to pop two. > >>>>>>> > >>>>>>> You now stick the 10 characters into what was formerly entry #10. > >>>>>>> [snip] > >>>>>> > >>>>>> That's problematic too. Let's go back to my example: > >>>>>> > >>>>>> Header Table, Max size = 15 > >>>>>> > >>>>>> 1 A = B > >>>>>> 2 C = D > >>>>>> 3 E = F > >>>>>> 4 G = H > >>>>>> 5 I = J > >>>>>> > >>>>>> Substitute #5 with FOOBARBAZ = 123456 > >>>>>> > >>>>>> Obviously, we end up popping all five entries, saying "stick the > >>>>>> new characters into what was formerly entry #5" does not make any > >>>>>> sense because the thing that was "formerly entry #5" no longer > exists. > >>>>>> > >>>>>> Now a variation on the same problem: > >>>>>> > >>>>>> Header Table, Max size = 20 > >>>>>> > >>>>>> 1 A = B > >>>>>> 2 C = D > >>>>>> 3 E = F > >>>>>> 4 G = H > >>>>>> 5 I = J > >>>>>> 6 K = L > >>>>>> 7 M = N > >>>>>> > >>>>>> Substitute #3 with FOOBARBAZ = 123456 > >>>>>> > >>>>>> We begin popping things off to make room before doing the > >>>>>> substitution... 4 entries are removed, including the item being > >>>>>> replaced... leaving > >>>>>> > >>>>>> 1 I = J > >>>>>> 2 K = L > >>>>>> 3 M = N > >>>>>> > >>>>>> What exactly do we replace? Are we replacing "M = N" (the current > >> #3)? > >>>>>> If so, how does that sync up with the "thing that was formerly > >>>>>> entry #3" idea? > >>>>>> > >>>>>> I think the only reliable approach is to substitute AFTER freeing > >>>>>> up space, substitute into whatever is in the index position after > >>>>>> freeing up space, and if nothing is in that space, return an > >>>>>> error. This means that the sender has to be careful to avoid > >>>>>> getting into this state in the first place, which means very > >>>>>> careful control over when and how substitution is being used. > >>>>>> Given the current eviction strategy, that would be the most > >>>>>> reliable approach I think. So in the two examples above, the > >>>>>> first case returns an error and the second case results in "M = > >>>>>> N" being > >> replaced. > >>>>> > >>>>> > >>> > >>> > > > > _________________________________________________________ > Michael Sweet, Senior Printing System Engineer, PWG Chair
Received on Thursday, 4 July 2013 14:07:21 UTC