- From: Michael Sweet <msweet@apple.com>
- Date: Thu, 04 Jul 2013 09:47:50 -0400
- To: RUELLAN Herve <Herve.Ruellan@crf.canon.fr>
- Cc: James M Snell <jasnell@gmail.com>, Roberto Peon <grmocg@gmail.com>, "ietf-http-wg@w3.org" <ietf-http-wg@w3.org>
What about simply not allowing indexing for name/value pairs that exceed SETTINGS_MAX_BUFFER_SIZE? Both sides could still send "long" literal name/value pairs but they would not disrupt the header table. When combined with #2 or #3 that should yield consistent and useful results. On Jul 4, 2013, at 9:09 AM, RUELLAN Herve <Herve.Ruellan@crf.canon.fr> wrote: > Trying to catch on the discussion, I see three proposals for solving the problem of substitution and eviction. > > 1. Size adjustment BEFORE doing the substitution. > This has several edge case problems as James showed and would lead to complex implementations. > > 2. Size adjustment AFTER doing the substitution. > The problem is that the new entry may be dropped just after adding it. > > 3. Size adjustment BEFORE doing the substitution with pinning of the replaced entry (Roberto's proposal). > > > I think that 2 and 3 are in fact close together. > With 2, a bad encoder could just have its new entry dropped from the table just after adding it. However a good encoder could use an algorithm like the one propose in 3 to find the right entry to replace and prevent dropping the new entry. > On the decoder side, 2 is simpler. > > Taking James' example for what a "good" encoder should do. > With existing table (max size 15) > 0: FOO = 123 > 1: BAR = 321 > 2: BA = Z > > The encoder wants to substitute: TESTING = FOO at Index #0 (because entry #0 is old or whatever). > It detects that index #0 and #1 need to be dropped, and therefore adjust the substitution to replace Index #2. > It then can apply the substitution, and after that adjust the size. > > > So my preference is for the second solution. True, it can lead to under-optimal usage of the header table. But I'm not in favor of making all implementations more complex to help optimize bad implementations. > We should however warn implementers of this problem. > > Hervé. > > >> -----Original Message----- >> From: James M Snell [mailto:jasnell@gmail.com] >> Sent: mercredi 3 juillet 2013 02:36 >> To: Roberto Peon >> Cc: ietf-http-wg@w3.org >> Subject: Re: Header Compression Clarifications >> >> Yes, I was simplifying :-) I think that rule should work.. >> particularly in that it allows me to avoid having to check for as many of these >> weird edge cases. >> >> On Tue, Jul 2, 2013 at 5:27 PM, Roberto Peon <grmocg@gmail.com> wrote: >>> Correct (assuming the overhead per item was assumed to be zero, which >>> isn't the case, but is good in example-land :) ) >>> >>> -=R >>> >>> >>> On Tue, Jul 2, 2013 at 5:18 PM, James M Snell <jasnell@gmail.com> wrote: >>>> >>>> So to make sure I have it right... Given the two examples I gave... >>>> >>>> Header Table, Max size = 15 >>>> >>>> 1 A = B >>>> 2 C = D >>>> 3 E = F >>>> 4 G = H >>>> 5 I = J >>>> >>>> Substitute #5 with FOOBARBAZ = 123456 >>>> >>>> The result would be a Header table with one item "FOOBARBAZ = 123456" >>>> >>>> And... >>>> >>>> Header Table, Max size = 20 >>>> >>>> 1 A = B >>>> 2 C = D >>>> 3 E = F >>>> 4 G = H >>>> 5 I = J >>>> 6 K = L >>>> 7 M = N >>>> >>>> Substitute #3 with FOOBARBAZ = 123456 >>>> >>>> The result would be a Header table with three items... >>>> >>>> FOOBARBAZ = 123456 >>>> K = L >>>> M = N >>>> >>>> Is that correct? >>>> >>>> On Tue, Jul 2, 2013 at 5:07 PM, Roberto Peon <grmocg@gmail.com> >> wrote: >>>>> The biggest reason that I don't like this is that it requires the >>>>> encoder keep more state. >>>>> I prefer to make this simple by having an easy-to-follow rule for >>>>> when it the slot it would have replaced would have been evicted >>>>> (once all predecessors to that slot have been evicted, then >>>>> elements following the element-to-be-replaced are removed, leaving >>>>> the new element at the head of the list). >>>>> >>>>> The pseudo code for this is: >>>>> >>>>> if not replacement_idx or new_element_size > max_table_size: >>>>> PROTOCOL_ERROR() >>>>> if max_table_size ==new_element_size: >>>>> table.clear() >>>>> table[0] = new_element >>>>> return >>>>> >>>>> # above is boilerplate true for any algorithm >>>>> >>>>> table[replacement_idx].clear() >>>>> table[replacement_idx].pin() >>>>> first_non_pinned = 0 >>>>> while new_element_size + table_byte_size() > max_table_size: >>>>> if table[first_non_pinned].pinned(): >>>>> ++first_non_pinned >>>>> continue >>>>> table[first_non_pinned].pop() >>>>> >>>>> This adds some small complexity here, but it makes encoding >>>>> significantly easier (you can have a naive encoder which leaps >>>>> without looking, which is far less complicated than having to look >>>>> before leap, and may still prove reasonable in terms of compressor >>>>> efficiency). >>>>> >>>>> I admit that I'm attracted to your idea. I just am afraid of what >>>>> it makes the encoder look like :) -=R >>>>> >>>>> >>>>> On Tue, Jul 2, 2013 at 4:37 PM, James M Snell <jasnell@gmail.com> >> wrote: >>>>>> >>>>>> On Tue, Jul 2, 2013 at 4:00 PM, Roberto Peon <grmocg@gmail.com> >> wrote: >>>>>> [snip] >>>>>>> >>>>>>> So, an example: >>>>>>> Imagine that you're replacing entry #10 with something 10 >>>>>>> characters long. >>>>>>> The previous entry in that slot was 5 characters long, and the >>>>>>> table was already at max size. >>>>>>> This implies that you need to get rid of 5 characters before >>>>>>> replacing. >>>>>>> Assuming that items 1 and 2 are the oldest items and item 1 is 3 >>>>>>> chars, and item 2 is 3 chars, you need to pop two. >>>>>>> >>>>>>> You now stick the 10 characters into what was formerly entry #10. >>>>>>> [snip] >>>>>> >>>>>> That's problematic too. Let's go back to my example: >>>>>> >>>>>> Header Table, Max size = 15 >>>>>> >>>>>> 1 A = B >>>>>> 2 C = D >>>>>> 3 E = F >>>>>> 4 G = H >>>>>> 5 I = J >>>>>> >>>>>> Substitute #5 with FOOBARBAZ = 123456 >>>>>> >>>>>> Obviously, we end up popping all five entries, saying "stick the >>>>>> new characters into what was formerly entry #5" does not make any >>>>>> sense because the thing that was "formerly entry #5" no longer exists. >>>>>> >>>>>> Now a variation on the same problem: >>>>>> >>>>>> Header Table, Max size = 20 >>>>>> >>>>>> 1 A = B >>>>>> 2 C = D >>>>>> 3 E = F >>>>>> 4 G = H >>>>>> 5 I = J >>>>>> 6 K = L >>>>>> 7 M = N >>>>>> >>>>>> Substitute #3 with FOOBARBAZ = 123456 >>>>>> >>>>>> We begin popping things off to make room before doing the >>>>>> substitution... 4 entries are removed, including the item being >>>>>> replaced... leaving >>>>>> >>>>>> 1 I = J >>>>>> 2 K = L >>>>>> 3 M = N >>>>>> >>>>>> What exactly do we replace? Are we replacing "M = N" (the current >> #3)? >>>>>> If so, how does that sync up with the "thing that was formerly >>>>>> entry #3" idea? >>>>>> >>>>>> I think the only reliable approach is to substitute AFTER freeing >>>>>> up space, substitute into whatever is in the index position after >>>>>> freeing up space, and if nothing is in that space, return an >>>>>> error. This means that the sender has to be careful to avoid >>>>>> getting into this state in the first place, which means very >>>>>> careful control over when and how substitution is being used. >>>>>> Given the current eviction strategy, that would be the most >>>>>> reliable approach I think. So in the two examples above, the first >>>>>> case returns an error and the second case results in "M = N" being >> replaced. >>>>> >>>>> >>> >>> > _________________________________________________________ Michael Sweet, Senior Printing System Engineer, PWG Chair
Received on Thursday, 4 July 2013 13:48:23 UTC