Re: Header Compression Clarifications

So to make sure I have it right... Given the two examples I gave...

  Header Table, Max size = 15

  1  A = B
  2  C = D
  3  E = F
  4  G = H
  5  I = J

  Substitute #5 with FOOBARBAZ = 123456

The result would be a Header table with one item "FOOBARBAZ = 123456"

And...

  Header Table, Max size = 20

  1  A = B
  2  C = D
  3  E = F
  4  G = H
  5  I = J
  6  K = L
  7  M = N

  Substitute #3 with FOOBARBAZ = 123456

The result would be a Header table with three items...

  FOOBARBAZ = 123456
  K = L
  M = N

Is that correct?

On Tue, Jul 2, 2013 at 5:07 PM, Roberto Peon <grmocg@gmail.com> wrote:
> The biggest reason that I don't like this is that it requires the encoder
> keep more state.
> I prefer to make this simple by having an easy-to-follow rule for when it
> the slot it would have replaced would have been evicted (once all
> predecessors to that slot have been evicted, then elements following the
> element-to-be-replaced are removed, leaving the new element at the head of
> the list).
>
> The pseudo code for this is:
>
> if not replacement_idx or new_element_size > max_table_size:
>   PROTOCOL_ERROR()
> if max_table_size ==new_element_size:
>   table.clear()
>   table[0] = new_element
>   return
>
> # above is boilerplate true for any algorithm
>
> table[replacement_idx].clear()
> table[replacement_idx].pin()
> first_non_pinned = 0
> while new_element_size + table_byte_size() > max_table_size:
>     if table[first_non_pinned].pinned():
>       ++first_non_pinned
>        continue
>       table[first_non_pinned].pop()
>
> This adds some small complexity here, but it makes encoding significantly
> easier (you can have a naive encoder which leaps without looking, which is
> far less complicated than having to look before leap, and may still prove
> reasonable in terms of compressor efficiency).
>
> I admit that I'm attracted to your idea. I just am afraid of what it makes
> the encoder look like :)
> -=R
>
>
> On Tue, Jul 2, 2013 at 4:37 PM, James M Snell <jasnell@gmail.com> wrote:
>>
>> On Tue, Jul 2, 2013 at 4:00 PM, Roberto Peon <grmocg@gmail.com> wrote:
>> [snip]
>> >
>> > So, an example:
>> > Imagine that you're replacing entry #10 with something 10 characters
>> > long.
>> > The previous entry in that slot was 5 characters long, and the table was
>> > already at max size.
>> > This implies that you need to get rid of 5 characters before replacing.
>> > Assuming that items 1 and 2 are the oldest items and item 1 is 3 chars,
>> > and
>> > item 2 is 3 chars, you need to pop two.
>> >
>> > You now stick the 10 characters into what was formerly entry #10.
>> >[snip]
>>
>> That's problematic too. Let's go back to my example:
>>
>> Header Table, Max size = 15
>>
>> 1  A = B
>> 2  C = D
>> 3  E = F
>> 4  G = H
>> 5  I = J
>>
>> Substitute #5 with FOOBARBAZ = 123456
>>
>> Obviously, we end up popping all five entries, saying "stick the new
>> characters into what was formerly entry #5" does not make any sense
>> because the thing that was "formerly entry #5" no longer exists.
>>
>> Now a variation on the same problem:
>>
>> Header Table, Max size = 20
>>
>> 1  A = B
>> 2  C = D
>> 3  E = F
>> 4  G = H
>> 5  I = J
>> 6  K = L
>> 7  M = N
>>
>> Substitute #3 with FOOBARBAZ = 123456
>>
>> We begin popping things off to make room before doing the
>> substitution... 4 entries are removed, including the item being
>> replaced... leaving
>>
>> 1  I = J
>> 2  K = L
>> 3  M = N
>>
>> What exactly do we replace? Are we replacing "M = N" (the current #3)?
>> If so, how does that sync up with the "thing that was formerly entry
>> #3" idea?
>>
>> I think the only reliable approach is to substitute AFTER freeing up
>> space, substitute into whatever is in the index position after freeing
>> up space, and if nothing is in that space, return an error. This means
>> that the sender has to be careful to avoid getting into this state in
>> the first place, which means very careful control over when and how
>> substitution is being used. Given the current eviction strategy, that
>> would be the most reliable approach I think. So in the two examples
>> above, the first case returns an error and the second case results in
>> "M = N" being replaced.
>
>

Received on Wednesday, 3 July 2013 00:18:50 UTC