Re: Header Compression Clarifications

The biggest reason that I don't like this is that it requires the encoder
keep more state.
I prefer to make this simple by having an easy-to-follow rule for when it
the slot it would have replaced would have been evicted (once all
predecessors to that slot have been evicted, then elements following the
element-to-be-replaced are removed, leaving the new element at the head of
the list).

The pseudo code for this is:

if not replacement_idx or new_element_size > max_table_size:
  PROTOCOL_ERROR()
if max_table_size ==new_element_size:
  table.clear()
  table[0] = new_element
  return

# above is boilerplate true for any algorithm

table[replacement_idx].clear()
table[replacement_idx].pin()
first_non_pinned = 0
while new_element_size + table_byte_size() > max_table_size:
    if table[first_non_pinned].pinned():
      ++first_non_pinned
       continue
      table[first_non_pinned].pop()

This adds some small complexity here, but it makes encoding significantly
easier (you can have a naive encoder which leaps without looking, which is
far less complicated than having to look before leap, and may still prove
reasonable in terms of compressor efficiency).

I admit that I'm attracted to your idea. I just am afraid of what it makes
the encoder look like :)
-=R


On Tue, Jul 2, 2013 at 4:37 PM, James M Snell <jasnell@gmail.com> wrote:

> On Tue, Jul 2, 2013 at 4:00 PM, Roberto Peon <grmocg@gmail.com> wrote:
> [snip]
> >
> > So, an example:
> > Imagine that you're replacing entry #10 with something 10 characters
> long.
> > The previous entry in that slot was 5 characters long, and the table was
> > already at max size.
> > This implies that you need to get rid of 5 characters before replacing.
> > Assuming that items 1 and 2 are the oldest items and item 1 is 3 chars,
> and
> > item 2 is 3 chars, you need to pop two.
> >
> > You now stick the 10 characters into what was formerly entry #10.
> >[snip]
>
> That's problematic too. Let's go back to my example:
>
> Header Table, Max size = 15
>
> 1  A = B
> 2  C = D
> 3  E = F
> 4  G = H
> 5  I = J
>
> Substitute #5 with FOOBARBAZ = 123456
>
> Obviously, we end up popping all five entries, saying "stick the new
> characters into what was formerly entry #5" does not make any sense
> because the thing that was "formerly entry #5" no longer exists.
>
> Now a variation on the same problem:
>
> Header Table, Max size = 20
>
> 1  A = B
> 2  C = D
> 3  E = F
> 4  G = H
> 5  I = J
> 6  K = L
> 7  M = N
>
> Substitute #3 with FOOBARBAZ = 123456
>
> We begin popping things off to make room before doing the
> substitution... 4 entries are removed, including the item being
> replaced... leaving
>
> 1  I = J
> 2  K = L
> 3  M = N
>
> What exactly do we replace? Are we replacing "M = N" (the current #3)?
> If so, how does that sync up with the "thing that was formerly entry
> #3" idea?
>
> I think the only reliable approach is to substitute AFTER freeing up
> space, substitute into whatever is in the index position after freeing
> up space, and if nothing is in that space, return an error. This means
> that the sender has to be careful to avoid getting into this state in
> the first place, which means very careful control over when and how
> substitution is being used. Given the current eviction strategy, that
> would be the most reliable approach I think. So in the two examples
> above, the first case returns an error and the second case results in
> "M = N" being replaced.
>

Received on Wednesday, 3 July 2013 00:08:09 UTC