RE: Header Compression Clarifications from RUELLAN Herve on 2013-07-05 (ietf-http-wg@w3.org from July to September 2013)

From: RUELLAN Herve <Herve.Ruellan@crf.canon.fr>
Date: Fri, 5 Jul 2013 08:35:16 +0000
To: Roberto Peon <grmocg@gmail.com>
CC: James M Snell <jasnell@gmail.com>, "ietf-http-wg@w3.org" <ietf-http-wg@w3.org>
Message-ID: <6C71876BDCCD01488E70A2399529D5E525EE42DB@ADELE.crf.canon.fr>
If you need to guarantee the amount of memory used in the add-then-enforce approach, you can use an algorithm similar to the one you proposed:

if not replacement_idx or new_element_size > max_table_size:
  PROTOCOL_ERROR()
if max_table_size ==new_element_size:
  table.clear()
  table[0] = new_element
  return

# above is boilerplate true for any algorithm

table[replacement_idx].clear()

first_non_pinned = 0
while new_element_size + table_byte_size() > max_table_size:
  table[first_non_pinned].pop()
  if first_non_pinned == replacement_idx:
    new_element.clear()
    break # Should not insert new_element in table

So the question is: do we want this small complexity in all cases, or only for memory constrained cases?

Hervé.

> -----Original Message-----
> From: Roberto Peon [mailto:grmocg@gmail.com]
> Sent: jeudi 4 juillet 2013 20:07
> To: RUELLAN Herve
> Cc: James M Snell; ietf-http-wg@w3.org
> Subject: Re: Header Compression Clarifications
> 
> The approach of add-then-enforce guarantee has a glaring and huge
> problem--  it cannot guarantee the amount of memory I'll use.
> I'd much rather deal with some small complexity (demonstrably not big) here
> and have that guarantee.
> -=R
> 
> 
> On Thu, Jul 4, 2013 at 6:09 AM, RUELLAN Herve
> <Herve.Ruellan@crf.canon.fr> wrote:
> 
> 
> 	Trying to catch on the discussion, I see three proposals for solving the
> problem of substitution and eviction.
> 
> 	1. Size adjustment BEFORE doing the substitution.
> 	This has several edge case problems as James showed and would
> lead to complex implementations.
> 
> 	2. Size adjustment AFTER doing the substitution.
> 	The problem is that the new entry may be dropped just after adding
> it.
> 
> 	3. Size adjustment BEFORE doing the substitution with pinning of the
> replaced entry (Roberto's proposal).
> 
> 
> 	I think that 2 and 3 are in fact close together.
> 	With 2, a bad encoder could just have its new entry dropped from
> the table just after adding it. However a good encoder could use an algorithm
> like the one propose in 3 to find the right entry to replace and prevent
> dropping the new entry.
> 	On the decoder side, 2 is simpler.
> 
> 	Taking James' example for what a "good" encoder should do.
> 	With existing table (max size 15)
> 
> 	  0: FOO = 123
> 	  1: BAR = 321
> 	  2: BA = Z
> 
> 
> 	The encoder wants to substitute: TESTING = FOO at Index #0
> (because entry #0 is old or whatever).
> 	It detects that index #0 and #1 need to be dropped, and therefore
> adjust the substitution to replace Index #2.
> 	It then can apply the substitution, and after that adjust the size.
> 
> 
> 	So my preference is for the second solution. True, it can lead to
> under-optimal usage of the header table. But I'm not in favor of making all
> implementations more complex to help optimize bad implementations.
> 	We should however warn implementers of this problem.
> 
> 	Hervé.
> 
> 
> 
> 	> -----Original Message-----
> 	> From: James M Snell [mailto:jasnell@gmail.com]
> 	> Sent: mercredi 3 juillet 2013 02:36
> 	> To: Roberto Peon
> 	> Cc: ietf-http-wg@w3.org
> 	> Subject: Re: Header Compression Clarifications
> 	>
> 
> 	> Yes, I was simplifying :-) I think that rule should work..
> 	> particularly in that it allows me to avoid having to check for as many
> of these
> 	> weird edge cases.
> 	>
> 	> On Tue, Jul 2, 2013 at 5:27 PM, Roberto Peon <grmocg@gmail.com>
> wrote:
> 	> > Correct (assuming the overhead per item was assumed to be
> zero, which
> 	> > isn't the case, but is good in example-land :) )
> 	> >
> 	> > -=R
> 	> >
> 	> >
> 	> > On Tue, Jul 2, 2013 at 5:18 PM, James M Snell
> <jasnell@gmail.com> wrote:
> 	> >>
> 	> >> So to make sure I have it right... Given the two examples I
> gave...
> 	> >>
> 	> >>   Header Table, Max size = 15
> 	> >>
> 	> >>   1  A = B
> 	> >>   2  C = D
> 	> >>   3  E = F
> 	> >>   4  G = H
> 	> >>   5  I = J
> 	> >>
> 	> >>   Substitute #5 with FOOBARBAZ = 123456
> 	> >>
> 	> >> The result would be a Header table with one item "FOOBARBAZ
> = 123456"
> 	> >>
> 	> >> And...
> 	> >>
> 	> >>   Header Table, Max size = 20
> 	> >>
> 	> >>   1  A = B
> 	> >>   2  C = D
> 	> >>   3  E = F
> 	> >>   4  G = H
> 	> >>   5  I = J
> 	> >>   6  K = L
> 	> >>   7  M = N
> 	> >>
> 	> >>   Substitute #3 with FOOBARBAZ = 123456
> 	> >>
> 	> >> The result would be a Header table with three items...
> 	> >>
> 	> >>   FOOBARBAZ = 123456
> 	> >>   K = L
> 	> >>   M = N
> 	> >>
> 	> >> Is that correct?
> 	> >>
> 	> >> On Tue, Jul 2, 2013 at 5:07 PM, Roberto Peon
> <grmocg@gmail.com>
> 	> wrote:
> 	> >> > The biggest reason that I don't like this is that it requires the
> 	> >> > encoder keep more state.
> 	> >> > I prefer to make this simple by having an easy-to-follow rule
> for
> 	> >> > when it the slot it would have replaced would have been
> evicted
> 	> >> > (once all predecessors to that slot have been evicted, then
> 	> >> > elements following the element-to-be-replaced are removed,
> leaving
> 	> >> > the new element at the head of the list).
> 	> >> >
> 	> >> > The pseudo code for this is:
> 	> >> >
> 	> >> > if not replacement_idx or new_element_size >
> max_table_size:
> 	> >> >   PROTOCOL_ERROR()
> 	> >> > if max_table_size ==new_element_size:
> 	> >> >   table.clear()
> 	> >> >   table[0] = new_element
> 	> >> >   return
> 	> >> >
> 	> >> > # above is boilerplate true for any algorithm
> 	> >> >
> 	> >> > table[replacement_idx].clear()
> 	> >> > table[replacement_idx].pin()
> 	> >> > first_non_pinned = 0
> 	> >> > while new_element_size + table_byte_size() >
> max_table_size:
> 	> >> >     if table[first_non_pinned].pinned():
> 	> >> >       ++first_non_pinned
> 	> >> >        continue
> 	> >> >       table[first_non_pinned].pop()
> 	> >> >
> 	> >> > This adds some small complexity here, but it makes encoding
> 	> >> > significantly easier (you can have a naive encoder which leaps
> 	> >> > without looking, which is far less complicated than having to
> look
> 	> >> > before leap, and may still prove reasonable in terms of
> compressor
> 	> >> > efficiency).
> 	> >> >
> 	> >> > I admit that I'm attracted to your idea. I just am afraid of what
> 	> >> > it makes the encoder look like :) -=R
> 	> >> >
> 	> >> >
> 	> >> > On Tue, Jul 2, 2013 at 4:37 PM, James M Snell
> <jasnell@gmail.com>
> 	> wrote:
> 	> >> >>
> 	> >> >> On Tue, Jul 2, 2013 at 4:00 PM, Roberto Peon
> <grmocg@gmail.com>
> 	> wrote:
> 	> >> >> [snip]
> 	> >> >> >
> 	> >> >> > So, an example:
> 	> >> >> > Imagine that you're replacing entry #10 with something 10
> 	> >> >> > characters long.
> 	> >> >> > The previous entry in that slot was 5 characters long, and
> the
> 	> >> >> > table was already at max size.
> 	> >> >> > This implies that you need to get rid of 5 characters before
> 	> >> >> > replacing.
> 	> >> >> > Assuming that items 1 and 2 are the oldest items and item 1
> is 3
> 	> >> >> > chars, and item 2 is 3 chars, you need to pop two.
> 	> >> >> >
> 	> >> >> > You now stick the 10 characters into what was formerly
> entry #10.
> 	> >> >> >[snip]
> 	> >> >>
> 	> >> >> That's problematic too. Let's go back to my example:
> 	> >> >>
> 	> >> >> Header Table, Max size = 15
> 	> >> >>
> 	> >> >> 1  A = B
> 	> >> >> 2  C = D
> 	> >> >> 3  E = F
> 	> >> >> 4  G = H
> 	> >> >> 5  I = J
> 	> >> >>
> 	> >> >> Substitute #5 with FOOBARBAZ = 123456
> 	> >> >>
> 	> >> >> Obviously, we end up popping all five entries, saying "stick
> the
> 	> >> >> new characters into what was formerly entry #5" does not
> make any
> 	> >> >> sense because the thing that was "formerly entry #5" no
> longer exists.
> 	> >> >>
> 	> >> >> Now a variation on the same problem:
> 	> >> >>
> 	> >> >> Header Table, Max size = 20
> 	> >> >>
> 	> >> >> 1  A = B
> 	> >> >> 2  C = D
> 	> >> >> 3  E = F
> 	> >> >> 4  G = H
> 	> >> >> 5  I = J
> 	> >> >> 6  K = L
> 	> >> >> 7  M = N
> 	> >> >>
> 	> >> >> Substitute #3 with FOOBARBAZ = 123456
> 	> >> >>
> 	> >> >> We begin popping things off to make room before doing the
> 	> >> >> substitution... 4 entries are removed, including the item
> being
> 	> >> >> replaced... leaving
> 	> >> >>
> 	> >> >> 1  I = J
> 	> >> >> 2  K = L
> 	> >> >> 3  M = N
> 	> >> >>
> 	> >> >> What exactly do we replace? Are we replacing "M = N" (the
> current
> 	> #3)?
> 	> >> >> If so, how does that sync up with the "thing that was
> formerly
> 	> >> >> entry #3" idea?
> 	> >> >>
> 	> >> >> I think the only reliable approach is to substitute AFTER
> freeing
> 	> >> >> up space, substitute into whatever is in the index position
> after
> 	> >> >> freeing up space, and if nothing is in that space, return an
> 	> >> >> error. This means that the sender has to be careful to avoid
> 	> >> >> getting into this state in the first place, which means very
> 	> >> >> careful control over when and how substitution is being
> used.
> 	> >> >> Given the current eviction strategy, that would be the most
> 	> >> >> reliable approach I think. So in the two examples above, the
> first
> 	> >> >> case returns an error and the second case results in "M = N"
> being
> 	> replaced.
> 	> >> >
> 	> >> >
> 	> >
> 	> >
> 
> 
>
Received on Friday, 5 July 2013 08:35:50 UTC