- From: Michael Sweet <msweet@apple.com>
- Date: Thu, 04 Jul 2013 11:08:57 -0400
- To: James M Snell <jasnell@gmail.com>
- Cc: RUELLAN Herve <Herve.Ruellan@crf.canon.fr>, Roberto Peon <grmocg@gmail.com>, "ietf-http-wg@w3.org" <ietf-http-wg@w3.org>
- Message-id: <E295F403-8EC7-4946-A61B-970002DF8EB2@apple.com>
I agree, but we don't talk about errors in the current draft nor do we talk about what the client or server does to recover... As a first step it would be useful to say what you do for such headers... On Jul 4, 2013, at 10:56 AM, James M Snell <jasnell@gmail.com> wrote: > Adding a header that is too large for the table to hold individually after eviction ought to be a compression error. > > On Jul 4, 2013 7:47 AM, "Michael Sweet" <msweet@apple.com> wrote: > Hervé, > > On Jul 4, 2013, at 10:06 AM, RUELLAN Herve <Herve.Ruellan@crf.canon.fr> wrote: > > I'm not sure I understand correctly your comment. > > > > One of the goal of the eviction mechanism is to provide a solution to the possible problem of header table fragmentation over long-lasting connections: if the header table is full of very small entries, then substituting a long entry to one of these small entries will make the header table grow over its maximum allowed size. The eviction mechanism enable to allow for such a substitution while still keeping the header table under its maximum allowed size. > > Right, I am just addressing the "I want to add a name/value pair that is 100 bytes long but the header table size is 50 bytes." - what happens if you try substitution/incremental indexing with a name/value pair that is larger than the header table? > > We *could* just say that doing so empties the header table and doesn't add the pair to it (because it won't fit), but I don't think that is particularly useful. > > Instead I am suggesting that such name/value pairs must always use the Literal Header without Indexing representation - that allows for header name reuse (e.g. cookie) and otherwise preserves the header table. Otherwise I see certain headers (e.g. cookie) effectively killing any benefits of the header compression scheme. > > > > > > Hervé. > > > >> -----Original Message----- > >> From: Michael Sweet [mailto:msweet@apple.com] > >> Sent: jeudi 4 juillet 2013 15:48 > >> To: RUELLAN Herve > >> Cc: James M Snell; Roberto Peon; ietf-http-wg@w3.org > >> Subject: Re: Header Compression Clarifications > >> > >> What about simply not allowing indexing for name/value pairs that exceed > >> SETTINGS_MAX_BUFFER_SIZE? Both sides could still send "long" literal > >> name/value pairs but they would not disrupt the header table. When > >> combined with #2 or #3 that should yield consistent and useful results. > >> > >> > >> On Jul 4, 2013, at 9:09 AM, RUELLAN Herve <Herve.Ruellan@crf.canon.fr> > >> wrote: > >> > >>> Trying to catch on the discussion, I see three proposals for solving the > >> problem of substitution and eviction. > >>> > >>> 1. Size adjustment BEFORE doing the substitution. > >>> This has several edge case problems as James showed and would lead to > >> complex implementations. > >>> > >>> 2. Size adjustment AFTER doing the substitution. > >>> The problem is that the new entry may be dropped just after adding it. > >>> > >>> 3. Size adjustment BEFORE doing the substitution with pinning of the > >> replaced entry (Roberto's proposal). > >>> > >>> > >>> I think that 2 and 3 are in fact close together. > >>> With 2, a bad encoder could just have its new entry dropped from the table > >> just after adding it. However a good encoder could use an algorithm like the > >> one propose in 3 to find the right entry to replace and prevent dropping the > >> new entry. > >>> On the decoder side, 2 is simpler. > >>> > >>> Taking James' example for what a "good" encoder should do. > >>> With existing table (max size 15) > >>> 0: FOO = 123 > >>> 1: BAR = 321 > >>> 2: BA = Z > >>> > >>> The encoder wants to substitute: TESTING = FOO at Index #0 (because > >> entry #0 is old or whatever). > >>> It detects that index #0 and #1 need to be dropped, and therefore adjust > >> the substitution to replace Index #2. > >>> It then can apply the substitution, and after that adjust the size. > >>> > >>> > >>> So my preference is for the second solution. True, it can lead to under- > >> optimal usage of the header table. But I'm not in favor of making all > >> implementations more complex to help optimize bad implementations. > >>> We should however warn implementers of this problem. > >>> > >>> Hervé. > >>> > >>> > >>>> -----Original Message----- > >>>> From: James M Snell [mailto:jasnell@gmail.com] > >>>> Sent: mercredi 3 juillet 2013 02:36 > >>>> To: Roberto Peon > >>>> Cc: ietf-http-wg@w3.org > >>>> Subject: Re: Header Compression Clarifications > >>>> > >>>> Yes, I was simplifying :-) I think that rule should work.. > >>>> particularly in that it allows me to avoid having to check for as > >>>> many of these weird edge cases. > >>>> > >>>> On Tue, Jul 2, 2013 at 5:27 PM, Roberto Peon <grmocg@gmail.com> > >> wrote: > >>>>> Correct (assuming the overhead per item was assumed to be zero, > >>>>> which isn't the case, but is good in example-land :) ) > >>>>> > >>>>> -=R > >>>>> > >>>>> > >>>>> On Tue, Jul 2, 2013 at 5:18 PM, James M Snell <jasnell@gmail.com> > >> wrote: > >>>>>> > >>>>>> So to make sure I have it right... Given the two examples I gave... > >>>>>> > >>>>>> Header Table, Max size = 15 > >>>>>> > >>>>>> 1 A = B > >>>>>> 2 C = D > >>>>>> 3 E = F > >>>>>> 4 G = H > >>>>>> 5 I = J > >>>>>> > >>>>>> Substitute #5 with FOOBARBAZ = 123456 > >>>>>> > >>>>>> The result would be a Header table with one item "FOOBARBAZ = > >> 123456" > >>>>>> > >>>>>> And... > >>>>>> > >>>>>> Header Table, Max size = 20 > >>>>>> > >>>>>> 1 A = B > >>>>>> 2 C = D > >>>>>> 3 E = F > >>>>>> 4 G = H > >>>>>> 5 I = J > >>>>>> 6 K = L > >>>>>> 7 M = N > >>>>>> > >>>>>> Substitute #3 with FOOBARBAZ = 123456 > >>>>>> > >>>>>> The result would be a Header table with three items... > >>>>>> > >>>>>> FOOBARBAZ = 123456 > >>>>>> K = L > >>>>>> M = N > >>>>>> > >>>>>> Is that correct? > >>>>>> > >>>>>> On Tue, Jul 2, 2013 at 5:07 PM, Roberto Peon <grmocg@gmail.com> > >>>> wrote: > >>>>>>> The biggest reason that I don't like this is that it requires the > >>>>>>> encoder keep more state. > >>>>>>> I prefer to make this simple by having an easy-to-follow rule for > >>>>>>> when it the slot it would have replaced would have been evicted > >>>>>>> (once all predecessors to that slot have been evicted, then > >>>>>>> elements following the element-to-be-replaced are removed, leaving > >>>>>>> the new element at the head of the list). > >>>>>>> > >>>>>>> The pseudo code for this is: > >>>>>>> > >>>>>>> if not replacement_idx or new_element_size > max_table_size: > >>>>>>> PROTOCOL_ERROR() > >>>>>>> if max_table_size ==new_element_size: > >>>>>>> table.clear() > >>>>>>> table[0] = new_element > >>>>>>> return > >>>>>>> > >>>>>>> # above is boilerplate true for any algorithm > >>>>>>> > >>>>>>> table[replacement_idx].clear() > >>>>>>> table[replacement_idx].pin() > >>>>>>> first_non_pinned = 0 > >>>>>>> while new_element_size + table_byte_size() > max_table_size: > >>>>>>> if table[first_non_pinned].pinned(): > >>>>>>> ++first_non_pinned > >>>>>>> continue > >>>>>>> table[first_non_pinned].pop() > >>>>>>> > >>>>>>> This adds some small complexity here, but it makes encoding > >>>>>>> significantly easier (you can have a naive encoder which leaps > >>>>>>> without looking, which is far less complicated than having to look > >>>>>>> before leap, and may still prove reasonable in terms of compressor > >>>>>>> efficiency). > >>>>>>> > >>>>>>> I admit that I'm attracted to your idea. I just am afraid of what > >>>>>>> it makes the encoder look like :) -=R > >>>>>>> > >>>>>>> > >>>>>>> On Tue, Jul 2, 2013 at 4:37 PM, James M Snell <jasnell@gmail.com> > >>>> wrote: > >>>>>>>> > >>>>>>>> On Tue, Jul 2, 2013 at 4:00 PM, Roberto Peon <grmocg@gmail.com> > >>>> wrote: > >>>>>>>> [snip] > >>>>>>>>> > >>>>>>>>> So, an example: > >>>>>>>>> Imagine that you're replacing entry #10 with something 10 > >>>>>>>>> characters long. > >>>>>>>>> The previous entry in that slot was 5 characters long, and the > >>>>>>>>> table was already at max size. > >>>>>>>>> This implies that you need to get rid of 5 characters before > >>>>>>>>> replacing. > >>>>>>>>> Assuming that items 1 and 2 are the oldest items and item 1 is 3 > >>>>>>>>> chars, and item 2 is 3 chars, you need to pop two. > >>>>>>>>> > >>>>>>>>> You now stick the 10 characters into what was formerly entry #10. > >>>>>>>>> [snip] > >>>>>>>> > >>>>>>>> That's problematic too. Let's go back to my example: > >>>>>>>> > >>>>>>>> Header Table, Max size = 15 > >>>>>>>> > >>>>>>>> 1 A = B > >>>>>>>> 2 C = D > >>>>>>>> 3 E = F > >>>>>>>> 4 G = H > >>>>>>>> 5 I = J > >>>>>>>> > >>>>>>>> Substitute #5 with FOOBARBAZ = 123456 > >>>>>>>> > >>>>>>>> Obviously, we end up popping all five entries, saying "stick the > >>>>>>>> new characters into what was formerly entry #5" does not make any > >>>>>>>> sense because the thing that was "formerly entry #5" no longer > >> exists. > >>>>>>>> > >>>>>>>> Now a variation on the same problem: > >>>>>>>> > >>>>>>>> Header Table, Max size = 20 > >>>>>>>> > >>>>>>>> 1 A = B > >>>>>>>> 2 C = D > >>>>>>>> 3 E = F > >>>>>>>> 4 G = H > >>>>>>>> 5 I = J > >>>>>>>> 6 K = L > >>>>>>>> 7 M = N > >>>>>>>> > >>>>>>>> Substitute #3 with FOOBARBAZ = 123456 > >>>>>>>> > >>>>>>>> We begin popping things off to make room before doing the > >>>>>>>> substitution... 4 entries are removed, including the item being > >>>>>>>> replaced... leaving > >>>>>>>> > >>>>>>>> 1 I = J > >>>>>>>> 2 K = L > >>>>>>>> 3 M = N > >>>>>>>> > >>>>>>>> What exactly do we replace? Are we replacing "M = N" (the current > >>>> #3)? > >>>>>>>> If so, how does that sync up with the "thing that was formerly > >>>>>>>> entry #3" idea? > >>>>>>>> > >>>>>>>> I think the only reliable approach is to substitute AFTER freeing > >>>>>>>> up space, substitute into whatever is in the index position after > >>>>>>>> freeing up space, and if nothing is in that space, return an > >>>>>>>> error. This means that the sender has to be careful to avoid > >>>>>>>> getting into this state in the first place, which means very > >>>>>>>> careful control over when and how substitution is being used. > >>>>>>>> Given the current eviction strategy, that would be the most > >>>>>>>> reliable approach I think. So in the two examples above, the > >>>>>>>> first case returns an error and the second case results in "M = > >>>>>>>> N" being > >>>> replaced. > >>>>>>> > >>>>>>> > >>>>> > >>>>> > >>> > >> > >> _________________________________________________________ > >> Michael Sweet, Senior Printing System Engineer, PWG Chair > > > > _________________________________________________________ > Michael Sweet, Senior Printing System Engineer, PWG Chair > _________________________________________________________ Michael Sweet, Senior Printing System Engineer, PWG Chair
Received on Thursday, 4 July 2013 15:09:46 UTC