- From: Gábor Molnár <gabor.molnar@sch.bme.hu>
- Date: Thu, 04 Jul 2013 23:28:16 +0200
- To: Roberto Peon <grmocg@gmail.com>
- Cc: RUELLAN Herve <Herve.Ruellan@crf.canon.fr>, James M Snell <jasnell@gmail.com>, "ietf-http-wg@w3.org" <ietf-http-wg@w3.org>
- Message-id: <CA+KJw_5G3gzEQZCB3jVA0_OYT6r127kvif2YyRcK8CFcbsXBQA@mail.gmail.com>
Let's consider the memory management of a real implementation. When there's an incoming "Literal Header Representation with Substitution Indexing" , the implementation first needs to assemble the header name and value *somewhere* in memory. After that, it probably passes around pointers to these buffers and the header table is just an array of pointers plus administration data. I think there's two interesting consequences of this: * since the new header is already in the memory somewhere, it doesn't really matter if we do add-then-enforce or enforce-then-add, both the old header table and the new name-value pair has to fit into the memory * the size of name-value pairs should be bounded as well I'm not very experienced in using programming languages with manual memory management, so please correct me if that's not a reasonable model for a C implementation for example. 2013/7/4 Roberto Peon <grmocg@gmail.com> > The approach of add-then-enforce guarantee has a glaring and huge > problem-- it cannot guarantee the amount of memory I'll use. > I'd much rather deal with some small complexity (demonstrably not big) > here and have that guarantee. > -=R > > > On Thu, Jul 4, 2013 at 6:09 AM, RUELLAN Herve <Herve.Ruellan@crf.canon.fr>wrote: > >> Trying to catch on the discussion, I see three proposals for solving the >> problem of substitution and eviction. >> >> 1. Size adjustment BEFORE doing the substitution. >> This has several edge case problems as James showed and would lead to >> complex implementations. >> >> 2. Size adjustment AFTER doing the substitution. >> The problem is that the new entry may be dropped just after adding it. >> >> 3. Size adjustment BEFORE doing the substitution with pinning of the >> replaced entry (Roberto's proposal). >> >> >> I think that 2 and 3 are in fact close together. >> With 2, a bad encoder could just have its new entry dropped from the >> table just after adding it. However a good encoder could use an algorithm >> like the one propose in 3 to find the right entry to replace and prevent >> dropping the new entry. >> On the decoder side, 2 is simpler. >> >> Taking James' example for what a "good" encoder should do. >> With existing table (max size 15) >> 0: FOO = 123 >> 1: BAR = 321 >> 2: BA = Z >> >> The encoder wants to substitute: TESTING = FOO at Index #0 (because entry >> #0 is old or whatever). >> It detects that index #0 and #1 need to be dropped, and therefore adjust >> the substitution to replace Index #2. >> It then can apply the substitution, and after that adjust the size. >> >> >> So my preference is for the second solution. True, it can lead to >> under-optimal usage of the header table. But I'm not in favor of making all >> implementations more complex to help optimize bad implementations. >> We should however warn implementers of this problem. >> >> Hervé. >> >> >> > -----Original Message----- >> > From: James M Snell [mailto:jasnell@gmail.com] >> > Sent: mercredi 3 juillet 2013 02:36 >> > To: Roberto Peon >> > Cc: ietf-http-wg@w3.org >> > Subject: Re: Header Compression Clarifications >> > >> > Yes, I was simplifying :-) I think that rule should work.. >> > particularly in that it allows me to avoid having to check for as many >> of these >> > weird edge cases. >> > >> > On Tue, Jul 2, 2013 at 5:27 PM, Roberto Peon <grmocg@gmail.com> wrote: >> > > Correct (assuming the overhead per item was assumed to be zero, which >> > > isn't the case, but is good in example-land :) ) >> > > >> > > -=R >> > > >> > > >> > > On Tue, Jul 2, 2013 at 5:18 PM, James M Snell <jasnell@gmail.com> >> wrote: >> > >> >> > >> So to make sure I have it right... Given the two examples I gave... >> > >> >> > >> Header Table, Max size = 15 >> > >> >> > >> 1 A = B >> > >> 2 C = D >> > >> 3 E = F >> > >> 4 G = H >> > >> 5 I = J >> > >> >> > >> Substitute #5 with FOOBARBAZ = 123456 >> > >> >> > >> The result would be a Header table with one item "FOOBARBAZ = 123456" >> > >> >> > >> And... >> > >> >> > >> Header Table, Max size = 20 >> > >> >> > >> 1 A = B >> > >> 2 C = D >> > >> 3 E = F >> > >> 4 G = H >> > >> 5 I = J >> > >> 6 K = L >> > >> 7 M = N >> > >> >> > >> Substitute #3 with FOOBARBAZ = 123456 >> > >> >> > >> The result would be a Header table with three items... >> > >> >> > >> FOOBARBAZ = 123456 >> > >> K = L >> > >> M = N >> > >> >> > >> Is that correct? >> > >> >> > >> On Tue, Jul 2, 2013 at 5:07 PM, Roberto Peon <grmocg@gmail.com> >> > wrote: >> > >> > The biggest reason that I don't like this is that it requires the >> > >> > encoder keep more state. >> > >> > I prefer to make this simple by having an easy-to-follow rule for >> > >> > when it the slot it would have replaced would have been evicted >> > >> > (once all predecessors to that slot have been evicted, then >> > >> > elements following the element-to-be-replaced are removed, leaving >> > >> > the new element at the head of the list). >> > >> > >> > >> > The pseudo code for this is: >> > >> > >> > >> > if not replacement_idx or new_element_size > max_table_size: >> > >> > PROTOCOL_ERROR() >> > >> > if max_table_size ==new_element_size: >> > >> > table.clear() >> > >> > table[0] = new_element >> > >> > return >> > >> > >> > >> > # above is boilerplate true for any algorithm >> > >> > >> > >> > table[replacement_idx].clear() >> > >> > table[replacement_idx].pin() >> > >> > first_non_pinned = 0 >> > >> > while new_element_size + table_byte_size() > max_table_size: >> > >> > if table[first_non_pinned].pinned(): >> > >> > ++first_non_pinned >> > >> > continue >> > >> > table[first_non_pinned].pop() >> > >> > >> > >> > This adds some small complexity here, but it makes encoding >> > >> > significantly easier (you can have a naive encoder which leaps >> > >> > without looking, which is far less complicated than having to look >> > >> > before leap, and may still prove reasonable in terms of compressor >> > >> > efficiency). >> > >> > >> > >> > I admit that I'm attracted to your idea. I just am afraid of what >> > >> > it makes the encoder look like :) -=R >> > >> > >> > >> > >> > >> > On Tue, Jul 2, 2013 at 4:37 PM, James M Snell <jasnell@gmail.com> >> > wrote: >> > >> >> >> > >> >> On Tue, Jul 2, 2013 at 4:00 PM, Roberto Peon <grmocg@gmail.com> >> > wrote: >> > >> >> [snip] >> > >> >> > >> > >> >> > So, an example: >> > >> >> > Imagine that you're replacing entry #10 with something 10 >> > >> >> > characters long. >> > >> >> > The previous entry in that slot was 5 characters long, and the >> > >> >> > table was already at max size. >> > >> >> > This implies that you need to get rid of 5 characters before >> > >> >> > replacing. >> > >> >> > Assuming that items 1 and 2 are the oldest items and item 1 is 3 >> > >> >> > chars, and item 2 is 3 chars, you need to pop two. >> > >> >> > >> > >> >> > You now stick the 10 characters into what was formerly entry >> #10. >> > >> >> >[snip] >> > >> >> >> > >> >> That's problematic too. Let's go back to my example: >> > >> >> >> > >> >> Header Table, Max size = 15 >> > >> >> >> > >> >> 1 A = B >> > >> >> 2 C = D >> > >> >> 3 E = F >> > >> >> 4 G = H >> > >> >> 5 I = J >> > >> >> >> > >> >> Substitute #5 with FOOBARBAZ = 123456 >> > >> >> >> > >> >> Obviously, we end up popping all five entries, saying "stick the >> > >> >> new characters into what was formerly entry #5" does not make any >> > >> >> sense because the thing that was "formerly entry #5" no longer >> exists. >> > >> >> >> > >> >> Now a variation on the same problem: >> > >> >> >> > >> >> Header Table, Max size = 20 >> > >> >> >> > >> >> 1 A = B >> > >> >> 2 C = D >> > >> >> 3 E = F >> > >> >> 4 G = H >> > >> >> 5 I = J >> > >> >> 6 K = L >> > >> >> 7 M = N >> > >> >> >> > >> >> Substitute #3 with FOOBARBAZ = 123456 >> > >> >> >> > >> >> We begin popping things off to make room before doing the >> > >> >> substitution... 4 entries are removed, including the item being >> > >> >> replaced... leaving >> > >> >> >> > >> >> 1 I = J >> > >> >> 2 K = L >> > >> >> 3 M = N >> > >> >> >> > >> >> What exactly do we replace? Are we replacing "M = N" (the current >> > #3)? >> > >> >> If so, how does that sync up with the "thing that was formerly >> > >> >> entry #3" idea? >> > >> >> >> > >> >> I think the only reliable approach is to substitute AFTER freeing >> > >> >> up space, substitute into whatever is in the index position after >> > >> >> freeing up space, and if nothing is in that space, return an >> > >> >> error. This means that the sender has to be careful to avoid >> > >> >> getting into this state in the first place, which means very >> > >> >> careful control over when and how substitution is being used. >> > >> >> Given the current eviction strategy, that would be the most >> > >> >> reliable approach I think. So in the two examples above, the first >> > >> >> case returns an error and the second case results in "M = N" being >> > replaced. >> > >> > >> > >> > >> > > >> > > >> >> >
Received on Thursday, 4 July 2013 21:29:07 UTC