- From: Gábor Molnár <gabor.molnar@sch.bme.hu>
- Date: Fri, 05 Jul 2013 04:44:19 +0200
- To: Roberto Peon <grmocg@gmail.com>
- Cc: RUELLAN Herve <Herve.Ruellan@crf.canon.fr>, James M Snell <jasnell@gmail.com>, "ietf-http-wg@w3.org" <ietf-http-wg@w3.org>
- Message-id: <CA+KJw_4EQctQWRFQ=hQ8FVO8rBTGv_YoNGULBxxccQapChvcgg@mail.gmail.com>
2013/7/4 Gábor Molnár <gabor.molnar@sch.bme.hu> > Let's consider the memory management of a real implementation. When > there's an incoming "Literal Header Representation with Substitution > Indexing" , the implementation first needs to assemble the header name and > value *somewhere* in memory. After that, it probably passes around pointers > to these buffers and the header table is just an array of pointers plus > administration data. > > I think there's two interesting consequences of this: > * since the new header is already in the memory somewhere, it doesn't > really matter if we do add-then-enforce or enforce-then-add, both the old > header table and the new name-value pair has to fit into the memory > * the size of name-value pairs should be bounded as well > Just realized that bounding the size of name-value pairs is probably not an option, since HTTP does not limit header name/value length. > > I'm not very experienced in using programming languages with manual memory > management, so please correct me if that's not a reasonable model for a C > implementation for example. > > > 2013/7/4 Roberto Peon <grmocg@gmail.com> > >> The approach of add-then-enforce guarantee has a glaring and huge >> problem-- it cannot guarantee the amount of memory I'll use. >> I'd much rather deal with some small complexity (demonstrably not big) >> here and have that guarantee. >> -=R >> >> >> On Thu, Jul 4, 2013 at 6:09 AM, RUELLAN Herve <Herve.Ruellan@crf.canon.fr >> > wrote: >> >>> Trying to catch on the discussion, I see three proposals for solving the >>> problem of substitution and eviction. >>> >>> 1. Size adjustment BEFORE doing the substitution. >>> This has several edge case problems as James showed and would lead to >>> complex implementations. >>> >>> 2. Size adjustment AFTER doing the substitution. >>> The problem is that the new entry may be dropped just after adding it. >>> >>> 3. Size adjustment BEFORE doing the substitution with pinning of the >>> replaced entry (Roberto's proposal). >>> >>> >>> I think that 2 and 3 are in fact close together. >>> With 2, a bad encoder could just have its new entry dropped from the >>> table just after adding it. However a good encoder could use an algorithm >>> like the one propose in 3 to find the right entry to replace and prevent >>> dropping the new entry. >>> On the decoder side, 2 is simpler. >>> >>> Taking James' example for what a "good" encoder should do. >>> With existing table (max size 15) >>> 0: FOO = 123 >>> 1: BAR = 321 >>> 2: BA = Z >>> >>> The encoder wants to substitute: TESTING = FOO at Index #0 (because >>> entry #0 is old or whatever). >>> It detects that index #0 and #1 need to be dropped, and therefore adjust >>> the substitution to replace Index #2. >>> It then can apply the substitution, and after that adjust the size. >>> >>> >>> So my preference is for the second solution. True, it can lead to >>> under-optimal usage of the header table. But I'm not in favor of making all >>> implementations more complex to help optimize bad implementations. >>> We should however warn implementers of this problem. >>> >>> Hervé. >>> >>> >>> > -----Original Message----- >>> > From: James M Snell [mailto:jasnell@gmail.com] >>> > Sent: mercredi 3 juillet 2013 02:36 >>> > To: Roberto Peon >>> > Cc: ietf-http-wg@w3.org >>> > Subject: Re: Header Compression Clarifications >>> > >>> > Yes, I was simplifying :-) I think that rule should work.. >>> > particularly in that it allows me to avoid having to check for as many >>> of these >>> > weird edge cases. >>> > >>> > On Tue, Jul 2, 2013 at 5:27 PM, Roberto Peon <grmocg@gmail.com> wrote: >>> > > Correct (assuming the overhead per item was assumed to be zero, which >>> > > isn't the case, but is good in example-land :) ) >>> > > >>> > > -=R >>> > > >>> > > >>> > > On Tue, Jul 2, 2013 at 5:18 PM, James M Snell <jasnell@gmail.com> >>> wrote: >>> > >> >>> > >> So to make sure I have it right... Given the two examples I gave... >>> > >> >>> > >> Header Table, Max size = 15 >>> > >> >>> > >> 1 A = B >>> > >> 2 C = D >>> > >> 3 E = F >>> > >> 4 G = H >>> > >> 5 I = J >>> > >> >>> > >> Substitute #5 with FOOBARBAZ = 123456 >>> > >> >>> > >> The result would be a Header table with one item "FOOBARBAZ = >>> 123456" >>> > >> >>> > >> And... >>> > >> >>> > >> Header Table, Max size = 20 >>> > >> >>> > >> 1 A = B >>> > >> 2 C = D >>> > >> 3 E = F >>> > >> 4 G = H >>> > >> 5 I = J >>> > >> 6 K = L >>> > >> 7 M = N >>> > >> >>> > >> Substitute #3 with FOOBARBAZ = 123456 >>> > >> >>> > >> The result would be a Header table with three items... >>> > >> >>> > >> FOOBARBAZ = 123456 >>> > >> K = L >>> > >> M = N >>> > >> >>> > >> Is that correct? >>> > >> >>> > >> On Tue, Jul 2, 2013 at 5:07 PM, Roberto Peon <grmocg@gmail.com> >>> > wrote: >>> > >> > The biggest reason that I don't like this is that it requires the >>> > >> > encoder keep more state. >>> > >> > I prefer to make this simple by having an easy-to-follow rule for >>> > >> > when it the slot it would have replaced would have been evicted >>> > >> > (once all predecessors to that slot have been evicted, then >>> > >> > elements following the element-to-be-replaced are removed, leaving >>> > >> > the new element at the head of the list). >>> > >> > >>> > >> > The pseudo code for this is: >>> > >> > >>> > >> > if not replacement_idx or new_element_size > max_table_size: >>> > >> > PROTOCOL_ERROR() >>> > >> > if max_table_size ==new_element_size: >>> > >> > table.clear() >>> > >> > table[0] = new_element >>> > >> > return >>> > >> > >>> > >> > # above is boilerplate true for any algorithm >>> > >> > >>> > >> > table[replacement_idx].clear() >>> > >> > table[replacement_idx].pin() >>> > >> > first_non_pinned = 0 >>> > >> > while new_element_size + table_byte_size() > max_table_size: >>> > >> > if table[first_non_pinned].pinned(): >>> > >> > ++first_non_pinned >>> > >> > continue >>> > >> > table[first_non_pinned].pop() >>> > >> > >>> > >> > This adds some small complexity here, but it makes encoding >>> > >> > significantly easier (you can have a naive encoder which leaps >>> > >> > without looking, which is far less complicated than having to look >>> > >> > before leap, and may still prove reasonable in terms of compressor >>> > >> > efficiency). >>> > >> > >>> > >> > I admit that I'm attracted to your idea. I just am afraid of what >>> > >> > it makes the encoder look like :) -=R >>> > >> > >>> > >> > >>> > >> > On Tue, Jul 2, 2013 at 4:37 PM, James M Snell <jasnell@gmail.com> >>> > wrote: >>> > >> >> >>> > >> >> On Tue, Jul 2, 2013 at 4:00 PM, Roberto Peon <grmocg@gmail.com> >>> > wrote: >>> > >> >> [snip] >>> > >> >> > >>> > >> >> > So, an example: >>> > >> >> > Imagine that you're replacing entry #10 with something 10 >>> > >> >> > characters long. >>> > >> >> > The previous entry in that slot was 5 characters long, and the >>> > >> >> > table was already at max size. >>> > >> >> > This implies that you need to get rid of 5 characters before >>> > >> >> > replacing. >>> > >> >> > Assuming that items 1 and 2 are the oldest items and item 1 is >>> 3 >>> > >> >> > chars, and item 2 is 3 chars, you need to pop two. >>> > >> >> > >>> > >> >> > You now stick the 10 characters into what was formerly entry >>> #10. >>> > >> >> >[snip] >>> > >> >> >>> > >> >> That's problematic too. Let's go back to my example: >>> > >> >> >>> > >> >> Header Table, Max size = 15 >>> > >> >> >>> > >> >> 1 A = B >>> > >> >> 2 C = D >>> > >> >> 3 E = F >>> > >> >> 4 G = H >>> > >> >> 5 I = J >>> > >> >> >>> > >> >> Substitute #5 with FOOBARBAZ = 123456 >>> > >> >> >>> > >> >> Obviously, we end up popping all five entries, saying "stick the >>> > >> >> new characters into what was formerly entry #5" does not make any >>> > >> >> sense because the thing that was "formerly entry #5" no longer >>> exists. >>> > >> >> >>> > >> >> Now a variation on the same problem: >>> > >> >> >>> > >> >> Header Table, Max size = 20 >>> > >> >> >>> > >> >> 1 A = B >>> > >> >> 2 C = D >>> > >> >> 3 E = F >>> > >> >> 4 G = H >>> > >> >> 5 I = J >>> > >> >> 6 K = L >>> > >> >> 7 M = N >>> > >> >> >>> > >> >> Substitute #3 with FOOBARBAZ = 123456 >>> > >> >> >>> > >> >> We begin popping things off to make room before doing the >>> > >> >> substitution... 4 entries are removed, including the item being >>> > >> >> replaced... leaving >>> > >> >> >>> > >> >> 1 I = J >>> > >> >> 2 K = L >>> > >> >> 3 M = N >>> > >> >> >>> > >> >> What exactly do we replace? Are we replacing "M = N" (the current >>> > #3)? >>> > >> >> If so, how does that sync up with the "thing that was formerly >>> > >> >> entry #3" idea? >>> > >> >> >>> > >> >> I think the only reliable approach is to substitute AFTER freeing >>> > >> >> up space, substitute into whatever is in the index position after >>> > >> >> freeing up space, and if nothing is in that space, return an >>> > >> >> error. This means that the sender has to be careful to avoid >>> > >> >> getting into this state in the first place, which means very >>> > >> >> careful control over when and how substitution is being used. >>> > >> >> Given the current eviction strategy, that would be the most >>> > >> >> reliable approach I think. So in the two examples above, the >>> first >>> > >> >> case returns an error and the second case results in "M = N" >>> being >>> > replaced. >>> > >> > >>> > >> > >>> > > >>> > > >>> >>> >> >
Received on Friday, 5 July 2013 02:45:09 UTC