Re: Consequences of removing the reference set from Roberto Peon on 2014-07-25 (ietf-http-wg@w3.org from July to September 2014)

From: Roberto Peon <grmocg@gmail.com>
Date: Fri, 25 Jul 2014 10:55:20 -0700
To: Martin Thomson <martin.thomson@gmail.com>
Cc: Martin Nilsson <nilsson@opera.com>, HTTP Working Group <ietf-http-wg@w3.org>
Message-ID: <CAP+FsNc6sYObQ4Vz8r519qv0rHJa3LNh71atXDfKgjwv51-UVw@mail.gmail.com>

So, we probably need to regenerate the Huffman encoding (assuming we care)
because we've changed how the compressor works, and thus changed the
frequencies of the various symbols that will appear.
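
To make the dependence on symbol frequencies concrete, here is a minimal
sketch of deriving Huffman code lengths from a frequency count. It is the
textbook construction, not the procedure used to build the HPACK table; the
function name and the sample byte strings are hypothetical.

```python
import heapq
from collections import Counter


def huffman_code_lengths(freqs):
    """Return {symbol: code length in bits} for a {symbol: count} map."""
    if len(freqs) == 1:
        # A single symbol still needs one bit.
        return {sym: 1 for sym in freqs}
    # Heap entries are (weight, tie_breaker, symbols in this subtree).
    # The unique tie_breaker keeps the symbol lists from being compared.
    heap = [(w, i, [sym]) for i, (sym, w) in enumerate(freqs.items())]
    heapq.heapify(heap)
    lengths = dict.fromkeys(freqs, 0)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, s1 = heapq.heappop(heap)
        w2, _, s2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        for sym in s1 + s2:
            lengths[sym] += 1
        heapq.heappush(heap, (w1 + w2, next_id, s1 + s2))
        next_id += 1
    return lengths


# Hypothetical inputs: the same values with and without '\0' separators.
with_nul = Counter(b"a=1\x00b=2\x00c=3")
without_nul = Counter(b"a=1b=2c=3")
print(huffman_code_lengths(with_nul))
print(huffman_code_lengths(without_nul))
```

Dropping a symbol from the input (here the '\0' byte) changes the counts and
can shift the lengths assigned to everything else, which is why the table
would need to be recomputed from fresh data.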

The Huffman table offers a clue about how often the values of a key were
concatenated together with '\0':

  0)  |11111111|11000                             1ff8  [13]

I'd assume that this is mainly from set-cookie, but the nature of the input
is such that we cannot know (it was computed in aggregate).
At that size, it is likely to be cheaper to encode set-cookie without using
'\0', and so I think there is some evidence that Martin's change is neutral.
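
A rough back-of-the-envelope sketch of that comparison: the 13 bits come
from the table entry above, and the 2 bytes per reference is the figure
Martin uses in the quoted message below. The function names are
hypothetical, and this ignores every other cost (literal vs. indexed
representation, table evictions, and so on).

```python
NUL_CODE_BITS = 13        # symbol 0 -> 1ff8 [13] in the Huffman table
REFERENCE_BITS = 2 * 8    # assumption: 2 bytes per extra indexed reference


def nul_separator_bits(n_values):
    """Extra bits to join n values of one field with '\\0' separators."""
    return max(n_values - 1, 0) * NUL_CODE_BITS


def extra_reference_bits(n_values):
    """Extra bits to emit n values as separate fields, one reference each
    beyond the first (under the 2-byte assumption above)."""
    return max(n_values - 1, 0) * REFERENCE_BITS


for n in (1, 2, 5):
    print(n, nul_separator_bits(n), extra_reference_bits(n))
```

Under these assumptions the two approaches differ by only 3 bits per extra
value, which is the sense in which the change looks close to neutral.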

I really do hate that we have no idea what all of these changes
will actually do, btw.

-=R

On Fri, Jul 25, 2014 at 7:37 AM, Martin Thomson <martin.thomson@gmail.com>
wrote:

> On 24 July 2014 21:59, Martin Nilsson <nilsson@opera.com> wrote:
> > In my mobile testdata I have 2086062 fields in 203586 headers with the
> > following duplicates
>
> Assuming that all those headers were encoded based on a previous
> request, the incremental cost of having to index the extra name-value
> pairs is 10k (at 2 bytes per reference) at the absolute extreme if we
> remove '\0'.  That's generous in the extreme.  Any that appear once do
> not benefit from '\0'; any that appear on different connections do not
> benefit; and any that have changing values do not benefit.
>
> If you really care, a header field that uses valid syntax can be
> concatenated with a comma.
>
>

Received on Friday, 25 July 2014 18:01:23 UTC