Re: Header Table and Static Table Indicies Switched from Roberto Peon on 2014-08-02 (ietf-http-wg@w3.org from July to September 2014)

From: Roberto Peon <grmocg@gmail.com>
Date: Fri, 1 Aug 2014 17:53:47 -0700
To: Jeff Pinner <jpinner@twitter.com>
Cc: Jason Greene <jason.greene@redhat.com>, Greg Wilkins <gregw@intalio.com>, HTTP Working Group <ietf-http-wg@w3.org>
Message-ID: <CAP+FsNcAQwbKbTOcc=Uw+Q3zdsVQ4dwymKvj0e+Zv78JG4yQtQ@mail.gmail.com>

I don't recall seeing consensus on that particular change, and the
discussion (at least AFAICT) certainly didn't reach a resolution on it.

I think that the change is likely to decrease compressor efficiency, and it
*certainly* will for any algorithm which searches the state context first
to see if there are any exact matches.

In order to know which index one must use, one must scan the entire table
(this didn't use to be the case with the reference set, but that is gone) and
look for matches for each header. If one scans the static table first, then
one very likely wastes CPU. One can optimize a bit by only doing so if the
length of the value is < max_static_table_value_length.
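
A minimal sketch of that lookup order, assuming a toy static table and
made-up names (find_exact, STATIC_TABLE, MAX_STATIC_VALUE_LEN are
illustrative, not from any implementation). It checks the dynamic table
first and only scans the static table when the value is short enough to
possibly be there. The sketch uses <= rather than < so that the longest
static value can still match. Positions returned are 0-based table
positions, not wire indexes.

```python
# Hypothetical, truncated static table for illustration only;
# the real one has many more entries.
STATIC_TABLE = [
    (":method", "GET"),
    (":method", "POST"),
    (":path", "/"),
    (":scheme", "https"),
    ("accept-encoding", "gzip, deflate"),
]

# Longest value in the static table; longer values cannot match exactly.
MAX_STATIC_VALUE_LEN = max(len(v) for _, v in STATIC_TABLE)


def find_exact(name, value, dynamic_table):
    """Return (table_name, position) for an exact match, else None.

    dynamic_table is assumed to be a list of (name, value) tuples.
    """
    # Search the connection's own state first.
    for i, entry in enumerate(dynamic_table):
        if entry == (name, value):
            return ("dynamic", i)
    # Scan the static table only if the value could possibly be in it.
    if len(value) <= MAX_STATIC_VALUE_LEN:
        for i, entry in enumerate(STATIC_TABLE):
            if entry == (name, value):
                return ("static", i)
    return None
```

With this ordering, a custom header that repeats on every request is found
without touching the static table at all.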

The new approach affords increased efficiency to the first request, at the
cost of decreased efficiency to any subsequent request.
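
The byte-size side of that trade-off can be sketched with the figures
quoted further down (61 static entries, a 7-bit index prefix). This is
only an illustration under those assumptions: hpack_int_len follows the
prefixed-integer encoding used by HPACK, while index_len and its
static_first flag are hypothetical helpers for comparing the two
orderings.

```python
def hpack_int_len(value, prefix_bits):
    """Bytes needed to encode value as a prefixed integer."""
    limit = (1 << prefix_bits) - 1
    if value < limit:
        return 1  # fits entirely in the prefix
    value -= limit
    n = 2  # prefix byte plus at least one continuation byte
    while value >= 128:
        value >>= 7
        n += 1
    return n


STATIC_ENTRIES = 61  # figure quoted in the thread


def index_len(position, static_first, prefix_bits=7):
    """Encoded size of the index for the position-th (1-based) dynamic entry."""
    index = position + STATIC_ENTRIES if static_first else position
    return hpack_int_len(index, prefix_bits)
```

Under these assumptions, with the static table first the 65th dynamic
entry still has a one-byte index and the 66th needs two bytes; with the
dynamic table first, the first 126 dynamic entries all fit in one byte.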

I'll revert it as soon as possible.

-=R
On Aug 1, 2014 4:02 PM, "Jeff Pinner" <jpinner@twitter.com> wrote:

> Greg,
>
> I run a reverse proxy that handles things like authentication,
> rate-limiting, etc. for the backend servers I talk to.
>
> This means lots of custom headers that annotate requests with that
> information.
>
> There are also lots of public APIs that use custom headers and these
> have to all be proxied using names not in the header table. Here's a
> google example:
>
>
> https://developers.google.com/youtube/2.0/developers_guide_protocol_resumable_uploads
>
> So the TL;DR is that in browser use cases, yes most of the headers
> sent will be in the static table. But for APIs, internal requests,
> etc. the majority of headers will not be :(
>
> - Jeff
>
> On Fri, Aug 1, 2014 at 3:39 PM, Greg Wilkins <gregw@intalio.com> wrote:
> >
> > On 2 August 2014 03:37, Jeff Pinner <jpinner@twitter.com> wrote:
> >>
> >> Must have missed the connection between removing the "reference set"
> >> and switching the table order.
> >>
> >> I am happy to show data on how it is worse, specifically encoding
> >> header names indices is now 200% worse ;)
> >
> >
> >
> > Jeff,
> >
> > I used the test data set and used header table sizes from 0 to 16KB.
> >
> > There are 126 indexes that can be sent as a single byte, so the 61 static
> > entries take only half of those. You have to have more than 65 indexed
> > headers before any two byte name indexes will be used... and then it
> > still has to be a name that has not been used for one of those 65 custom
> > entries.
> >
> > Sure you can craft a data set that does end up using a lot of 2 byte name
> > indexes, but I'd be amazed if it was from anything approaching normal
> > traffic.
> >
> > If there are larger normal traffic data sets available, then I'm
> > happy to run the numbers again.
> >
> > cheers
> >
> > --
> > Greg Wilkins <gregw@intalio.com>
> > http://eclipse.org/jetty HTTP, SPDY, Websocket server and client that
> > scales
> > http://www.webtide.com  advice and support for jetty and cometd.
>
>

Received on Saturday, 2 August 2014 00:54:14 UTC