Re: Significantly reducing headers footprint

On Sun, Jun 10, 2012 at 4:17 PM, Willy Tarreau <w@1wt.eu> wrote:

> Hi,
>
> I recently managed to collect requests from some enterprise proxies to
> experiment with binary encoding as described in our draft [1].
>
> After some experimentation and discussions with some people, I managed to
> get significant gains [2] which could still be improved.
>
> What's currently performed is the following:
>  - message framing
>  - binary encoding of the HTTP version (2 bits)
>  - binary encoding of the method (4 bits)
>  - move of the Host header into the URI
>  - encoding of the URI relative to the previous one
>  - binary encoding of each header field name (1 byte)
>  - encoding of each header relative to the previous one
>  - binary encoding of the If-Modified-Since date
>
> The code achieving this is available at [2]. It's an ugly PoC but it's
> a useful experimentation tool for me, feel free to use it to experiment
> with your own implementations if you like.
>
> I'm already observing request compression ratios of 90-92% on various
> requests, including on a site with a huge page with large cookies and
> URIs; 132 kB of requests were reduced to 10 kB. In fact, while the draft
> suggests the use of multiple header contexts (connection, common and
> message), I now feel that we don't need to store 3 contexts anymore; a
> single one is enough if each request remains relative to the previous one.
>

For my deployment, I'm fairly certain this would not be all that common.
Two contexts, 'connection' and 'common', may be enough, but I think you had
it right the first time.
The more clients you have and are aggregating through to elsewhere, the more
advantageous that scheme becomes.
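
To make the tradeoff concrete, here is a minimal sketch in Python of the
kind of relative header encoding being discussed. The opcodes, the static
name table and the layout are all invented for illustration; they match
neither the draft nor the PoC:

    # Each header is either "same as in the previous request" (2 bytes)
    # or a literal carrying a 1-byte registered name index, a 2-byte
    # length and the raw value. Opcodes and indexes are illustrative.
    NAME_INDEX = {"host": 0, "user-agent": 1, "accept": 2,
                  "accept-language": 3, "referer": 4, "cookie": 5}
    OP_SAME, OP_LITERAL = 0x00, 0x01

    def encode_request(headers, previous):
        out = bytearray()
        for name, value in headers:
            idx = NAME_INDEX[name]
            if previous.get(name) == value:
                out += bytes([OP_SAME, idx])      # unchanged: 2 bytes
            else:
                data = value.encode("utf-8")
                out += bytes([OP_LITERAL, idx])
                out += len(data).to_bytes(2, "big") + data
        return bytes(out)

With a single context holding the previous request's headers, a repeated
User-Agent, Accept-Language or Cookie collapses to two bytes each, which
illustrates where ratios like the 90-92% above can come from.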


>
> But I think that by typing the protocol a bit more, we could improve even
> further and at the same time improve interoperability. Among the things
> I am observing which still take some space in the page load of an online
> newspaper (127 objects, data were anonymized):
>
>  - User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; fr;
> rv:1.9.2.12) Gecko/20101026 Firefox/3.6.12
>    => Well, this one is only sent once over the connection, but we could
>       reduce it further by using a registry of known vendors/products
>       and encourage vendors to emit just a few bytes (vendor/product/version).
>
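
A registry-based User-Agent could indeed be tiny. A hypothetical sketch of
the vendor/product/version idea, with made-up registry numbers:

    # Hypothetical registry: (vendor, product) -> assigned byte values.
    UA_REGISTRY = {("mozilla", "firefox"): (0x01, 0x03)}

    def encode_ua(vendor, product, major, minor):
        v, p = UA_REGISTRY[(vendor, product)]
        return bytes([v, p, major, minor])  # e.g. Firefox 3.6 -> 4 bytes
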
>  - Accept: text/css,*/*;q=0.1
>    => this one changes depending on what object the browser requests, so it
>       is less efficiently compressed:
>
>        1 Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
>        4 Accept: text/css,*/*;q=0.1
>        8 Accept: */*
>        1 Accept: image/png,image/*;q=0.8,*/*;q=0.5
>        2 Accept: */*
>        9 Accept: image/png,image/*;q=0.8,*/*;q=0.5
>        2 Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
>       90 Accept: image/png,image/*;q=0.8,*/*;q=0.5
>        1 Accept: */*
>        9 Accept: image/png,image/*;q=0.8,*/*;q=0.5
>
>    => With better request reordering, we could have this:
>
>       11 Accept: */*
>      109 Accept: image/png,image/*;q=0.8,*/*;q=0.5
>        4 Accept: text/css,*/*;q=0.1
>        3 Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
>

Achieving this seems difficult, though. How would we get a reordering to
occur in a reasonable manner?
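
The size effect itself is easy to show: with a "same as previous" opcode,
each change of Accept value costs a full literal, so what matters is the
number of transitions, not the number of requests. A toy measurement (the
short tokens stand in for the full Accept values, and the byte costs follow
the illustrative sketch earlier in this thread):

    def cost(values):
        # 2 bytes when equal to the previous value, else a small literal
        total, prev = 0, None
        for v in values:
            total += 2 if v == prev else 4 + len(v)
            prev = v
        return total

    observed = (["html"] + ["css"] * 4 + ["any"] * 8 + ["png"] +
                ["any"] * 2 + ["png"] * 9 + ["html"] * 2 +
                ["png"] * 90 + ["any"] + ["png"] * 9)
    print(cost(observed), "vs", cost(sorted(observed)))  # 10 vs 4 literals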


>
>    I'm already wondering if we have *that* many content-types and if we
>    need to use long words such as "application" everywhere.
>

We were quite wordy in the past :)


>
>  - Accept-Language: fr,fr-fr;q=0.8,en-us;q=0.5,en;q=0.3
>    Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
>    Accept-Encoding: gzip,deflate
>
>    => Same comment as above concerning the number of possible values.
>       However, these were all sent identically, so the gain is more for the
>       remote parser than for the upstream link.
>
>  - Referer: http://www.example.com/
>    => referrers do compress quite well relative to each other. Still, there
>       are many blogs and newspapers on the net today with very large URLs,
>       and their URLs cause very large referrers to be sent along with each
>       object composing the page. At least a better ordering of the requests
>       saves a few hundred more bytes for the whole page. In the end I only
>       got 4 different values:
>       http://www.example.com/
>       http://www.example.com/sites/news/files/css/css_RWicSr_h9UxCJrAbE57UbNf_oNYhtaF5YghFXJemVNQ.css
>       http://www.example.com/sites/news/files/css/css_lKoFARDAyB20ibb5wNG8nMhflDNNW_Nb9DsNprYt8mk.css
>       http://www.example.com/sites/news/files/css/css_qSyFGRLc-tslOV1oF9GCzEe1eGDn4PP7vOM1HGymNYU.css
>
>    Among the improvements I'm thinking about, we could decide to use
>    relative URIs when the site is the same. I also don't know whether it's
>    of any use on the server side to know that the request was emitted for
>    a specific CSS.
>
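
One reading of the relative-URI idea, sketched with a hypothetical helper
(a real scheme would also need a flag bit telling the decoder that the
value is relative to the previous one):

    from urllib.parse import urlsplit

    def encode_referer(referer, previous):
        # If both referrers share scheme and host, send only the path;
        # the decoder restores the origin from the previous value.
        a, b = urlsplit(referer), urlsplit(previous)
        if (a.scheme, a.netloc) == (b.scheme, b.netloc):
            return a.path or "/"
        return referer
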
>  - If-Modified-Since: Fri, 27 Apr 2012 14:41:31 GMT
>    => I have encoded this one in 32 and 64 bits and immediately saved 3.1
>       and 2.6 kB respectively. Well, storing 4 more bytes per request might
>       be wasted considering that we probably don't need a nanosecond
>       resolution for 585 years. But 40-48 bits might be fine.
>
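
The date conversion itself is straightforward; here is a sketch of the
32-bit variant (epoch seconds, good until 2106 if unsigned), with a wider
40-48 bit field just being a larger to_bytes() call:

    import struct
    from email.utils import parsedate_to_datetime

    def encode_ims(value):
        # 29 bytes of "Fri, 27 Apr 2012 14:41:31 GMT" become 4 bytes.
        ts = int(parsedate_to_datetime(value).timestamp())
        return struct.pack(">I", ts)

    assert len(encode_ims("Fri, 27 Apr 2012 14:41:31 GMT")) == 4
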
>  - Cache-Control: max-age=0
>    => I suspect the user hit the Refresh button, this was present in about
>       half the requests. Anyway, this raises the question of the length it
>       requires for something which is just a boolean here ("ignore cache").
>       A client probably has very few Cache-Control header values to send,
>       and reducing them to a smaller set would be beneficial.
>
>  - If-None-Match: "3013140661"
>    => I guess there is nothing we can do on this one, except suggest that
>       implementors use more bits and fewer bytes to emit their etags.
>
>  - Cookie: xtvrn=$OaiJty$; xtan327981=c; xtant327981=c; has_js=c;
> __utma=KBjWnx24Q.7qFKqmB7v.i0JDH91L_R.0kU2W1uL49.JM4KtFLV0b.C;
> __utmc=Rae9ZgQHz;
> __utmz=NRSZOcCWV.d5MlK5RJsi.-.f.N8J73w=S1SLuT_j0m.O8|VsIxwE=(jHw58obb)|r9SgsT=WQfZe8jr|pFSZGH=/@/qwDyMw3I;
> __gads=td=ASP_D5ml4Ebevrej:R=pvxltafqZK:x=E4FUn3YiNldW3rhxzX6YlCptZp8zF-b5qc;
> _chartbeat2=oQvb8k_G9tduhauf.LqOukjnlaaE7K.uDBaR79E1WT4t.Kr9L_lIrOtruE8;
> __qca=LC9oiRpFSWShYlxUtD37GJ2k8AL; __utmb=vG8UMEjrz.Qf.At.pXD61lUeHZ;
> pm8196_1=c; pm8194_1=c
>
>    => amazingly, this one compresses extremely well with the above scheme,
>       because additions are performed at the end so consecutive cookies
>       keep a lot in common, and changes are not too frequent. However,
>       given the omnipresent usage of cookies, I was wondering why we should
>       not create a new entity of its own for the cookies instead of abusing
>       the Cookie header. It would make it a lot easier for both ends to
>       find what they need. For instance, a load balancer just needs to find
>       a server name in the thing above. What a waste of on-wire bits and of
>       CPU cycles!
>

You're suggesting breaking the above into smaller, addressable bits?
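
If so, here is a sketch of what I imagine that could look like: each cookie
becomes its own length-prefixed key/value entry, so an intermediary can
scan for one key without parsing the rest (the layout is invented for
illustration):

    def encode_cookies(pairs):
        out = bytearray()
        for key, value in pairs:
            for part in (key.encode(), value.encode()):
                out += len(part).to_bytes(2, "big") + part
        return bytes(out)

    def find_cookie(blob, wanted):
        # A load balancer looking for its server-affinity cookie only
        # reads lengths and keys; unwanted values are skipped, not parsed.
        i = 0
        while i < len(blob):
            klen = int.from_bytes(blob[i:i+2], "big"); i += 2
            key = blob[i:i+klen]; i += klen
            vlen = int.from_bytes(blob[i:i+2], "big"); i += 2
            if key == wanted:
                return blob[i:i+vlen]
            i += vlen
        return None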


>
> BTW, binary encoding would probably also help address a request I often
> hear in banking environments: the need to sign/encrypt/compress only
> certain headers or cookies. Right now when people do this, they have to
> base64-encode the result, which is another transformation at both ends and
> inflates the data. If we make provisions in the protocol for announcing
> encrypted or compressed headers using 2-3 bits, it might become more
> usable. I'm not convinced it provides any benefit between a browser and an
> origin server though. So maybe it will remain application-specific and the
> transport just has to make it easier to emit 8-bit data in header field
> values.
>
>
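
Two or three flag bits per header entry would indeed be cheap to carve out.
A hypothetical layout (all values invented for illustration):

    # 2 flag bits announce how the raw value is to be interpreted; with
    # 8-bit-clean values, the base64 round-trip disappears entirely.
    ENC_NONE       = 0b00  # plain value
    ENC_COMPRESSED = 0b01  # value compressed by the application
    ENC_ENCRYPTED  = 0b10  # value encrypted end-to-end
    ENC_SIGNED     = 0b11  # value carries an application signature

    def entry(name_idx, value, flags=ENC_NONE):
        # value: raw bytes; head packs 2 flag bits + a 6-bit name index
        head = (flags << 6) | (name_idx & 0x3F)
        return bytes([head]) + len(value).to_bytes(2, "big") + value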

> Has anyone any opinion on the subject above ? Or ideas about other things
> that terribly clobber the upstream pipe and that should be fixed in 2.0 ?
>

I like binary framing because it is significantly easier to get right and
works well when we're considering things other than just plain HTTP.
Token-based parsing is quite annoying in comparison: it either requires
significant implementation complexity to minimize memory use, or significant
memory to keep the implementation simple. With length-based framing,
implementation complexity is decreased, arguably for everyone, and certainly
in cases where you wish to be efficient with buffers.
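
The buffer point is worth making concrete: with length-prefixed frames the
parser always knows exactly how many bytes to read next, so there is no
delimiter scanning and no resizable token buffer. A minimal sketch (the
1-byte type / 2-byte length framing is invented for illustration):

    def recv_exact(sock, n):
        buf = b""
        while len(buf) < n:
            chunk = sock.recv(n - len(buf))
            if not chunk:
                raise ConnectionError("peer closed mid-frame")
            buf += chunk
        return buf

    def read_frame(sock):
        ftype = recv_exact(sock, 1)[0]            # frame type
        length = int.from_bytes(recv_exact(sock, 2), "big")
        return ftype, recv_exact(sock, length)    # exactly `length` bytes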

-=R


> I hope I'll soon find some time to update our draft to reflect recent
> changes and findings.
>
> Regards,
> Willy
>
> --
> [1] http://tools.ietf.org/id/draft-tarreau-httpbis-network-friendly-00.txt
> [2] http://1wt.eu/http2/
>
>
>
