- From: Poul-Henning Kamp <phk@phk.freebsd.dk>
- Date: Mon, 07 Jul 2014 12:33:51 +0000
- To: ietf-http-wg@w3.org
I have looked at HPACK and I'm pretty certain that we will want to do better, if not now, then later[1].

As far as I can see, no space has been set aside for versioning HPACK, nor for using entirely different algorithms for compression?

Obviously, a SETTINGS_ parameter will be needed to tell what is supported above or beyond HPACK, an IANA registry for same, etc. etc.

But there will need to be an indication in each HEADERS frame of what algorithm/version of compression is used.

I don't want to spend time on SETTINGS and the IANA registry at this point; that can wait till it becomes relevant. For now I would be happy if we can just reserve a field somewhere, demand its value be zero, and note that it is for future expansion of header compression.

Can we please append a byte to the HEADERS frame for this purpose?

Poul-Henning

[1] For background, here are some random notes I took some time back about domain/header specific compression:

For instance, both "User-Agent" and "Server" consist largely of well known keywords, and using tailored dictionaries can save a lot.

There is a risk here of tailoring *too* much (we don't want to make people think twice about correctly declaring new version numbers, for instance), but mutatis mutandis, there is a lot to gain.

If nothing else, just making 0x80 mean "compatible" would shave 9 bytes off pretty much all User-Agent headers, and with 0x81 meaning "Mozilla", 0x82 "Win32" etc., it adds up really fast.

Such a "tokenpression" could be applied as a preprocessor before a general purpose compressor such as HPACK, or it could be done by expanding the vocabulary of HPACK's default dictionary. (The worries about variable size attacks do not seem relevant in this specific case.)

Likewise, the Date header can be compressed to 4 bytes in a domain-specific (time_t) way, speeding up processing for sneaky implementations at the same time.

Set-Cookie, Last-Modified and various other headers also contain dates which could be similarly compressed.
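
To illustrate the Date case, here is a minimal sketch in Python. The notes only say "4 bytes" and "time_t"; the unsigned 32-bit big-endian layout and the function names `pack_http_date` and `unpack_http_date` are assumptions made for the example, not part of any proposal.

```python
import struct
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def pack_http_date(value: str) -> bytes:
    """Encode an HTTP date string as 4 bytes of seconds since the Unix epoch.

    Assumption: unsigned 32-bit, network byte order (good until the year 2106).
    """
    dt = parsedate_to_datetime(value)  # "GMT" yields a timezone-aware datetime
    return struct.pack("!I", int(dt.timestamp()))


def unpack_http_date(blob: bytes) -> str:
    """Decode the 4-byte form back into the textual HTTP date."""
    (seconds,) = struct.unpack("!I", blob)
    dt = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return format_datetime(dt, usegmt=True)


if __name__ == "__main__":
    text = "Mon, 07 Jul 2014 12:33:51 GMT"
    blob = pack_http_date(text)
    print(len(text), "bytes of text ->", len(blob), "bytes packed")
    print(unpack_http_date(blob))
```

A receiver that only needs to compare or do arithmetic on the date can stop after `struct.unpack` and never build the string, which is where the processing saving would come from.
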
A very large percentage of Cookie/Set-Cookie headers can be compressed by scanning the "value" for its character set and decoding/encoding well-known ASCII representations, so that for instance:

    Set-Cookie: foobar="0123456789abcdef"; [...]

becomes:

    foobar= <HEX> len=8 0x01 0x23 0x45 0x67 0x89 0xab 0xcd 0xef

Likewise "path=/" is so universal that we can make it tacit and only transmit anything if it is not there or has another path.

These are the "big" ones, but there are other standard headers which show obvious potential: Content-Type:, Vary:, Content-Length:, Age: and so on.

Some of these compressions would be CPU neutral, and some, like dates, could even save CPU in many cases; all have the potential to save memory and relevant bandwidth.

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.
Received on Monday, 7 July 2014 12:34:15 UTC