Re: FYI... Binary Optimized Header Encoding for SPDY

On Aug 1, 2012, at 8:30 PM, Mike Belshe wrote:

> A couple of thoughts: 
> 
> * Thanks for writing up!
> 
> * I don't think we need utf-8 encoded headers.  Not sure how you'd pass them off to HTTP anyway?
> 
> * The codepages seem like complexity, but I'm not sure the benefit.  I would remove them.
> 
> * I would remove the flags too - per header flags - do we really need it?  I'd remove it without a very clear use case.
> 
> * I know that 32bits seems like a lot.  Defining length fields has two routes:  fixed length or variable length.  I like the fixed length because I believe they are simpler.  However, the price of that simplicity is that you've got limits.  Everyone hates limits :-)  In your proposal you whacked the number of headers to 8 bits, or 256 headers.   While I agree this is an edge, I don't see a reason why it should be against the rules to have more.  Same for the length of a header value - you've used 16 bits (64KB).  While this seems massive by today's standards, in 10 years maybe 1MB cookies are the norm.  I don't know, but I'd hate to have the limit.  So.... this leaves us thinking that maybe we should use variable length encoding.  Personally, I think the fixed length simplicity is worth it.  But this is subjective, of course.   Just use 32bits everywhere - it works well and you won't notice the perf difference at all (I measured :-)

Hi

I agree. I haven't written a draft like James has, but I would like to suggest an alternative encoding that I think has the advantage of being easier to parse:

Every header begins with a 48-bit descriptor (I think 32 bits would be enough, with a 16-bit length/value, but this is safer):

  L0cRiiiiiiiiiiiiVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVV
or
  L1cRiiiiiiiiiiiillllllllllllllllllllllllllllllll

The first bit is the "Last" bit: 1 for the last header and 0 for all the previous ones.
The second bit is 0 for TV headers (type-value) and 1 for TLV headers (type-length-value).
The third bit is 1 for critical (do not parse the request if you don't understand this header) or 0 for non-critical.
The fourth bit is reserved.

The next 12 bits are the type of header. There are actually 2^13 possible headers, because the same type can be used for both a TV and a TLV header.

For TV, the next 32 bits hold the actual value: a number from 0 to 4,294,967,295. That should be enough for lots of headers, including those denoting time.
For TLV, the next 32 bits hold the length of the value in bytes, and are followed by the value itself.
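
A rough Python sketch of the layout above, in case it helps to see it as code. The function name encode_header and its parameters are made up for illustration, and network (big-endian) byte order is an assumption; nothing here comes from a draft.

```python
import struct

def encode_header(type_id, *, value=None, data=None, last=False, critical=False):
    """Pack one header: 4 flag bits, 12 type bits, then a 32-bit value (TV) or length (TLV)."""
    if not 0 <= type_id < (1 << 12):
        raise ValueError("type must fit in 12 bits")
    if (value is None) == (data is None):
        raise ValueError("pass exactly one of value (TV) or data (TLV)")
    is_tlv = data is not None
    # Bit order within the top nibble: Last, TV/TLV, critical, reserved (left as 0).
    flags = (int(last) << 3) | (int(is_tlv) << 2) | (int(critical) << 1)
    first16 = (flags << 12) | type_id
    if is_tlv:
        # TLV: the 32-bit field carries the byte length, and the value follows.
        return struct.pack(">HI", first16, len(data)) + data
    # TV: the 32-bit field is the value itself (struct rejects out-of-range values).
    return struct.pack(">HI", first16, value)
```

For instance, encode_header(0x004, data=b"/", critical=True) yields the seven bytes 60 04 00 00 00 01 2F.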

Using your registries from section 1.4 as an example, we get this:

20 01 00 02 00 00 (version=2.0)
20 02 00 00 00 01 (method=GET)
60 04 00 00 00 01 2F (path="/")
C0 03 00 00 00 0F 77 77 77 2E 65 78 61 6D 70 6C 65 2E 63 6F 6D (host="www.example.com")

That's 40 bytes. The HTTP/1.1 version is 39 bytes (counting the CRLF after each line, but not the terminating blank line):

GET / HTTP/1.1
Host: www.example.com
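
To check both byte counts, and to show roughly what parsing this format could look like, here is a minimal Python sketch. parse_headers is a hypothetical helper, big-endian byte order is assumed, and the HTTP/1.1 bytes include the CRLF after each line but leave out the final empty line.

```python
import struct

def parse_headers(buf):
    """Return (headers, bytes_consumed); each header is (type_id, critical, value)."""
    headers, pos = [], 0
    while True:
        if pos + 6 > len(buf):
            raise ValueError("truncated descriptor")
        first16, word = struct.unpack_from(">HI", buf, pos)
        pos += 6
        last = bool(first16 & 0x8000)      # first bit: Last
        is_tlv = bool(first16 & 0x4000)    # second bit: 0 = TV, 1 = TLV
        critical = bool(first16 & 0x2000)  # third bit: critical
        type_id = first16 & 0x0FFF         # low 12 bits: header type
        if is_tlv:
            if pos + word > len(buf):
                raise ValueError("truncated value")
            value = bytes(buf[pos:pos + word])  # 'word' is the length
            pos += word
        else:
            value = word                        # 'word' is the value
        headers.append((type_id, critical, value))
        if last:
            return headers, pos

# The four example headers from this message.
EXAMPLE = (
    bytes.fromhex("20 01 00 02 00 00")
    + bytes.fromhex("20 02 00 00 00 01")
    + bytes.fromhex("60 04 00 00 00 01 2F")
    + bytes.fromhex("C0 03 00 00 00 0F") + b"www.example.com"
)
HTTP11 = b"GET / HTTP/1.1\r\nHost: www.example.com\r\n"

headers, used = parse_headers(EXAMPLE)
print(used, len(HTTP11))  # prints: 40 39
```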

Of course, you save more on headers that have longer names, like "strict-transport-security" or "Original-Encoded-Information-Types" or "Downgraded-Disposition-Notification-To".
You also save a lot on those that can be TV, and the value doesn't have to be numeric. It could be a bit string of whatever, as long as it fits in 32 bits.
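
As a small illustration of a non-numeric TV value, this Python sketch packs up to four raw bytes into the 32-bit field. The type number 0x010, the "gzip" token, and the left-aligned zero padding are all hypothetical choices made for the example, and big-endian byte order is assumed.

```python
import struct

def pack_tv_value(raw: bytes) -> int:
    """Interpret up to 4 raw bytes as a 32-bit TV value, left-aligned and zero-padded."""
    if len(raw) > 4:
        raise ValueError("a TV value must fit in 32 bits")
    return int.from_bytes(raw.ljust(4, b"\x00"), "big")

# Hypothetical type 0x010 with all flag bits clear (not last, TV, not critical).
header = struct.pack(">HI", 0x0010, pack_tv_value(b"gzip"))
print(header.hex())  # prints: 0010677a6970
```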

Yoav

Received on Thursday, 2 August 2012 04:25:35 UTC