- From: James M Snell <jasnell@gmail.com>
- Date: Wed, 10 Jul 2013 09:05:53 -0700
- To: Michael Sweet <msweet@apple.com>
- Cc: "ietf-http-wg@w3.org" <ietf-http-wg@w3.org>
On Wed, Jul 10, 2013 at 6:09 AM, Michael Sweet <msweet@apple.com> wrote:
> James,
> [snip]
>
> Actually, I think the typed encoding will probably yield enough savings to offset any increase in size, for example the date header going from 29 octets to ~6 in the variable integer encoding (or 8/11 for the RFC 2579 encoding - see below).
>

The type codecs definitely save quite a bit with highly variable header fields that tend to get sent as literals often (date, last-modified, content-length, etc). For fairly static header fields, the difference is minimal for any individual header. Where you start to see the gap widen is over long running connections, where the average number of bits on the wire trends higher. It's worth it, however, IMHO.

> I like the 256-entry single header table approach. Of course, I have some feedback... :)
>
> 1. Would be nice if you could just include the Unsigned Variable Length Integer Syntax section from the other header compression draft wholesale so this draft stands on its own. Add a notice at the beginning "(This is copied from draft-ietf-httpbis-header-compression-NN)" so people know it is the same encoding. Then the reference to it becomes informative.
>
> 2. Would also be nice to use the same figure format as the compression and http2 drafts... (see below)
>

Will do both in the next iteration.

> 3. Representing timestamps as milliseconds since the traditional UNIX epoch is problematic since it requires support for large integers (at least 42 bits to get us to the traditional 2038 end year, more if you want to keep going past then...) and AFAIK isn't widely used in standards for actual representation of a date/time. RFC 2579 defines a DateAndTime format that is 8 (UTC) or 11 (local time) octets long and is easy to map to/from typical OS APIs without the use of large integers. Granted, it doesn't give you more than 10ths of seconds, but I think that should be enough for HTTP.
> (We use this format in IPP - I'd rename "Timestamp" to "DateAndTime" if you decide to make this change...)
>

Millisecond precision has been on the HTTP wish-lists of many application developers for a very long time, including mine, and I believe the additional requirements are worth it. That said, I've been considering an alternative approach that is based on a single-byte era prefix. This would encode the timestamp in two parts: an 8-bit prefix followed by a uvarint <= (2^32)-1. For now, I just picked a format that would work, with the intent of revisiting it once typed codecs come up for formal discussion after the August interop event.

> 4. I'm not super-keen on grouping the headers into 4 bins ahead of time, since that increases encoder storage requirements. Also, there isn't a way to just replace the value for an existing indexed header in your current draft. Perhaps a hybrid approach where the indexed representation can have 1-to-64 indexes and the others encode a single name/value? Something like this:
>

A small amount of buffering is required with the grouping approach, but the encoder controls how much. An encoder that chooses less buffering would see a bit more encoding overhead. An encoder could choose to encode each header individually, without grouping, at the cost of one additional octet per header.

The Indexed Literal Replacement can be used to replace just the value of an existing header. For instance, suppose I have an existing entry at position #1 with name="foo", value=1. If I want to keep the same name but replace the value with 2, I would send:

  C0 01 20 01 02

The first octet identifies this as an Indexed Literal Replacement group with one item.
The second octet identifies the index position being replaced.
The third octet specifies that the value is an Integer, with the five least-significant bits set to zero, indicating that a name index reference is provided by the fourth octet.
The fourth octet is the name index reference.
We're pointing at index #01 (the same index that's being replaced).
The fifth octet provides the new value.

Because the name resolution is done before the replacement, the name is reused and just the value is replaced.

- James
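
For readers who want to trace the five-octet example (C0 01 20 01 02) by hand, here is a minimal Python sketch of one way to read it. It is not taken from the draft, and several details are assumptions made for illustration only: that the top two bits of the first octet select the group type and the low six bits hold the item count minus one, that the top three bits of the third octet carry the value type, and that the integer value fits in a single octet. The names `Replacement`, `parse_replacement` and `apply_replacement` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Replacement:
    opcode: int      # ASSUMPTION: top two bits of the group octet
    count: int       # ASSUMPTION: low six bits store (item count - 1)
    index: int       # table position being replaced
    value_type: int  # ASSUMPTION: top three bits of the type octet
    name_index: int  # existing entry whose name is reused
    value: int       # new integer value


def parse_replacement(data: bytes) -> Replacement:
    """Parse only the single-item, five-octet example from the email."""
    if len(data) != 5:
        raise ValueError("this sketch only handles the five-octet example")
    group, index, type_octet, name_index, value = data
    if type_octet & 0x1F != 0:
        # Per the email, zero in the low five bits means a name index follows.
        raise ValueError("sketch expects a name index reference")
    if value & 0x80:
        # ASSUMPTION: a set high bit would mean a multi-octet integer.
        raise ValueError("sketch expects a one-octet integer value")
    return Replacement(
        opcode=group >> 6,
        count=(group & 0x3F) + 1,
        index=index,
        value_type=type_octet >> 5,
        name_index=name_index,
        value=value,
    )


def apply_replacement(table: dict, r: Replacement) -> None:
    # Resolve the name first, so it can come from the entry being replaced.
    name, _old_value = table[r.name_index]
    table[r.index] = (name, r.value)


table = {1: ("foo", 1)}
r = parse_replacement(bytes.fromhex("C001200102"))
apply_replacement(table, r)
print(table)
```

Under those assumptions the table ends up as {1: ('foo', 2)}: the name at index 1 is looked up before the slot is overwritten, which is why the same index can serve as both the name reference and the replacement target.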
Received on Wednesday, 10 July 2013 16:06:41 UTC