- From: James M Snell <jasnell@gmail.com>
- Date: Fri, 23 Aug 2013 14:36:36 -0700
- To: Mike Bishop <Michael.Bishop@microsoft.com>
- Cc: ietf-http-wg@w3.org
- Message-ID: <CABP7RbdmKqffgQSxdzZBOqNJXe0EzMZzQehNFiZuzcAJpTLRug@mail.gmail.com>
Yes, cut n paste error on my part.

On Aug 23, 2013 2:32 PM, "Mike Bishop" <Michael.Bishop@microsoft.com> wrote:

> Integer and Timestamp are defined with the same leading three bits as
> well. I assume that's a typo.
>
> -----Original Message-----
> From: James M Snell [mailto:jasnell@gmail.com]
> Sent: Friday, August 23, 2013 9:00 AM
> To: ietf-http-wg@w3.org
> Subject: Type codecs within hpack
>
> With the assumption that hpack is what we'll ultimately end up sticking
> with for header encoding, I wanted to take a moment to illustrate how the
> binary type codecs would look within the hpack encoding... (note, this
> email is **NOT** discussing the alternative header compression I describe
> in my separate I-D.. this is talking about applying the value type
> encodings to hpack).
>
> It would be very helpful if the various implementers would give some kind
> of indication about whether they'd be willing to implement these encodings
> and, if so, when.
>
> First, a quick note: The type codecs would be defined independently of the
> compression mechanism itself... that is, just as Roberto's been wanting, we
> can separate header value type details out of hpack entirely so that hpack
> can deal specifically with header compression.
>
> Specifically, hpack would be updated such that it would not define a
> specific encoding for header field values. As far as the basic hpack
> mechanism is concerned, the value encoding would be an opaque octet
> sequence.
>
> Literal Header without Indexing (Indexed Name):
>
>   0   1   2   3   4   5   6   7
> +---+---+---+---+---+---+---+---+
> | 0 | 1 | 1 |    Index (5+)     |
> +---+---+---+-------------------+
> |      Value Encoding (8+)      |
> +-------------------------------+
>
> Literal Header without Indexing (New Name):
>
>   0   1   2   3   4   5   6   7
> +---+---+---+---+---+---+---+---+
> | 0 | 1 | 1 |         0         |
> +---+---+---+-------------------+
> |       Name Length (8+)        |
> +-------------------------------+
> |  Name String (Length octets)  |
> +-------------------------------+
> |      Value Encoding (8+)      |
> +-------------------------------+
>
> Literal Header with Incremental Indexing (Indexed Name):
>
>   0   1   2   3   4   5   6   7
> +---+---+---+---+---+---+---+---+
> | 0 | 1 | 0 |    Index (5+)     |
> +---+---+---+-------------------+
> |      Value Encoding (8+)      |
> +-------------------------------+
>
> Literal Header with Incremental Indexing (New Name):
>
>   0   1   2   3   4   5   6   7
> +---+---+---+---+---+---+---+---+
> | 0 | 1 | 0 |         0         |
> +---+---+---+-------------------+
> |       Name Length (8+)        |
> +-------------------------------+
> |  Name String (Length octets)  |
> +-------------------------------+
> |      Value Encoding (8+)      |
> +-------------------------------+
>
> Literal Header with Substitution Indexing (Indexed Name):
>
>   0   1   2   3   4   5   6   7
> +---+---+---+---+---+---+---+---+
> | 0 | 0 |      Index (6+)       |
> +---+---+-----------------------+
> |    Substituted Index (8+)     |
> +-------------------------------+
> |      Value Encoding (8+)      |
> +-------------------------------+
>
> Literal Header with Substitution Indexing (New Name):
>
>   0   1   2   3   4   5   6   7
> +---+---+---+---+---+---+---+---+
> | 0 | 0 |           0           |
> +---+---+-----------------------+
> |       Name Length (8+)        |
> +-------------------------------+
> |  Name String (Length octets)  |
> +-------------------------------+
> |    Substituted Index (8+)     |
> +-------------------------------+
> |      Value Encoding (8+)      |
> +-------------------------------+
>
>
> There are five possible value encodings which manifest two basic
> patterns... The value types are:
>
> 1. UTF-8
> 2. Legacy
> 3. Opaque
> 4. Integer
> 5. Timestamp
>
> On the wire these look like:
>
> UTF-8
>   0   1   2   3   4   5   6   7
> +---+---+---+---+---+---+---+---+
> | 0 | 0 | 0 | Value Length (5+) |
> +---+---+---+-------------------+
> | Value String (Length Octets)  |
> +-------------------------------+
>
> Integer
>
>   0   1   2   3   4   5   6   7
> +---+---+---+---+---+---+---+---+
> | 0 | 0 | 1 |    Value (5+)     |
> +---+---+---+---+---+---+---+---+
>
> Timestamp
>
>   0   1   2   3   4   5   6   7
> +---+---+---+---+---+---+---+---+
> | 0 | 0 | 1 |    Value (5+)     |
> +---+---+---+---+---+---+---+---+
>
> Legacy
>   0   1   2   3   4   5   6   7
> +---+---+---+---+---+---+---+---+
> | 1 | 0 | 0 | Value Length (5+) |
> +-------------------------------+
> | Value String (Length Octets)  |
> +-------------------------------+
>
> Opaque
>   0   1   2   3   4   5   6   7
> +---+---+---+---+---+---+---+---+
> | 1 | 1 | 1 | Value Length (5+) |
> +-------------------------------+
> | Value Octets (Length Octets)  |
> +-------------------------------+
>
>
> Notice that the UTF-8, Legacy and Opaque type encodings are identical
> other than the leading three bits. Likewise, the Integer and Timestamp
> encodings are identical other than the leading three bits.
>
> For UTF-8, Legacy and Opaque, the Value Length is encoded as an integer
> with a 5-bit prefix.
>
> For Integer, the Value is encoded as an integer with a 5-bit prefix.
> Negative or fractional values cannot be represented. Theoretically there
> is no upper value limit to this encoding, however, it would likely be good
> to recommend that only values up to 64-bits are encoded.
>
> For Timestamp, the number of milliseconds since the epoch is encoded as an
> integer with a 5-bit prefix. Dates before the epoch cannot be represented.
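
To make the "integer with a 5-bit prefix" coding a bit more concrete, here is a minimal Python sketch. It assumes the prefix-integer scheme from the hpack draft: values below 31 fit in the five prefix bits; anything larger fills the prefix with ones and spills the remainder into 7-bit continuation octets, least significant group first. The function names are made up for illustration, and TIMESTAMP_TAG is taken as 010 from the combined diagrams that follow rather than from the standalone Timestamp diagram.

```python
INTEGER_TAG = 0b001    # leading three bits for Integer, per the diagrams
TIMESTAMP_TAG = 0b010  # assumption: the value used in the combined diagrams


def encode_prefixed_int(value, prefix_bits=5, tag=0):
    """Encode a non-negative integer with an N-bit prefix (hypothetical helper)."""
    if value < 0:
        raise ValueError("negative values cannot be represented")
    limit = (1 << prefix_bits) - 1          # 31 for a 5-bit prefix
    first = tag << prefix_bits              # type bits sit above the prefix
    if value < limit:
        return bytes([first | value])       # fits entirely in the prefix
    out = bytearray([first | limit])        # prefix saturated: all ones
    value -= limit
    while value >= 128:
        out.append((value % 128) | 0x80)    # 7 payload bits + continuation flag
        value //= 128
    out.append(value)                       # final octet, continuation flag clear
    return bytes(out)


def decode_prefixed_int(data, prefix_bits=5):
    """Return (tag, value, octets_consumed) for an encoded value."""
    limit = (1 << prefix_bits) - 1
    tag = data[0] >> prefix_bits
    value = data[0] & limit
    pos = 1
    if value == limit:                      # continuation octets follow
        shift = 0
        while True:
            octet = data[pos]
            pos += 1
            value += (octet & 0x7F) << shift
            shift += 7
            if not octet & 0x80:
                break
    return tag, value, pos


small = encode_prefixed_int(10, tag=INTEGER_TAG)     # single octet 0x2a
large = encode_prefixed_int(1337, tag=INTEGER_TAG)   # octets 0x3f 0x9a 0x0a
```

Since Python integers are unbounded, the sketch does not enforce the suggested 64-bit ceiling; a real decoder would probably want to cap the number of continuation octets it accepts.
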
>
> If we put these types together with hpack, this is what we end up with:
>
> Literal Header without Indexing (Indexed Name), UTF-8 Value
>
>   0   1   2   3   4   5   6   7
> +---+---+---+---+---+---+---+---+
> | 0 | 1 | 1 |    Index (5+)     |
> +---+---+---+---+---+---+---+---+
> | 0 | 0 | 0 | Value Length (5+) |
> +---+---+---+-------------------+
> | Value String (Length Octets)  |
> +-------------------------------+
>
> Literal Header without Indexing (Indexed Name), Integer Value
>
>   0   1   2   3   4   5   6   7
> +---+---+---+---+---+---+---+---+
> | 0 | 1 | 1 |    Index (5+)     |
> +---+---+---+---+---+---+---+---+
> | 0 | 0 | 1 |    Value (5+)     |
> +---+---+---+---+---+---+---+---+
>
> Literal Header without Indexing (Indexed Name), Timestamp Value
>
>   0   1   2   3   4   5   6   7
> +---+---+---+---+---+---+---+---+
> | 0 | 1 | 1 |    Index (5+)     |
> +---+---+---+---+---+---+---+---+
> | 0 | 1 | 0 |    Value (5+)     |
> +---+---+---+---+---+---+---+---+
>
> Literal Header without Indexing (Indexed Name), Legacy Value
>
>   0   1   2   3   4   5   6   7
> +---+---+---+---+---+---+---+---+
> | 0 | 1 | 1 |    Index (5+)     |
> +---+---+---+---+---+---+---+---+
> | 1 | 0 | 0 | Value Length (5+) |
> +---+---+---+-------------------+
> | Value String (Length Octets)  |
> +-------------------------------+
>
> Literal Header without Indexing (Indexed Name), Opaque Value
>
>   0   1   2   3   4   5   6   7
> +---+---+---+---+---+---+---+---+
> | 0 | 1 | 1 |    Index (5+)     |
> +---+---+---+---+---+---+---+---+
> | 1 | 1 | 1 | Value Length (5+) |
> +---+---+---+-------------------+
> | Value Octets (Length Octets)  |
> +-------------------------------+
>
> Literal Header without Indexing (New Name), UTF-8 Value
>
>   0   1   2   3   4   5   6   7
> +---+---+---+---+---+---+---+---+
> | 0 | 1 | 1 |         0         |
> +---+---+---+-------------------+
> |       Name Length (8+)        |
> +-------------------------------+
> |  Name String (Length octets)  |
> +---+---+---+---+---+---+---+---+
> | 0 | 0 | 0 | Value Length (5+) |
> +---+---+---+-------------------+
> | Value String (Length Octets)  |
> +-------------------------------+
>
> Literal Header without Indexing (New Name), Integer Value
>
>   0   1   2   3   4   5   6   7
> +---+---+---+---+---+---+---+---+
> | 0 | 1 | 1 |         0         |
> +---+---+---+-------------------+
> |       Name Length (8+)        |
> +-------------------------------+
> |  Name String (Length octets)  |
> +---+---+---+---+---+---+---+---+
> | 0 | 0 | 1 |    Value (5+)     |
> +---+---+---+---+---+---+---+---+
>
> Literal Header without Indexing (New Name), Timestamp Value
>
>   0   1   2   3   4   5   6   7
> +---+---+---+---+---+---+---+---+
> | 0 | 1 | 1 |         0         |
> +---+---+---+-------------------+
> |       Name Length (8+)        |
> +-------------------------------+
> |  Name String (Length octets)  |
> +---+---+---+---+---+---+---+---+
> | 0 | 1 | 0 |    Value (5+)     |
> +---+---+---+---+---+---+---+---+
>
> Literal Header without Indexing (New Name), Legacy Value
>
>   0   1   2   3   4   5   6   7
> +---+---+---+---+---+---+---+---+
> | 0 | 1 | 1 |         0         |
> +---+---+---+-------------------+
> |       Name Length (8+)        |
> +-------------------------------+
> |  Name String (Length octets)  |
> +---+---+---+---+---+---+---+---+
> | 1 | 0 | 0 | Value Length (5+) |
> +---+---+---+-------------------+
> | Value String (Length Octets)  |
> +-------------------------------+
>
> Literal Header without Indexing (New Name), Opaque Value
>
>   0   1   2   3   4   5   6   7
> +---+---+---+---+---+---+---+---+
> | 0 | 1 | 1 |         0         |
> +---+---+---+-------------------+
> |       Name Length (8+)        |
> +-------------------------------+
> |  Name String (Length octets)  |
> +---+---+---+---+---+---+---+---+
> | 1 | 1 | 1 | Value Length (5+) |
> +---+---+---+-------------------+
> | Value Octets (Length Octets)  |
> +-------------------------------+
>
> Literal Header with Incremental Indexing (Indexed Name), UTF-8 Value
>
>   0   1   2   3   4   5   6   7
> +---+---+---+---+---+---+---+---+
> | 0 | 1 | 0 |    Index (5+)     |
> +---+---+---+---+---+---+---+---+
> | 0 | 0 | 0 | Value Length (5+) |
> +---+---+---+-------------------+
> | Value String (Length Octets)  |
> +-------------------------------+
>
> Literal Header with Incremental Indexing (Indexed Name), Integer Value
>
>   0   1   2   3   4   5   6   7
> +---+---+---+---+---+---+---+---+
> | 0 | 1 | 0 |    Index (5+)     |
> +---+---+---+---+---+---+---+---+
> | 0 | 0 | 1 |    Value (5+)     |
> +---+---+---+---+---+---+---+---+
>
> Literal Header with Incremental Indexing (Indexed Name), Timestamp Value
>
>   0   1   2   3   4   5   6   7
> +---+---+---+---+---+---+---+---+
> | 0 | 1 | 0 |    Index (5+)     |
> +---+---+---+---+---+---+---+---+
> | 0 | 1 | 0 |    Value (5+)     |
> +---+---+---+---+---+---+---+---+
>
> Literal Header with Incremental Indexing (Indexed Name), Legacy Value
>
>   0   1   2   3   4   5   6   7
> +---+---+---+---+---+---+---+---+
> | 0 | 1 | 0 |    Index (5+)     |
> +---+---+---+---+---+---+---+---+
> | 1 | 0 | 0 | Value Length (5+) |
> +---+---+---+-------------------+
> | Value String (Length Octets)  |
> +-------------------------------+
>
> Literal Header with Incremental Indexing (Indexed Name), Opaque Value
>
>   0   1   2   3   4   5   6   7
> +---+---+---+---+---+---+---+---+
> | 0 | 1 | 0 |    Index (5+)     |
> +---+---+---+---+---+---+---+---+
> | 1 | 1 | 1 | Value Length (5+) |
> +---+---+---+-------------------+
> | Value Octets (Length Octets)  |
> +-------------------------------+
>
> Literal Header with Incremental Indexing (New Name), UTF-8 Value
>
>   0   1   2   3   4   5   6   7
> +---+---+---+---+---+---+---+---+
> | 0 | 1 | 0 |         0         |
> +---+---+---+-------------------+
> |       Name Length (8+)        |
> +-------------------------------+
> |  Name String (Length octets)  |
> +---+---+---+---+---+---+---+---+
> | 0 | 0 | 0 | Value Length (5+) |
> +---+---+---+-------------------+
> | Value String (Length Octets)  |
> +-------------------------------+
>
> Literal Header with Incremental Indexing (New Name), Integer Value
>
>   0   1   2   3   4   5   6   7
> +---+---+---+---+---+---+---+---+
> | 0 | 1 | 0 |         0         |
> +---+---+---+-------------------+
> |       Name Length (8+)        |
> +-------------------------------+
> |  Name String (Length octets)  |
> +---+---+---+---+---+---+---+---+
> | 0 | 0 | 1 |    Value (5+)     |
> +---+---+---+---+---+---+---+---+
>
> Literal Header with Incremental Indexing (New Name), Timestamp Value
>
>   0   1   2   3   4   5   6   7
> +---+---+---+---+---+---+---+---+
> | 0 | 1 | 0 |         0         |
> +---+---+---+-------------------+
> |       Name Length (8+)        |
> +-------------------------------+
> |  Name String (Length octets)  |
> +---+---+---+---+---+---+---+---+
> | 0 | 1 | 0 |    Value (5+)     |
> +---+---+---+---+---+---+---+---+
>
> Literal Header with Incremental Indexing (New Name), Legacy Value
>
>   0   1   2   3   4   5   6   7
> +---+---+---+---+---+---+---+---+
> | 0 | 1 | 0 |         0         |
> +---+---+---+-------------------+
> |       Name Length (8+)        |
> +-------------------------------+
> |  Name String (Length octets)  |
> +---+---+---+---+---+---+---+---+
> | 1 | 0 | 0 | Value Length (5+) |
> +---+---+---+-------------------+
> | Value String (Length Octets)  |
> +-------------------------------+
>
>
> Literal Header with Incremental Indexing (New Name), Opaque Value
>
>   0   1   2   3   4   5   6   7
> +---+---+---+---+---+---+---+---+
> | 0 | 1 | 0 |         0         |
> +---+---+---+-------------------+
> |       Name Length (8+)        |
> +-------------------------------+
> |  Name String (Length octets)  |
> +---+---+---+---+---+---+---+---+
> | 1 | 1 | 1 | Value Length (5+) |
> +---+---+---+-------------------+
> | Value Octets (Length Octets)  |
> +-------------------------------+
>
>
> Literal Header with Substitution Indexing (Indexed Name), UTF-8 Value
>
>   0   1   2   3   4   5   6   7
> +---+---+---+---+---+---+---+---+
> | 0 | 0 |      Index (6+)       |
> +---+---+-----------------------+
> |    Substituted Index (8+)     |
> +---+---+---+---+---+---+---+---+
> | 0 | 0 | 0 | Value Length (5+) |
> +---+---+---+-------------------+
> | Value String (Length Octets)  |
> +-------------------------------+
>
> Literal Header with Substitution Indexing (Indexed Name), Integer Value
>
>   0   1   2   3   4   5   6   7
> +---+---+---+---+---+---+---+---+
> | 0 | 0 |      Index (6+)       |
> +---+---+-----------------------+
> |    Substituted Index (8+)     |
> +---+---+---+---+---+---+---+---+
> | 0 | 0 | 1 |    Value (5+)     |
> +---+---+---+---+---+---+---+---+
>
> Literal Header with Substitution Indexing (Indexed Name), Timestamp Value
>
>   0   1   2   3   4   5   6   7
> +---+---+---+---+---+---+---+---+
> | 0 | 0 |      Index (6+)       |
> +---+---+-----------------------+
> |    Substituted Index (8+)     |
> +---+---+---+---+---+---+---+---+
> | 0 | 1 | 0 |    Value (5+)     |
> +---+---+---+---+---+---+---+---+
>
> Literal Header with Substitution Indexing (Indexed Name), Legacy Value
>
>   0   1   2   3   4   5   6   7
> +---+---+---+---+---+---+---+---+
> | 0 | 0 |      Index (6+)       |
> +---+---+-----------------------+
> |    Substituted Index (8+)     |
> +---+---+---+---+---+---+---+---+
> | 1 | 0 | 0 | Value Length (5+) |
> +---+---+---+-------------------+
> | Value String (Length Octets)  |
> +-------------------------------+
>
> Literal Header with Substitution Indexing (Indexed Name), Opaque Value
>
>   0   1   2   3   4   5   6   7
> +---+---+---+---+---+---+---+---+
> | 0 | 0 |      Index (6+)       |
> +---+---+-----------------------+
> |    Substituted Index (8+)     |
> +---+---+---+---+---+---+---+---+
> | 1 | 1 | 1 | Value Length (5+) |
> +---+---+---+-------------------+
> | Value Octets (Length Octets)  |
> +-------------------------------+
>
>
> Literal Header with Substitution Indexing (New Name), UTF-8 Value
>
>   0   1   2   3   4   5   6   7
> +---+---+---+---+---+---+---+---+
> | 0 | 0 |           0           |
> +---+---+---+-------------------+
> |       Name Length (8+)        |
> +-------------------------------+
> |  Name String (Length octets)  |
> +---+---+-----------------------+
> |    Substituted Index (8+)     |
> +---+---+---+---+---+---+---+---+
> | 0 | 0 | 0 | Value Length (5+) |
> +---+---+---+-------------------+
> | Value String (Length Octets)  |
> +-------------------------------+
>
> Literal Header with Substitution Indexing (New Name), Integer Value
>
>   0   1   2   3   4   5   6   7
> +---+---+---+---+---+---+---+---+
> | 0 | 0 |           0           |
> +---+---+---+-------------------+
> |       Name Length (8+)        |
> +-------------------------------+
> |  Name String (Length octets)  |
> +---+---+-----------------------+
> |    Substituted Index (8+)     |
> +---+---+---+---+---+---+---+---+
> | 0 | 0 | 1 |    Value (5+)     |
> +---+---+---+---+---+---+---+---+
>
> Literal Header with Substitution Indexing (New Name), Timestamp Value
>
>   0   1   2   3   4   5   6   7
> +---+---+---+---+---+---+---+---+
> | 0 | 0 |           0           |
> +---+---+---+-------------------+
> |       Name Length (8+)        |
> +-------------------------------+
> |  Name String (Length octets)  |
> +---+---+-----------------------+
> |    Substituted Index (8+)     |
> +---+---+---+---+---+---+---+---+
> | 0 | 1 | 0 |    Value (5+)     |
> +---+---+---+---+---+---+---+---+
>
> Literal Header with Substitution Indexing (New Name), Legacy Value
>
>   0   1   2   3   4   5   6   7
> +---+---+---+---+---+---+---+---+
> | 0 | 0 |           0           |
> +---+---+---+-------------------+
> |       Name Length (8+)        |
> +-------------------------------+
> |  Name String (Length octets)  |
> +---+---+-----------------------+
> |    Substituted Index (8+)     |
> +---+---+---+---+---+---+---+---+
> | 1 | 0 | 0 | Value Length (5+) |
> +---+---+---+-------------------+
> | Value String (Length Octets)  |
> +-------------------------------+
>
> Literal Header with Substitution Indexing (New Name), Opaque Value
>
>   0   1   2   3   4   5   6   7
> +---+---+---+---+---+---+---+---+
> | 0 | 0 |           0           |
> +---+---+---+-------------------+
> |       Name Length (8+)        |
> +-------------------------------+
> |  Name String (Length octets)  |
> +---+---+-----------------------+
> |    Substituted Index (8+)     |
> +---+---+---+---+---+---+---+---+
> | 1 | 1 | 1 | Value Length (5+) |
> +---+---+---+-------------------+
> | Value Octets (Length Octets)  |
> +-------------------------------+
>
> For backwards compatibility with HTTP/1, the following conversion rules
> would apply when translating an HTTP/2 request into HTTP/1:
>
> 1. UTF-8 value encodings are converted to ASCII, with all non-ASCII
>    codepoints pct-encoded.
> 2. Legacy value encodings are passed through without any translation.
> 3. Opaque values are Base64 encoded.
> 4. Integer values are converted into the ASCII representation of the
>    number (e.g. 1234 = "1234").
> 5. Timestamp values are converted to the HTTP-date equivalent string, with
>    millisecond precision lost.
>
> There is no normative translation from HTTP/1 to HTTP/2. All HTTP/1 header
> field values would be passed without translation using the Legacy Value
> Encoding.
>
> Header fields would have to be specifically defined at the http layer to
> use the new value encodings. The definitions of existing HTTP/1 header
> fields remain the same unless those headers are specifically redefined.
>
> Some of the existing standard headers that we know can be redefined
> include:
>
> * :status (integer)
> * :host (legacy or utf-8)
> * :path (legacy or utf-8)
> * content-length (legacy or integer)
> * date (legacy or timestamp)
> * max-forwards (legacy or integer)
> * retry-after (legacy, integer or timestamp)
> * if-modified-since (legacy or timestamp)
> * if-unmodified-since (legacy or timestamp)
> * last-modified (legacy or timestamp)
> * age (legacy or integer)
> * expires (legacy, integer or timestamp)
> * etag (legacy or opaque... an opaque etag is always "strong", never
>   "weak")
>
> There is a question about whether we really need to have any distinction
> between UTF-8, Legacy and Opaque. First, it's important to note that, with
> the exception of the first three bits, these three types serialize exactly
> the same on the wire and are handled in exactly the same way by the header
> compression mechanism. The type distinction comes into play only when we
> want to use the value.
>
> Allowing for UTF-8 gives developers the ability to use extended
> characters.. for instance, we could use IRIs directly within our requests
> without any translation to ASCII URIs. That means, passing :host with an
> IDN rather than punycode equivalent; and passing :path with extended
> characters without being required to do additional pct-encoding.
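
The five HTTP/2-to-HTTP/1 conversion rules listed earlier could be sketched roughly as follows. This is only an illustration under a few assumptions that the rules themselves do not pin down: the type labels and the to_http1 name are hypothetical, "pct-encoded" is read as percent-encoding each non-ASCII UTF-8 octet, and Base64 is taken to mean the standard alphabet with padding.

```python
import base64
from email.utils import formatdate


def to_http1(value_type, value):
    """Convert a typed HTTP/2 header value to an HTTP/1 string (hypothetical helper)."""
    if value_type == "utf8":
        # Rule 1: keep ASCII as is, percent-encode every non-ASCII UTF-8 octet
        # (assumed reading of "pct-encoded").
        return "".join(
            chr(b) if b < 0x80 else "%{:02X}".format(b)
            for b in value.encode("utf-8")
        )
    if value_type == "legacy":
        # Rule 2: passed through untouched.
        return value
    if value_type == "opaque":
        # Rule 3: raw octets become Base64 text (standard alphabet assumed).
        return base64.b64encode(value).decode("ascii")
    if value_type == "integer":
        # Rule 4: plain decimal digits.
        return str(value)
    if value_type == "timestamp":
        # Rule 5: milliseconds since the epoch -> HTTP-date; the integer
        # division drops the millisecond part.
        return formatdate(value // 1000, usegmt=True)
    raise ValueError("unknown value type: %r" % (value_type,))


path = to_http1("utf8", "/caf\u00e9")            # "/caf%C3%A9"
date = to_http1("timestamp", 1377293796123)      # "Fri, 23 Aug 2013 21:36:36 GMT"
```
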
>
> Allowing for Legacy gives us a fallback for HTTP/1 compatibility.
> There are very few enforced rules in HTTP/1 for values and we need to have
> a way of indicating that the value being passed through is untouched
> relative to HTTP/1 and follows all the same rules as HTTP/1.
>
> Allowing for Opaque values gives us the opportunity to achieve highly
> efficient encodings for things that, in the past, haven't been encoded very
> efficiently. It allows us to pass raw binary sequences that ought not to be
> interpreted as text necessarily.. avoiding the need to apply additional
> base64 or hex encoding.
>
> Lastly, I know there have been some discussions around structured
> alternatives for timestamp encoding. Honestly, I've played around with a
> number of variations and kept coming to the same conclusion:
> milliseconds since the epoch encoded as a variable length integer really
> is good enough. It keeps things simple while delivering a feature many
> appdevs have been asking for for a very long time.
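
For a rough sense of what the timestamp encoding costs on the wire, here is a small Python sketch that counts the octets a 5-bit-prefix integer needs and compares that with the textual HTTP-date. It assumes the hpack draft's prefix-integer layout (one prefix octet, then 7 payload bits per continuation octet); prefixed_int_size is a hypothetical helper, not something defined in the proposal.

```python
def prefixed_int_size(value, prefix_bits=5):
    """Number of octets needed to encode value with an N-bit prefix."""
    limit = (1 << prefix_bits) - 1
    if value < limit:
        return 1                 # fits in the prefix octet alone
    rest = value - limit
    size = 2                     # prefix octet + at least one continuation octet
    while rest >= 128:           # each further octet carries 7 more bits
        rest >>= 7
        size += 1
    return size


# 2013-08-23 21:36:36.123 UTC as milliseconds since the epoch.
millis = 1377293796123
binary_octets = prefixed_int_size(millis)                 # 7
text_octets = len("Fri, 23 Aug 2013 21:36:36 GMT")        # 29
```

So, under these assumptions, a present-day timestamp takes 7 octets before any header compression, against 29 for the equivalent HTTP-date string.
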
Received on Friday, 23 August 2013 21:37:05 UTC