Type codecs within hpack

With the assumption that hpack is what we'll ultimately end up
sticking with for header encoding, I wanted to take a moment to
illustrate how the binary type codecs would look within the hpack
encoding. (Note: this email is **NOT** discussing the alternative
header compression I describe in my separate I-D; it is about
applying the value type encodings to hpack.)

It would be very helpful if the various implementers would give some
kind of indication about whether they'd be willing to implement these
encodings and, if so, when.

First, a quick note: the type codecs would be defined independently of
the compression mechanism itself. That is, just as Roberto's been
wanting, we can separate header value type details out of hpack
entirely so that hpack can deal specifically with header compression.

Specifically, hpack would be updated such that it would not define a
specific encoding for header field values. As far as the basic hpack
mechanism is concerned, the value encoding would be an opaque octet
sequence.

Literal Header without Indexing (Indexed Name):

     0   1   2   3   4   5   6   7
   +---+---+---+---+---+---+---+---+
   | 0 | 1 | 1 |    Index (5+)     |
   +---+---+---+-------------------+
   |      Value Encoding (8+)      |
   +-------------------------------+

Literal Header without Indexing (New Name):

     0   1   2   3   4   5   6   7
   +---+---+---+---+---+---+---+---+
   | 0 | 1 | 1 |         0         |
   +---+---+---+-------------------+
   |       Name Length (8+)        |
   +-------------------------------+
   |  Name String (Length octets)  |
   +-------------------------------+
   |      Value Encoding (8+)      |
   +-------------------------------+

Literal Header with Incremental Indexing (Indexed Name):

     0   1   2   3   4   5   6   7
   +---+---+---+---+---+---+---+---+
   | 0 | 1 | 0 |    Index (5+)     |
   +---+---+---+-------------------+
   |      Value Encoding (8+)      |
   +-------------------------------+

Literal Header with Incremental Indexing (New Name):

     0   1   2   3   4   5   6   7
   +---+---+---+---+---+---+---+---+
   | 0 | 1 | 0 |         0         |
   +---+---+---+-------------------+
   |       Name Length (8+)        |
   +-------------------------------+
   |  Name String (Length octets)  |
   +-------------------------------+
   |      Value Encoding (8+)      |
   +-------------------------------+

Literal Header with Substitution Indexing (Indexed Name):

     0   1   2   3   4   5   6   7
   +---+---+---+---+---+---+---+---+
   | 0 | 0 |      Index (6+)       |
   +---+---+-----------------------+
   |    Substituted Index (8+)     |
   +-------------------------------+
   |      Value Encoding (8+)      |
   +-------------------------------+

Literal Header with Substitution Indexing (New Name):

     0   1   2   3   4   5   6   7
   +---+---+---+---+---+---+---+---+
   | 0 | 0 |           0           |
   +---+---+-----------------------+
   |       Name Length (8+)        |
   +-------------------------------+
   |  Name String (Length octets)  |
   +-------------------------------+
   |    Substituted Index (8+)     |
   +-------------------------------+
   |      Value Encoding (8+)      |
   +-------------------------------+
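
To make the framing concrete, here is a rough Python sketch of how an
encoder might emit these six layouts while treating the value encoding
as opaque octets. It assumes the "(N+)" fields use the prefix-integer
scheme from the hpack draft, and that the index passed in is already
the on-the-wire number, with 0 signalling a new name. The names
encode_prefix_int, REPRESENTATIONS and literal_header are made up for
illustration only.

```python
def encode_prefix_int(value: int, prefix_bits: int, high_bits: int = 0) -> bytes:
    """Encode a non-negative integer with an N-bit prefix (hpack-draft style).

    high_bits holds the bits of the first octet that sit above the prefix.
    """
    if value < 0:
        raise ValueError("negative values cannot be represented")
    limit = (1 << prefix_bits) - 1
    if value < limit:
        return bytes([high_bits | value])
    out = bytearray([high_bits | limit])
    value -= limit
    while value >= 128:
        out.append((value % 128) + 128)  # 7 payload bits + continuation bit
        value //= 128
    out.append(value)
    return bytes(out)


# Hypothetical table: (leading bit pattern in the first octet, index prefix width).
REPRESENTATIONS = {
    "without-indexing": (0b011 << 5, 5),
    "incremental": (0b010 << 5, 5),
    "substitution": (0b00 << 6, 6),
}


def literal_header(kind, value_encoding, *, name_index=0, name=b"",
                   substituted_index=None) -> bytes:
    """Frame one literal header; value_encoding is an opaque octet sequence."""
    high_bits, width = REPRESENTATIONS[kind]
    out = bytearray(encode_prefix_int(name_index, width, high_bits))
    if name_index == 0:
        # New name: Name Length (8+) followed by the name octets.
        out += encode_prefix_int(len(name), 8)
        out += name
    if kind == "substitution":
        if substituted_index is None:
            raise ValueError("substitution indexing needs a substituted index")
        out += encode_prefix_int(substituted_index, 8)
    out += value_encoding  # hpack never looks inside this
    return bytes(out)
```

For example, literal_header("without-indexing", value, name_index=3)
yields the octet 0x63 followed by whatever octets value contains.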


There are five possible value encodings, which follow two basic
patterns. The value types are:

1. UTF-8
2. Legacy
3. Opaque
4. Integer
5. Timestamp

On the wire these look like:

UTF-8

     0   1   2   3   4   5   6   7
   +---+---+---+---+---+---+---+---+
   | 0 | 0 | 0 | Value Length (5+) |
   +---+---+---+-------------------+
   | Value String (Length Octets)  |
   +-------------------------------+

Integer

     0   1   2   3   4   5   6   7
   +---+---+---+---+---+---+---+---+
   | 0 | 0 | 1 |    Value (5+)     |
   +---+---+---+---+---+---+---+---+

Timestamp

     0   1   2   3   4   5   6   7
   +---+---+---+---+---+---+---+---+
   | 0 | 1 | 0 |    Value (5+)     |
   +---+---+---+---+---+---+---+---+

Legacy

     0   1   2   3   4   5   6   7
   +---+---+---+---+---+---+---+---+
   | 1 | 0 | 0 | Value Length (5+) |
   +---+---+---+-------------------+
   | Value String (Length Octets)  |
   +-------------------------------+

Opaque

     0   1   2   3   4   5   6   7
   +---+---+---+---+---+---+---+---+
   | 1 | 1 | 1 | Value Length (5+) |
   +---+---+---+-------------------+
   | Value Octets (Length Octets)  |
   +-------------------------------+


Notice that the UTF-8, Legacy and Opaque type encodings are identical
other than the leading three bits. Likewise, the Integer and Timestamp
encodings are identical other than the leading three bits.

For UTF-8, Legacy and Opaque, the Value Length is encoded as an
integer with a 5-bit prefix.
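
As a rough illustration of the length-prefixed pattern, the following
Python sketch encodes the three octet-carrying types. It assumes the
5-bit prefix integer works as in the hpack draft; the helper names
(encode_prefix_int, encode_octets, encode_utf8) are hypothetical.

```python
UTF8, LEGACY, OPAQUE = 0b000, 0b100, 0b111  # leading three bits per type


def encode_prefix_int(value: int, prefix_bits: int, high_bits: int = 0) -> bytes:
    """Prefix-integer encoding as described in the hpack draft (assumed)."""
    limit = (1 << prefix_bits) - 1
    if value < limit:
        return bytes([high_bits | value])
    out = bytearray([high_bits | limit])
    value -= limit
    while value >= 128:
        out.append((value % 128) + 128)
        value //= 128
    out.append(value)
    return bytes(out)


def encode_octets(tag: int, payload: bytes) -> bytes:
    """Type tag in the top three bits, Value Length (5+), then the octets."""
    return encode_prefix_int(len(payload), 5, tag << 5) + payload


def encode_utf8(text: str) -> bytes:
    """UTF-8 values are just the UTF-8 octets of the text, length-prefixed."""
    return encode_octets(UTF8, text.encode("utf-8"))
```

Only the tag differs between the three: encode_octets(LEGACY, b"abc")
and encode_octets(OPAQUE, b"abc") produce the same octets apart from
the top three bits of the first one.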

For Integer, the Value is encoded as an integer with a 5-bit prefix.
Negative or fractional values cannot be represented. In theory this
encoding has no upper limit; however, it would likely be good to
recommend that only values that fit in 64 bits are encoded.
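
A small Python sketch of the Integer type, assuming the hpack draft's
prefix-integer scheme and treating the 64-bit ceiling as a hard check
(the text above only suggests recommending it). The names
encode_integer and MAX_UINT64 are invented for this example.

```python
INTEGER_TAG = 0b001 << 5       # leading three bits for the Integer type
MAX_UINT64 = (1 << 64) - 1     # assumed ceiling, per the 64-bit suggestion


def encode_integer(value: int) -> bytes:
    """Encode a non-negative integer as an Integer value (5-bit prefix)."""
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError("fractional or non-integer values cannot be represented")
    if value < 0:
        raise ValueError("negative values cannot be represented")
    if value > MAX_UINT64:
        raise ValueError("value does not fit in 64 bits")
    if value < 31:
        # Fits entirely inside the 5-bit prefix.
        return bytes([INTEGER_TAG | value])
    out = bytearray([INTEGER_TAG | 31])  # prefix saturated
    value -= 31
    while value >= 128:
        out.append((value % 128) + 128)  # low 7 bits, continuation bit set
        value //= 128
    out.append(value)
    return bytes(out)
```

Small values cost a single octet: encode_integer(10) is the octet 0x2A,
while a status code such as 200 takes three octets (0x3F 0xA9 0x01).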

For Timestamp, the number of milliseconds since the epoch is encoded
as an integer with a 5-bit prefix. Dates before the epoch cannot be
represented.
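
The Timestamp type can be sketched the same way. This Python example
assumes "the epoch" means the Unix epoch (1970-01-01T00:00:00Z) and
uses the 010 tag that appears in the combined hpack layouts; the
function name encode_timestamp is hypothetical.

```python
from datetime import datetime, timedelta, timezone

TIMESTAMP_TAG = 0b010 << 5
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)  # assumed epoch


def encode_timestamp(moment: datetime) -> bytes:
    """Encode an aware datetime as milliseconds since the epoch (5-bit prefix)."""
    if moment.tzinfo is None:
        raise ValueError("need a timezone-aware datetime")
    millis = (moment - EPOCH) // timedelta(milliseconds=1)
    if millis < 0:
        raise ValueError("dates before the epoch cannot be represented")
    if millis < 31:
        return bytes([TIMESTAMP_TAG | millis])
    out = bytearray([TIMESTAMP_TAG | 31])
    millis -= 31
    while millis >= 128:
        out.append((millis % 128) + 128)
        millis //= 128
    out.append(millis)
    return bytes(out)
```

Sub-millisecond precision is simply dropped by the integer division.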

If we put these types together with hpack, this is what we end up with:

Literal Header without Indexing (Indexed Name), UTF-8 Value

     0   1   2   3   4   5   6   7
   +---+---+---+---+---+---+---+---+
   | 0 | 1 | 1 |    Index (5+)     |
   +---+---+---+---+---+---+---+---+
   | 0 | 0 | 0 | Value Length (5+) |
   +---+---+---+-------------------+
   | Value String (Length Octets)  |
   +-------------------------------+

Literal Header without Indexing (Indexed Name), Integer Value

     0   1   2   3   4   5   6   7
   +---+---+---+---+---+---+---+---+
   | 0 | 1 | 1 |    Index (5+)     |
   +---+---+---+---+---+---+---+---+
   | 0 | 0 | 1 |    Value (5+)     |
   +---+---+---+---+---+---+---+---+

Literal Header without Indexing (Indexed Name), Timestamp Value

     0   1   2   3   4   5   6   7
   +---+---+---+---+---+---+---+---+
   | 0 | 1 | 1 |    Index (5+)     |
   +---+---+---+---+---+---+---+---+
   | 0 | 1 | 0 |    Value (5+)     |
   +---+---+---+---+---+---+---+---+

Literal Header without Indexing (Indexed Name), Legacy Value

     0   1   2   3   4   5   6   7
   +---+---+---+---+---+---+---+---+
   | 0 | 1 | 1 |    Index (5+)     |
   +---+---+---+---+---+---+---+---+
   | 1 | 0 | 0 | Value Length (5+) |
   +---+---+---+-------------------+
   | Value String (Length Octets)  |
   +-------------------------------+

Literal Header without Indexing (Indexed Name), Opaque Value

     0   1   2   3   4   5   6   7
   +---+---+---+---+---+---+---+---+
   | 0 | 1 | 1 |    Index (5+)     |
   +---+---+---+---+---+---+---+---+
   | 1 | 1 | 1 | Value Length (5+) |
   +---+---+---+-------------------+
   | Value Octets (Length Octets)  |
   +-------------------------------+

Literal Header without Indexing (New Name), UTF-8 Value

     0   1   2   3   4   5   6   7
   +---+---+---+---+---+---+---+---+
   | 0 | 1 | 1 |         0         |
   +---+---+---+-------------------+
   |       Name Length (8+)        |
   +-------------------------------+
   |  Name String (Length octets)  |
   +---+---+---+---+---+---+---+---+
   | 0 | 0 | 0 | Value Length (5+) |
   +---+---+---+-------------------+
   | Value String (Length Octets)  |
   +-------------------------------+

Literal Header without Indexing (New Name), Integer Value

     0   1   2   3   4   5   6   7
   +---+---+---+---+---+---+---+---+
   | 0 | 1 | 1 |         0         |
   +---+---+---+-------------------+
   |       Name Length (8+)        |
   +-------------------------------+
   |  Name String (Length octets)  |
   +---+---+---+---+---+---+---+---+
   | 0 | 0 | 1 |    Value (5+)     |
   +---+---+---+---+---+---+---+---+

Literal Header without Indexing (New Name), Timestamp Value

     0   1   2   3   4   5   6   7
   +---+---+---+---+---+---+---+---+
   | 0 | 1 | 1 |         0         |
   +---+---+---+-------------------+
   |       Name Length (8+)        |
   +-------------------------------+
   |  Name String (Length octets)  |
   +---+---+---+---+---+---+---+---+
   | 0 | 1 | 0 |    Value (5+)     |
   +---+---+---+---+---+---+---+---+

Literal Header without Indexing (New Name), Legacy Value

     0   1   2   3   4   5   6   7
   +---+---+---+---+---+---+---+---+
   | 0 | 1 | 1 |         0         |
   +---+---+---+-------------------+
   |       Name Length (8+)        |
   +-------------------------------+
   |  Name String (Length octets)  |
   +---+---+---+---+---+---+---+---+
   | 1 | 0 | 0 | Value Length (5+) |
   +---+---+---+-------------------+
   | Value String (Length Octets)  |
   +-------------------------------+

Literal Header without Indexing (New Name), Opaque Value

     0   1   2   3   4   5   6   7
   +---+---+---+---+---+---+---+---+
   | 0 | 1 | 1 |         0         |
   +---+---+---+-------------------+
   |       Name Length (8+)        |
   +-------------------------------+
   |  Name String (Length octets)  |
   +---+---+---+---+---+---+---+---+
   | 1 | 1 | 1 | Value Length (5+) |
   +---+---+---+-------------------+
   | Value Octets (Length Octets)  |
   +-------------------------------+

Literal Header with Incremental Indexing (Indexed Name), UTF-8 Value

     0   1   2   3   4   5   6   7
   +---+---+---+---+---+---+---+---+
   | 0 | 1 | 0 |    Index (5+)     |
   +---+---+---+---+---+---+---+---+
   | 0 | 0 | 0 | Value Length (5+) |
   +---+---+---+-------------------+
   | Value String (Length Octets)  |
   +-------------------------------+

Literal Header with Incremental Indexing (Indexed Name), Integer Value

     0   1   2   3   4   5   6   7
   +---+---+---+---+---+---+---+---+
   | 0 | 1 | 0 |    Index (5+)     |
   +---+---+---+---+---+---+---+---+
   | 0 | 0 | 1 |    Value (5+)     |
   +---+---+---+---+---+---+---+---+

Literal Header with Incremental Indexing (Indexed Name), Timestamp Value

     0   1   2   3   4   5   6   7
   +---+---+---+---+---+---+---+---+
   | 0 | 1 | 0 |    Index (5+)     |
   +---+---+---+---+---+---+---+---+
   | 0 | 1 | 0 |    Value (5+)     |
   +---+---+---+---+---+---+---+---+

Literal Header with Incremental Indexing (Indexed Name), Legacy Value

     0   1   2   3   4   5   6   7
   +---+---+---+---+---+---+---+---+
   | 0 | 1 | 0 |    Index (5+)     |
   +---+---+---+---+---+---+---+---+
   | 1 | 0 | 0 | Value Length (5+) |
   +---+---+---+-------------------+
   | Value String (Length Octets)  |
   +-------------------------------+

Literal Header with Incremental Indexing (Indexed Name), Opaque Value

     0   1   2   3   4   5   6   7
   +---+---+---+---+---+---+---+---+
   | 0 | 1 | 0 |    Index (5+)     |
   +---+---+---+---+---+---+---+---+
   | 1 | 1 | 1 | Value Length (5+) |
   +---+---+---+-------------------+
   | Value Octets (Length Octets)  |
   +-------------------------------+

Literal Header with Incremental Indexing (New Name), UTF-8 Value

     0   1   2   3   4   5   6   7
   +---+---+---+---+---+---+---+---+
   | 0 | 1 | 0 |         0         |
   +---+---+---+-------------------+
   |       Name Length (8+)        |
   +-------------------------------+
   |  Name String (Length octets)  |
   +---+---+---+---+---+---+---+---+
   | 0 | 0 | 0 | Value Length (5+) |
   +---+---+---+-------------------+
   | Value String (Length Octets)  |
   +-------------------------------+

Literal Header with Incremental Indexing (New Name), Integer Value

     0   1   2   3   4   5   6   7
   +---+---+---+---+---+---+---+---+
   | 0 | 1 | 0 |         0         |
   +---+---+---+-------------------+
   |       Name Length (8+)        |
   +-------------------------------+
   |  Name String (Length octets)  |
   +---+---+---+---+---+---+---+---+
   | 0 | 0 | 1 |    Value (5+)     |
   +---+---+---+---+---+---+---+---+

Literal Header with Incremental Indexing (New Name), Timestamp Value

     0   1   2   3   4   5   6   7
   +---+---+---+---+---+---+---+---+
   | 0 | 1 | 0 |         0         |
   +---+---+---+-------------------+
   |       Name Length (8+)        |
   +-------------------------------+
   |  Name String (Length octets)  |
   +---+---+---+---+---+---+---+---+
   | 0 | 1 | 0 |    Value (5+)     |
   +---+---+---+---+---+---+---+---+

Literal Header with Incremental Indexing (New Name), Legacy Value

     0   1   2   3   4   5   6   7
   +---+---+---+---+---+---+---+---+
   | 0 | 1 | 0 |         0         |
   +---+---+---+-------------------+
   |       Name Length (8+)        |
   +-------------------------------+
   |  Name String (Length octets)  |
   +---+---+---+---+---+---+---+---+
   | 1 | 0 | 0 | Value Length (5+) |
   +---+---+---+-------------------+
   | Value String (Length Octets)  |
   +-------------------------------+


Literal Header with Incremental Indexing (New Name), Opaque Value

     0   1   2   3   4   5   6   7
   +---+---+---+---+---+---+---+---+
   | 0 | 1 | 0 |         0         |
   +---+---+---+-------------------+
   |       Name Length (8+)        |
   +-------------------------------+
   |  Name String (Length octets)  |
   +---+---+---+---+---+---+---+---+
   | 1 | 1 | 1 | Value Length (5+) |
   +---+---+---+-------------------+
   | Value Octets (Length Octets)  |
   +-------------------------------+


Literal Header with Substitution Indexing (Indexed Name), UTF-8 Value

     0   1   2   3   4   5   6   7
   +---+---+---+---+---+---+---+---+
   | 0 | 0 |      Index (6+)       |
   +---+---+-----------------------+
   |    Substituted Index (8+)     |
   +---+---+---+---+---+---+---+---+
   | 0 | 0 | 0 | Value Length (5+) |
   +---+---+---+-------------------+
   | Value String (Length Octets)  |
   +-------------------------------+

Literal Header with Substitution Indexing (Indexed Name), Integer Value

     0   1   2   3   4   5   6   7
   +---+---+---+---+---+---+---+---+
   | 0 | 0 |      Index (6+)       |
   +---+---+-----------------------+
   |    Substituted Index (8+)     |
   +---+---+---+---+---+---+---+---+
   | 0 | 0 | 1 |    Value (5+)     |
   +---+---+---+---+---+---+---+---+

Literal Header with Substitution Indexing (Indexed Name), Timestamp Value

     0   1   2   3   4   5   6   7
   +---+---+---+---+---+---+---+---+
   | 0 | 0 |      Index (6+)       |
   +---+---+-----------------------+
   |    Substituted Index (8+)     |
   +---+---+---+---+---+---+---+---+
   | 0 | 1 | 0 |    Value (5+)     |
   +---+---+---+---+---+---+---+---+

Literal Header with Substitution Indexing (Indexed Name), Legacy Value

     0   1   2   3   4   5   6   7
   +---+---+---+---+---+---+---+---+
   | 0 | 0 |      Index (6+)       |
   +---+---+-----------------------+
   |    Substituted Index (8+)     |
   +---+---+---+---+---+---+---+---+
   | 1 | 0 | 0 | Value Length (5+) |
   +---+---+---+-------------------+
   | Value String (Length Octets)  |
   +-------------------------------+

Literal Header with Substitution Indexing (Indexed Name), Opaque Value

     0   1   2   3   4   5   6   7
   +---+---+---+---+---+---+---+---+
   | 0 | 0 |      Index (6+)       |
   +---+---+-----------------------+
   |    Substituted Index (8+)     |
   +---+---+---+---+---+---+---+---+
   | 1 | 1 | 1 | Value Length (5+) |
   +---+---+---+-------------------+
   | Value Octets (Length Octets)  |
   +-------------------------------+


Literal Header with Substitution Indexing (New Name), UTF-8 Value

     0   1   2   3   4   5   6   7
   +---+---+---+---+---+---+---+---+
   | 0 | 0 |           0           |
   +---+---+-----------------------+
   |       Name Length (8+)        |
   +-------------------------------+
   |  Name String (Length octets)  |
   +-------------------------------+
   |    Substituted Index (8+)     |
   +---+---+---+---+---+---+---+---+
   | 0 | 0 | 0 | Value Length (5+) |
   +---+---+---+-------------------+
   | Value String (Length Octets)  |
   +-------------------------------+

Literal Header with Substitution Indexing (New Name), Integer Value

     0   1   2   3   4   5   6   7
   +---+---+---+---+---+---+---+---+
   | 0 | 0 |           0           |
   +---+---+-----------------------+
   |       Name Length (8+)        |
   +-------------------------------+
   |  Name String (Length octets)  |
   +-------------------------------+
   |    Substituted Index (8+)     |
   +---+---+---+---+---+---+---+---+
   | 0 | 0 | 1 |    Value (5+)     |
   +---+---+---+---+---+---+---+---+

Literal Header with Substitution Indexing (New Name), Timestamp Value

     0   1   2   3   4   5   6   7
   +---+---+---+---+---+---+---+---+
   | 0 | 0 |           0           |
   +---+---+-----------------------+
   |       Name Length (8+)        |
   +-------------------------------+
   |  Name String (Length octets)  |
   +-------------------------------+
   |    Substituted Index (8+)     |
   +---+---+---+---+---+---+---+---+
   | 0 | 1 | 0 |    Value (5+)     |
   +---+---+---+---+---+---+---+---+

Literal Header with Substitution Indexing (New Name), Legacy Value

     0   1   2   3   4   5   6   7
   +---+---+---+---+---+---+---+---+
   | 0 | 0 |           0           |
   +---+---+-----------------------+
   |       Name Length (8+)        |
   +-------------------------------+
   |  Name String (Length octets)  |
   +-------------------------------+
   |    Substituted Index (8+)     |
   +---+---+---+---+---+---+---+---+
   | 1 | 0 | 0 | Value Length (5+) |
   +---+---+---+-------------------+
   | Value String (Length Octets)  |
   +-------------------------------+

Literal Header with Substitution Indexing (New Name), Opaque Value

     0   1   2   3   4   5   6   7
   +---+---+---+---+---+---+---+---+
   | 0 | 0 |           0           |
   +---+---+-----------------------+
   |       Name Length (8+)        |
   +-------------------------------+
   |  Name String (Length octets)  |
   +-------------------------------+
   |    Substituted Index (8+)     |
   +---+---+---+---+---+---+---+---+
   | 1 | 1 | 1 | Value Length (5+) |
   +---+---+---+-------------------+
   | Value Octets (Length Octets)  |
   +-------------------------------+
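
All thirty layouts above are just the cross product of the hpack
representations and the value types. As a quick sanity check, this
Python sketch enumerates them with their leading bit patterns; the
table names and the layouts() helper are invented for illustration.

```python
from itertools import product

# Leading bits of the first hpack octet for each literal representation.
REPRESENTATIONS = {
    "without Indexing": "011",
    "with Incremental Indexing": "010",
    "with Substitution Indexing": "00",
}
NAME_FORMS = ("Indexed Name", "New Name")
# Leading three bits of the value encoding for each type.
VALUE_TYPES = {
    "UTF-8": "000",
    "Integer": "001",
    "Timestamp": "010",
    "Legacy": "100",
    "Opaque": "111",
}


def layouts():
    """Return (title, hpack leading bits, value leading bits) for every combination."""
    return [
        (f"Literal Header {rep} ({form}), {vtype} Value",
         REPRESENTATIONS[rep], VALUE_TYPES[vtype])
        for rep, form, vtype in product(REPRESENTATIONS, NAME_FORMS, VALUE_TYPES)
    ]
```

Because the value encoding is opaque to hpack, adding a sixth value
type later would add rows to VALUE_TYPES without touching the
representations at all.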

For backwards compatibility with HTTP/1, the following conversion
rules would apply when translating an HTTP/2 request into HTTP/1:

1. UTF-8 values are converted to ASCII, with all non-ASCII
codepoints pct-encoded.
2. Legacy values are passed through without any translation.
3. Opaque values are Base64 encoded.
4. Integer values are converted into the ASCII representation of the
number (e.g. 1234 = "1234").
5. Timestamp values are converted to the equivalent HTTP-date string,
with millisecond precision lost.
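
The five rules map onto a small dispatch function. The Python sketch
below is one possible reading, with several assumptions: the epoch is
the Unix epoch, Base64 means the standard padded alphabet, pct-encoding
uses uppercase hex over the UTF-8 octets, and a literal "%" already in
the value is left alone. The function name to_http1 and the type labels
are hypothetical.

```python
import base64
from email.utils import formatdate


def to_http1(value_type: str, value) -> bytes:
    """Translate one typed HTTP/2 header value into HTTP/1 octets."""
    if value_type == "utf-8":
        # Keep ASCII as-is; pct-encode the UTF-8 octets of everything else.
        return "".join(
            ch if ord(ch) < 128
            else "".join(f"%{octet:02X}" for octet in ch.encode("utf-8"))
            for ch in value
        ).encode("ascii")
    if value_type == "legacy":
        return bytes(value)  # passed through untouched
    if value_type == "opaque":
        return base64.b64encode(value)
    if value_type == "integer":
        return str(value).encode("ascii")
    if value_type == "timestamp":
        # value is milliseconds since the epoch; HTTP-date only has seconds.
        return formatdate(value // 1000, usegmt=True).encode("ascii")
    raise ValueError(f"unknown value type: {value_type!r}")
```

For instance, a Timestamp of 1377273652123 becomes
"Fri, 23 Aug 2013 16:00:52 GMT", with the trailing 123 ms dropped.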

There is no normative translation from HTTP/1 to HTTP/2. All HTTP/1
header field values would be passed without translation using the
Legacy Value Encoding.

Header fields would have to be specifically defined at the http layer
to use the new value encodings. The definitions of existing HTTP/1
header fields remain the same unless those headers are specifically
redefined.

Some of the existing standard headers that we know can be redefined include:

* :status (integer)
* :host (legacy or utf-8)
* :path (legacy or utf-8)
* content-length (legacy or integer)
* date (legacy or timestamp)
* max-forwards (legacy or integer)
* retry-after (legacy, integer or timestamp)
* if-modified-since (legacy or timestamp)
* if-unmodified-since (legacy or timestamp)
* last-modified (legacy or timestamp)
* age (legacy or integer)
* expires (legacy, integer or timestamp)
* etag (legacy or opaque... an opaque etag is always "strong", never "weak")

There is a question about whether we really need to have any
distinction between UTF-8, Legacy and Opaque. First, it's important to
note that, with the exception of the first three bits, these three
types serialize exactly the same on the wire and are handled in
exactly the same way by the header compression mechanism. The type
distinction comes into play only when we want to use the value.
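
To illustrate that point, here is a rough Python decoder sketch. The
wire handling is shared and only the final step depends on the type.
It assumes the hpack draft's prefix-integer scheme; decode_prefix_int,
TYPE_NAMES and decode_value are illustrative names, and a truncated
integer simply raises IndexError.

```python
TYPE_NAMES = {
    0b000: "utf-8",
    0b001: "integer",
    0b010: "timestamp",
    0b100: "legacy",
    0b111: "opaque",
}


def decode_prefix_int(data: bytes, pos: int, prefix_bits: int):
    """Decode a prefix integer starting at data[pos]; return (value, next_pos)."""
    limit = (1 << prefix_bits) - 1
    value = data[pos] & limit
    pos += 1
    if value == limit:
        shift = 0
        while True:
            octet = data[pos]
            pos += 1
            value += (octet & 0x7F) << shift
            shift += 7
            if not octet & 0x80:  # continuation bit clear: last octet
                break
    return value, pos


def decode_value(data: bytes, pos: int = 0):
    """Return (type name, value, next_pos) for one value encoding."""
    kind = TYPE_NAMES.get(data[pos] >> 5)
    if kind is None:
        raise ValueError("unassigned type bits")
    number, pos = decode_prefix_int(data, pos, 5)
    if kind in ("integer", "timestamp"):
        return kind, number, pos
    payload = bytes(data[pos:pos + number])
    if len(payload) != number:
        raise ValueError("truncated value")
    pos += number
    # Only here does the type matter: UTF-8 is decoded to text, while
    # Legacy and Opaque octets are handed back untouched.
    if kind == "utf-8":
        return kind, payload.decode("utf-8"), pos
    return kind, payload, pos
```

A compressor that never needs the value can stop after reading the
length and treat all three octet-carrying types identically.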

Allowing for UTF-8 gives developers the ability to use extended
characters. For instance, we could use IRIs directly within our
requests without any translation to ASCII URIs. That means passing
:host with an IDN rather than its punycode equivalent, and passing
:path with extended characters without being required to do additional
pct-encoding.

Allowing for Legacy gives us a fallback for HTTP/1 compatibility.
There are very few enforced rules in HTTP/1 for values and we need to
have a way of indicating that the value being passed through is
untouched relative to HTTP/1 and follows all the same rules as HTTP/1.

Allowing for Opaque values gives us the opportunity to achieve highly
efficient encodings for things that, in the past, haven't been encoded
very efficiently. It allows us to pass raw binary sequences that ought
not necessarily be interpreted as text, avoiding the need to apply
additional base64 or hex encoding.

Lastly, I know there have been some discussions around structured
alternatives for timestamp encoding. Honestly, I've played around with
a number of variations and kept coming to the same conclusion:
milliseconds since the epoch encoded as a variable length integer
really is good enough. It keeps things simple while delivering a
feature many appdevs have wanted for a very long time.

Received on Friday, 23 August 2013 16:00:52 UTC