RE: Type codecs within hpack

Yes, that was a cut-and-paste error on my part.
On Aug 23, 2013 2:32 PM, "Mike Bishop" <Michael.Bishop@microsoft.com> wrote:

> Integer and Timestamp are defined with the same leading three bits as
> well.  I assume that's a typo.
>
> -----Original Message-----
> From: James M Snell [mailto:jasnell@gmail.com]
> Sent: Friday, August 23, 2013 9:00 AM
> To: ietf-http-wg@w3.org
> Subject: Type codecs within hpack
>
> With the assumption that hpack is what we'll ultimately end up sticking
> with for header encoding, I wanted to take a moment to illustrate how the
> binary type codecs would look within the hpack encoding. (Note: this
> email is **NOT** discussing the alternative header compression I describe
> in my separate I-D; it is about applying the value type encodings to
> hpack.)
>
> It would be very helpful if the various implementers would give some kind
> of indication about whether they'd be willing to implement these encodings
> and, if so, when.
>
> First, a quick note: The type codecs would be defined independently of the
> compression mechanism itself... that is, just as Roberto's been wanting, we
> can separate header value type details out of hpack entirely so that hpack
> can deal specifically with header compression.
>
> Specifically, hpack would be updated such that it would not define a
> specific encoding for header field values. As far as the basic hpack
> mechanism is concerned, the value encoding would be an opaque octet
> sequence.
>
> Literal Header without Indexing (Indexed Name):
>
>      0   1   2   3   4   5   6   7
>    +---+---+---+---+---+---+---+---+
>    | 0 | 1 | 1 |    Index (5+)     |
>    +---+---+---+-------------------+
>    |      Value Encoding (8+)      |
>    +-------------------------------+
>
> Literal Header without Indexing (New Name):
>
>      0   1   2   3   4   5   6   7
>    +---+---+---+---+---+---+---+---+
>    | 0 | 1 | 1 |         0         |
>    +---+---+---+-------------------+
>    |       Name Length (8+)        |
>    +-------------------------------+
>    |  Name String (Length octets)  |
>    +-------------------------------+
>    |      Value Encoding (8+)      |
>    +-------------------------------+
>
> Literal Header with Incremental Indexing (Indexed Name):
>
>      0   1   2   3   4   5   6   7
>    +---+---+---+---+---+---+---+---+
>    | 0 | 1 | 0 |    Index (5+)     |
>    +---+---+---+-------------------+
>    |      Value Encoding (8+)      |
>    +-------------------------------+
>
> Literal Header with Incremental Indexing (New Name):
>
>      0   1   2   3   4   5   6   7
>    +---+---+---+---+---+---+---+---+
>    | 0 | 1 | 0 |         0         |
>    +---+---+---+-------------------+
>    |       Name Length (8+)        |
>    +-------------------------------+
>    |  Name String (Length octets)  |
>    +-------------------------------+
>    |      Value Encoding (8+)      |
>    +-------------------------------+
>
> Literal Header with Substitution Indexing (Indexed Name):
>
>      0   1   2   3   4   5   6   7
>    +---+---+---+---+---+---+---+---+
>    | 0 | 0 |      Index (6+)       |
>    +---+---+-----------------------+
>    |    Substituted Index (8+)     |
>    +-------------------------------+
>    |      Value Encoding (8+)      |
>    +-------------------------------+
>
> Literal Header with Substitution Indexing (New Name):
>
>      0   1   2   3   4   5   6   7
>    +---+---+---+---+---+---+---+---+
>    | 0 | 0 |           0           |
>    +---+---+-----------------------+
>    |       Name Length (8+)        |
>    +-------------------------------+
>    |  Name String (Length octets)  |
>    +-------------------------------+
>    |    Substituted Index (8+)     |
>    +-------------------------------+
>    |      Value Encoding (8+)      |
>    +-------------------------------+
>
>
> There are five possible value encodings which manifest two basic
> patterns... The value types are:
>
> 1. UTF-8
> 2. Legacy
> 3. Opaque
> 4. Integer
> 5. Timestamp
>
> On the wire these look like:
>
> UTF-8
>      0   1   2   3   4   5   6   7
>    +---+---+---+---+---+---+---+---+
>    | 0 | 0 | 0 | Value Length (5+) |
>    +---+---+---+-------------------+
>    | Value String (Length Octets)  |
>    +-------------------------------+
>
> Integer
>
>      0   1   2   3   4   5   6   7
>    +---+---+---+---+---+---+---+---+
>    | 0 | 0 | 1 |    Value (5+)     |
>    +---+---+---+---+---+---+---+---+
>
> Timestamp
>
>      0   1   2   3   4   5   6   7
>    +---+---+---+---+---+---+---+---+
>    | 0 | 0 | 1 |    Value (5+)     |
>    +---+---+---+---+---+---+---+---+
>
> Legacy
>      0   1   2   3   4   5   6   7
>    +---+---+---+---+---+---+---+---+
>    | 1 | 0 | 0 | Value Length (5+) |
>    +---+---+---+-------------------+
>    | Value String (Length Octets)  |
>    +-------------------------------+
>
> Opaque
>      0   1   2   3   4   5   6   7
>    +---+---+---+---+---+---+---+---+
>    | 1 | 1 | 1 | Value Length (5+) |
>    +---+---+---+-------------------+
>    | Value Octets (Length Octets)  |
>    +-------------------------------+
>
>
> Notice that the UTF-8, Legacy and Opaque type encodings are identical
> other than the leading three bits. Likewise, the Integer and Timestamp
> encodings are identical other than the leading three bits.
>
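Because the leading three bits select both the type and the layout, a decoder can dispatch on the first octet of the value alone. A minimal Python sketch of that idea follows. The names TYPE_BY_TAG and classify are hypothetical, and the Timestamp tag 0 1 0 is taken from the combined diagrams further down rather than from the standalone Timestamp diagram, which repeats Integer's 0 0 1.

```python
# Leading three bits of the first value octet -> (type name, layout).
# "length-prefixed": a 5-bit-prefix length followed by that many octets.
# "inline": the 5-bit-prefix integer is itself the value.
TYPE_BY_TAG = {
    0b000: ("utf-8", "length-prefixed"),
    0b001: ("integer", "inline"),
    0b010: ("timestamp", "inline"),   # assumed: per the combined diagrams
    0b100: ("legacy", "length-prefixed"),
    0b111: ("opaque", "length-prefixed"),
}


def classify(first_octet):
    """Return (type, layout) for the first octet of a value encoding."""
    tag = (first_octet >> 5) & 0b111
    if tag not in TYPE_BY_TAG:
        raise ValueError(f"unassigned type tag {tag:03b}")
    return TYPE_BY_TAG[tag]
```

Only the two layouts need distinct parsing code; the type name matters once the value is handed to the HTTP layer.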
> For UTF-8, Legacy and Opaque, the Value Length is encoded as an integer
> with a 5-bit prefix.
>
> For Integer, the Value is encoded as an integer with a 5-bit prefix.
> Negative or fractional values cannot be represented. Theoretically there
> is no upper value limit to this encoding, however, it would likely be good
> to recommend that only values up to 64-bits are encoded.
>
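For implementers weighing this, here is a rough Python sketch of the Integer value. It assumes the prefix-coded integer representation from the hpack draft (saturate the prefix, then emit 7-bit groups, least significant first, with a continuation bit). The function names are hypothetical, and the 64-bit cap is the recommendation above turned into a hard check purely for illustration.

```python
INTEGER_TAG = 0b001 << 5  # leading three bits 0 0 1


def encode_integer_value(value: int) -> bytes:
    """Encode a non-negative integer as a typed value with a 5-bit prefix."""
    if not 0 <= value < 1 << 64:
        raise ValueError("only non-negative values up to 64 bits are supported")
    if value < 31:                        # fits inside the 5-bit prefix
        return bytes([INTEGER_TAG | value])
    out = bytearray([INTEGER_TAG | 31])   # prefix saturated: all ones
    value -= 31
    while value >= 128:                   # 7 bits per octet, low bits first
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    out.append(value)
    return bytes(out)


def decode_integer_value(data: bytes) -> int:
    """Inverse of encode_integer_value; expects one complete encoded value."""
    if data[0] >> 5 != 0b001:
        raise ValueError("not an Integer value")
    value = data[0] & 0x1F
    if value < 31:
        return value
    shift = 0
    for octet in data[1:]:
        value += (octet & 0x7F) << shift
        shift += 7
        if not octet & 0x80:              # continuation bit clear: last octet
            return value
    raise ValueError("truncated integer")
```

Values below 31 take a single octet; a :status of 200 takes three, the same as its ASCII form, so the gain here is mostly in parsing rather than size.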
> For Timestamp, the number of milliseconds since the epoch is encoded as an
> integer with a 5-bit prefix. Dates before the epoch cannot be represented.
>
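A comparable sketch for the Timestamp value, again in Python and again only illustrative. It assumes "the epoch" means the Unix epoch (1970-01-01T00:00:00Z), that the type bits are 0 1 0 as in the combined diagrams below, and the same prefix-coded integer scheme as the hpack draft. The names EPOCH, TIMESTAMP_TAG and encode_timestamp_value are hypothetical.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)  # assumed: the Unix epoch
TIMESTAMP_TAG = 0b010 << 5                         # assumed: leading bits 0 1 0


def encode_timestamp_value(when: datetime) -> bytes:
    """Encode a timezone-aware datetime as milliseconds since the epoch."""
    millis = (when - EPOCH) // timedelta(milliseconds=1)  # exact integer math
    if millis < 0:
        raise ValueError("dates before the epoch cannot be represented")
    if millis < 31:                         # fits inside the 5-bit prefix
        return bytes([TIMESTAMP_TAG | millis])
    out = bytearray([TIMESTAMP_TAG | 31])   # prefix saturated: all ones
    millis -= 31
    while millis >= 128:                    # 7 bits per octet, low bits first
        out.append((millis & 0x7F) | 0x80)
        millis >>= 7
    out.append(millis)
    return bytes(out)


sent = datetime(2013, 8, 23, 21, 37, 5, tzinfo=timezone.utc)
encoded = encode_timestamp_value(sent)
print(len(encoded))                          # octets on the wire
print(len("Fri, 23 Aug 2013 21:37:05 GMT"))  # octets as an HTTP-date string
```

For that date the encoded value is 7 octets, against 29 for the HTTP-date string.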
> If we put these types together with hpack, this is what we end up with:
>
> Literal Header without Indexing (Indexed Name), UTF-8 Value
>
>      0   1   2   3   4   5   6   7
>    +---+---+---+---+---+---+---+---+
>    | 0 | 1 | 1 |    Index (5+)     |
>    +---+---+---+---+---+---+---+---+
>    | 0 | 0 | 0 | Value Length (5+) |
>    +---+---+---+-------------------+
>    | Value String (Length Octets)  |
>    +-------------------------------+
>
> Literal Header without Indexing (Indexed Name), Integer Value
>
>      0   1   2   3   4   5   6   7
>    +---+---+---+---+---+---+---+---+
>    | 0 | 1 | 1 |    Index (5+)     |
>    +---+---+---+---+---+---+---+---+
>    | 0 | 0 | 1 |    Value (5+)     |
>    +---+---+---+---+---+---+---+---+
>
> Literal Header without Indexing (Indexed Name), Timestamp Value
>
>      0   1   2   3   4   5   6   7
>    +---+---+---+---+---+---+---+---+
>    | 0 | 1 | 1 |    Index (5+)     |
>    +---+---+---+---+---+---+---+---+
>    | 0 | 1 | 0 |    Value (5+)     |
>    +---+---+---+---+---+---+---+---+
>
> Literal Header without Indexing (Indexed Name), Legacy Value
>
>      0   1   2   3   4   5   6   7
>    +---+---+---+---+---+---+---+---+
>    | 0 | 1 | 1 |    Index (5+)     |
>    +---+---+---+---+---+---+---+---+
>    | 1 | 0 | 0 | Value Length (5+) |
>    +---+---+---+-------------------+
>    | Value String (Length Octets)  |
>    +-------------------------------+
>
> Literal Header without Indexing (Indexed Name), Opaque Value
>
>      0   1   2   3   4   5   6   7
>    +---+---+---+---+---+---+---+---+
>    | 0 | 1 | 1 |    Index (5+)     |
>    +---+---+---+---+---+---+---+---+
>    | 1 | 1 | 1 | Value Length (5+) |
>    +---+---+---+-------------------+
>    | Value Octets (Length Octets)  |
>    +-------------------------------+
>
> Literal Header without Indexing (New Name), UTF-8 Value
>
>      0   1   2   3   4   5   6   7
>    +---+---+---+---+---+---+---+---+
>    | 0 | 1 | 1 |         0         |
>    +---+---+---+-------------------+
>    |       Name Length (8+)        |
>    +-------------------------------+
>    |  Name String (Length octets)  |
>    +---+---+---+---+---+---+---+---+
>    | 0 | 0 | 0 | Value Length (5+) |
>    +---+---+---+-------------------+
>    | Value String (Length Octets)  |
>    +-------------------------------+
>
> Literal Header without Indexing (New Name), Integer Value
>
>      0   1   2   3   4   5   6   7
>    +---+---+---+---+---+---+---+---+
>    | 0 | 1 | 1 |         0         |
>    +---+---+---+-------------------+
>    |       Name Length (8+)        |
>    +-------------------------------+
>    |  Name String (Length octets)  |
>    +---+---+---+---+---+---+---+---+
>    | 0 | 0 | 1 |    Value (5+)     |
>    +---+---+---+---+---+---+---+---+
>
> Literal Header without Indexing (New Name), Timestamp Value
>
>      0   1   2   3   4   5   6   7
>    +---+---+---+---+---+---+---+---+
>    | 0 | 1 | 1 |         0         |
>    +---+---+---+-------------------+
>    |       Name Length (8+)        |
>    +-------------------------------+
>    |  Name String (Length octets)  |
>    +---+---+---+---+---+---+---+---+
>    | 0 | 1 | 0 |    Value (5+)     |
>    +---+---+---+---+---+---+---+---+
>
> Literal Header without Indexing (New Name), Legacy Value
>
>      0   1   2   3   4   5   6   7
>    +---+---+---+---+---+---+---+---+
>    | 0 | 1 | 1 |         0         |
>    +---+---+---+-------------------+
>    |       Name Length (8+)        |
>    +-------------------------------+
>    |  Name String (Length octets)  |
>    +---+---+---+---+---+---+---+---+
>    | 1 | 0 | 0 | Value Length (5+) |
>    +---+---+---+-------------------+
>    | Value String (Length Octets)  |
>    +-------------------------------+
>
> Literal Header without Indexing (New Name), Opaque Value
>
>      0   1   2   3   4   5   6   7
>    +---+---+---+---+---+---+---+---+
>    | 0 | 1 | 1 |         0         |
>    +---+---+---+-------------------+
>    |       Name Length (8+)        |
>    +-------------------------------+
>    |  Name String (Length octets)  |
>    +---+---+---+---+---+---+---+---+
>    | 1 | 1 | 1 | Value Length (5+) |
>    +---+---+---+-------------------+
>    | Value Octets (Length Octets)  |
>    +-------------------------------+
>
> Literal Header with Incremental Indexing (Indexed Name), UTF-8 Value
>
>      0   1   2   3   4   5   6   7
>    +---+---+---+---+---+---+---+---+
>    | 0 | 1 | 0 |    Index (5+)     |
>    +---+---+---+---+---+---+---+---+
>    | 0 | 0 | 0 | Value Length (5+) |
>    +---+---+---+-------------------+
>    | Value String (Length Octets)  |
>    +-------------------------------+
>
> Literal Header with Incremental Indexing (Indexed Name), Integer Value
>
>      0   1   2   3   4   5   6   7
>    +---+---+---+---+---+---+---+---+
>    | 0 | 1 | 0 |    Index (5+)     |
>    +---+---+---+---+---+---+---+---+
>    | 0 | 0 | 1 |    Value (5+)     |
>    +---+---+---+---+---+---+---+---+
>
> Literal Header with Incremental Indexing (Indexed Name), Timestamp Value
>
>      0   1   2   3   4   5   6   7
>    +---+---+---+---+---+---+---+---+
>    | 0 | 1 | 0 |    Index (5+)     |
>    +---+---+---+---+---+---+---+---+
>    | 0 | 1 | 0 |    Value (5+)     |
>    +---+---+---+---+---+---+---+---+
>
> Literal Header with Incremental Indexing (Indexed Name), Legacy Value
>
>      0   1   2   3   4   5   6   7
>    +---+---+---+---+---+---+---+---+
>    | 0 | 1 | 0 |    Index (5+)     |
>    +---+---+---+---+---+---+---+---+
>    | 1 | 0 | 0 | Value Length (5+) |
>    +---+---+---+-------------------+
>    | Value String (Length Octets)  |
>    +-------------------------------+
>
> Literal Header with Incremental Indexing (Indexed Name), Opaque Value
>
>      0   1   2   3   4   5   6   7
>    +---+---+---+---+---+---+---+---+
>    | 0 | 1 | 0 |    Index (5+)     |
>    +---+---+---+---+---+---+---+---+
>    | 1 | 1 | 1 | Value Length (5+) |
>    +---+---+---+-------------------+
>    | Value Octets (Length Octets)  |
>    +-------------------------------+
>
> Literal Header with Incremental Indexing (New Name), UTF-8 Value
>
>      0   1   2   3   4   5   6   7
>    +---+---+---+---+---+---+---+---+
>    | 0 | 1 | 0 |         0         |
>    +---+---+---+-------------------+
>    |       Name Length (8+)        |
>    +-------------------------------+
>    |  Name String (Length octets)  |
>    +---+---+---+---+---+---+---+---+
>    | 0 | 0 | 0 | Value Length (5+) |
>    +---+---+---+-------------------+
>    | Value String (Length Octets)  |
>    +-------------------------------+
>
> Literal Header with Incremental Indexing (New Name), Integer Value
>
>      0   1   2   3   4   5   6   7
>    +---+---+---+---+---+---+---+---+
>    | 0 | 1 | 0 |         0         |
>    +---+---+---+-------------------+
>    |       Name Length (8+)        |
>    +-------------------------------+
>    |  Name String (Length octets)  |
>    +---+---+---+---+---+---+---+---+
>    | 0 | 0 | 1 |    Value (5+)     |
>    +---+---+---+---+---+---+---+---+
>
> Literal Header with Incremental Indexing (New Name), Timestamp Value
>
>      0   1   2   3   4   5   6   7
>    +---+---+---+---+---+---+---+---+
>    | 0 | 1 | 0 |         0         |
>    +---+---+---+-------------------+
>    |       Name Length (8+)        |
>    +-------------------------------+
>    |  Name String (Length octets)  |
>    +---+---+---+---+---+---+---+---+
>    | 0 | 1 | 0 |    Value (5+)     |
>    +---+---+---+---+---+---+---+---+
>
> Literal Header with Incremental Indexing (New Name), Legacy Value
>
>      0   1   2   3   4   5   6   7
>    +---+---+---+---+---+---+---+---+
>    | 0 | 1 | 0 |         0         |
>    +---+---+---+-------------------+
>    |       Name Length (8+)        |
>    +-------------------------------+
>    |  Name String (Length octets)  |
>    +---+---+---+---+---+---+---+---+
>    | 1 | 0 | 0 | Value Length (5+) |
>    +---+---+---+-------------------+
>    | Value String (Length Octets)  |
>    +-------------------------------+
>
>
> Literal Header with Incremental Indexing (New Name), Opaque Value
>
>      0   1   2   3   4   5   6   7
>    +---+---+---+---+---+---+---+---+
>    | 0 | 1 | 0 |         0         |
>    +---+---+---+-------------------+
>    |       Name Length (8+)        |
>    +-------------------------------+
>    |  Name String (Length octets)  |
>    +---+---+---+---+---+---+---+---+
>    | 1 | 1 | 1 | Value Length (5+) |
>    +---+---+---+-------------------+
>    | Value Octets (Length Octets)  |
>    +-------------------------------+
>
>
> Literal Header with Substitution Indexing (Indexed Name), UTF-8 Value
>
>      0   1   2   3   4   5   6   7
>    +---+---+---+---+---+---+---+---+
>    | 0 | 0 |      Index (6+)       |
>    +---+---+-----------------------+
>    |    Substituted Index (8+)     |
>    +---+---+---+---+---+---+---+---+
>    | 0 | 0 | 0 | Value Length (5+) |
>    +---+---+---+-------------------+
>    | Value String (Length Octets)  |
>    +-------------------------------+
>
> Literal Header with Substitution Indexing (Indexed Name), Integer Value
>
>      0   1   2   3   4   5   6   7
>    +---+---+---+---+---+---+---+---+
>    | 0 | 0 |      Index (6+)       |
>    +---+---+-----------------------+
>    |    Substituted Index (8+)     |
>    +---+---+---+---+---+---+---+---+
>    | 0 | 0 | 1 |    Value (5+)     |
>    +---+---+---+---+---+---+---+---+
>
> Literal Header with Substitution Indexing (Indexed Name), Timestamp Value
>
>      0   1   2   3   4   5   6   7
>    +---+---+---+---+---+---+---+---+
>    | 0 | 0 |      Index (6+)       |
>    +---+---+-----------------------+
>    |    Substituted Index (8+)     |
>    +---+---+---+---+---+---+---+---+
>    | 0 | 1 | 0 |    Value (5+)     |
>    +---+---+---+---+---+---+---+---+
>
> Literal Header with Substitution Indexing (Indexed Name), Legacy Value
>
>      0   1   2   3   4   5   6   7
>    +---+---+---+---+---+---+---+---+
>    | 0 | 0 |      Index (6+)       |
>    +---+---+-----------------------+
>    |    Substituted Index (8+)     |
>    +---+---+---+---+---+---+---+---+
>    | 1 | 0 | 0 | Value Length (5+) |
>    +---+---+---+-------------------+
>    | Value String (Length Octets)  |
>    +-------------------------------+
>
> Literal Header with Substitution Indexing (Indexed Name), Opaque Value
>
>      0   1   2   3   4   5   6   7
>    +---+---+---+---+---+---+---+---+
>    | 0 | 0 |      Index (6+)       |
>    +---+---+-----------------------+
>    |    Substituted Index (8+)     |
>    +---+---+---+---+---+---+---+---+
>    | 1 | 1 | 1 | Value Length (5+) |
>    +---+---+---+-------------------+
>    | Value Octets (Length Octets)  |
>    +-------------------------------+
>
>
> Literal Header with Substitution Indexing (New Name), UTF-8 Value
>
>      0   1   2   3   4   5   6   7
>    +---+---+---+---+---+---+---+---+
>    | 0 | 0 |           0           |
>    +---+---+-----------------------+
>    |       Name Length (8+)        |
>    +-------------------------------+
>    |  Name String (Length octets)  |
>    +-------------------------------+
>    |    Substituted Index (8+)     |
>    +---+---+---+---+---+---+---+---+
>    | 0 | 0 | 0 | Value Length (5+) |
>    +---+---+---+-------------------+
>    | Value String (Length Octets)  |
>    +-------------------------------+
>
> Literal Header with Substitution Indexing (New Name), Integer Value
>
>      0   1   2   3   4   5   6   7
>    +---+---+---+---+---+---+---+---+
>    | 0 | 0 |           0           |
>    +---+---+-----------------------+
>    |       Name Length (8+)        |
>    +-------------------------------+
>    |  Name String (Length octets)  |
>    +-------------------------------+
>    |    Substituted Index (8+)     |
>    +---+---+---+---+---+---+---+---+
>    | 0 | 0 | 1 |    Value (5+)     |
>    +---+---+---+---+---+---+---+---+
>
> Literal Header with Substitution Indexing (New Name), Timestamp Value
>
>      0   1   2   3   4   5   6   7
>    +---+---+---+---+---+---+---+---+
>    | 0 | 0 |           0           |
>    +---+---+-----------------------+
>    |       Name Length (8+)        |
>    +-------------------------------+
>    |  Name String (Length octets)  |
>    +-------------------------------+
>    |    Substituted Index (8+)     |
>    +---+---+---+---+---+---+---+---+
>    | 0 | 1 | 0 |    Value (5+)     |
>    +---+---+---+---+---+---+---+---+
>
> Literal Header with Substitution Indexing (New Name), Legacy Value
>
>      0   1   2   3   4   5   6   7
>    +---+---+---+---+---+---+---+---+
>    | 0 | 0 |           0           |
>    +---+---+-----------------------+
>    |       Name Length (8+)        |
>    +-------------------------------+
>    |  Name String (Length octets)  |
>    +-------------------------------+
>    |    Substituted Index (8+)     |
>    +---+---+---+---+---+---+---+---+
>    | 1 | 0 | 0 | Value Length (5+) |
>    +---+---+---+-------------------+
>    | Value String (Length Octets)  |
>    +-------------------------------+
>
> Literal Header with Substitution Indexing (New Name), Opaque Value
>
>      0   1   2   3   4   5   6   7
>    +---+---+---+---+---+---+---+---+
>    | 0 | 0 |           0           |
>    +---+---+-----------------------+
>    |       Name Length (8+)        |
>    +-------------------------------+
>    |  Name String (Length octets)  |
>    +-------------------------------+
>    |    Substituted Index (8+)     |
>    +---+---+---+---+---+---+---+---+
>    | 1 | 1 | 1 | Value Length (5+) |
>    +---+---+---+-------------------+
>    | Value Octets (Length Octets)  |
>    +-------------------------------+
>
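To make the composition concrete, the two "Literal Header without Indexing" layouts can be sketched in Python as below. This is an illustration under assumptions, not a reference encoder: the helper names are hypothetical, the integer coding is assumed to be the prefix scheme from the hpack draft, and how the Index field relates to a header table position is left to hpack (the caller passes the field value as it should appear on the wire, with 0 reserved for New Name as in the diagrams).

```python
def prefixed_int(value: int, prefix_bits: int, high_bits: int = 0) -> bytes:
    """Prefix-coded integer; high_bits fills the bits above the prefix."""
    limit = (1 << prefix_bits) - 1
    if value < limit:
        return bytes([high_bits | value])
    out = bytearray([high_bits | limit])
    value -= limit
    while value >= 128:
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    out.append(value)
    return bytes(out)


def octets_value(tag: int, octets: bytes) -> bytes:
    """UTF-8 (0b000), Legacy (0b100) or Opaque (0b111): tag, length, octets."""
    return prefixed_int(len(octets), 5, tag << 5) + octets


def literal_without_indexing(index_field: int, value_encoding: bytes) -> bytes:
    """0 1 1 | Index (5+), followed by the value encoding as opaque octets."""
    if index_field == 0:
        raise ValueError("0 is reserved for the New Name form")
    return prefixed_int(index_field, 5, 0b011 << 5) + value_encoding


def literal_without_indexing_new_name(name: bytes, value_encoding: bytes) -> bytes:
    """0 1 1 | 0, Name Length (8+), Name String, then the value encoding."""
    return bytes([0b011 << 5]) + prefixed_int(len(name), 8) + name + value_encoding
```

Note that the two header functions never look inside value_encoding, which is the separation described at the top of this message: hpack only concatenates, and the type codec owns the value octets.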
> For backwards compatibility with HTTP/1, the following conversion rules
> would apply when translating an HTTP/2 request into HTTP/1:
>
> 1. UTF-8 value encodings are converted to ASCII, with all non-ASCII
> codepoints pct-encoded.
> 2. Legacy value encodings are passed through without any translation.
> 3. Opaque values are Base64 encoded.
> 4. Integer values are converted into the ASCII representation of the
> number (e.g. 1234 = "1234").
> 5. Timestamp values are converted to the HTTP-date equivalent string, with
> millisecond precision lost.
>
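The five conversion rules could look roughly like this in Python. This is a sketch under assumptions: the function name to_http1 and the string type labels are hypothetical, Base64 is assumed to be the standard alphabet with padding (the message does not say which variant), and timestamps are assumed to count from the Unix epoch.

```python
import base64
from email.utils import formatdate


def to_http1(kind: str, value) -> bytes:
    """Translate one typed HTTP/2 header value into HTTP/1 field octets.

    value is bytes for "utf-8", "legacy" and "opaque", and an int for
    "integer" and "timestamp" (milliseconds since the epoch).
    """
    if kind == "utf-8":
        value.decode("utf-8")  # reject octets that are not valid UTF-8
        # ASCII octets pass through; each non-ASCII octet is pct-encoded.
        return b"".join(
            bytes([b]) if b < 0x80 else b"%%%02X" % b for b in value
        )
    if kind == "legacy":
        return value                      # passed through untouched
    if kind == "opaque":
        return base64.b64encode(value)    # assumed: standard Base64, padded
    if kind == "integer":
        return str(value).encode("ascii")
    if kind == "timestamp":
        # Whole seconds only: millisecond precision is dropped.
        return formatdate(value // 1000, usegmt=True).encode("ascii")
    raise ValueError(f"unknown value type {kind!r}")
```

For example, a Timestamp of 1377293825123 becomes "Fri, 23 Aug 2013 21:37:05 GMT", and the trailing 123 milliseconds are discarded.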
> There is no normative translation from HTTP/1 to HTTP/2. All HTTP/1 header
> field values would be passed without translation using the Legacy Value
> Encoding.
>
> Header fields would have to be specifically defined at the http layer to
> use the new value encodings. The definitions of existing HTTP/1 header
> fields remain the same unless those headers are specifically redefined.
>
> Some of the existing standard headers that we know can be redefined
> include:
>
> * :status (integer)
> * :host (legacy or utf-8)
> * :path (legacy or utf-8)
> * content-length (legacy or integer)
> * date (legacy or timestamp)
> * max-forwards (legacy or integer)
> * retry-after (legacy, integer or timestamp)
> * if-modified-since (legacy or timestamp)
> * if-unmodified-since (legacy or timestamp)
> * last-modified (legacy or timestamp)
> * age (legacy or integer)
> * expires (legacy, integer or timestamp)
> * etag (legacy or opaque... an opaque etag is always "strong", never
> "weak")
>
> There is a question about whether we really need to have any distinction
> between UTF-8, Legacy and Opaque. First, it's important to note that, with
> the exception of the first three bits, these three types serialize exactly
> the same on the wire and are handled in exactly the same way by the header
> compression mechanism. The type distinction comes into play only when we
> want to use the value.
>
> Allowing for UTF-8 gives developers the ability to use extended
> characters. For instance, we could use IRIs directly within our requests
> without any translation to ASCII URIs. That means passing :host with an
> IDN rather than its punycode equivalent, and passing :path with extended
> characters without being required to do additional pct-encoding.
>
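As a quick illustration of what the UTF-8 type saves, the Python snippet below puts the ASCII forms required today next to the direct UTF-8 forms. The host and path are made-up examples, and Python's built-in "idna" codec is used only as a convenient stand-in for the punycode conversion.

```python
from urllib.parse import quote

host = "bücher.example"   # hypothetical IDN
path = "/café/menü"       # hypothetical path with extended characters

# What has to be sent today: punycode host, pct-encoded path.
ascii_host = host.encode("idna")
ascii_path = quote(path).encode("ascii")

# What a UTF-8 typed value would carry directly.
utf8_host = host.encode("utf-8")
utf8_path = path.encode("utf-8")

print(ascii_host, len(ascii_host), len(utf8_host))
print(ascii_path, len(ascii_path), len(utf8_path))
```

Here the host shrinks from 21 octets to 15 and the path from 20 to 12, and neither needs a translation step on the sending or receiving side.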
> Allowing for Legacy gives us a fallback for HTTP/1 compatibility.
> There are very few enforced rules in HTTP/1 for values and we need to have
> a way of indicating that the value being passed through is untouched
> relative to HTTP/1 and follows all the same rules as HTTP/1.
>
> Allowing for Opaque values gives us the opportunity to achieve highly
> efficient encodings for things that, in the past, haven't been encoded very
> efficiently. It allows us to pass raw binary sequences that ought not
> necessarily to be interpreted as text, avoiding the need to apply
> additional base64 or hex encoding.
>
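To put a number on that, the Python snippet below compares the payload size of a raw 32-octet digest with its base64 and hex text forms. The digest input is an arbitrary example, and the sizes exclude the length prefix, which every form needs.

```python
import base64
import hashlib

digest = hashlib.sha256(b"example body").digest()  # 32 raw octets

sizes = {
    "opaque": len(digest),                    # raw octets, sent as-is
    "base64": len(base64.b64encode(digest)),  # text form, padded
    "hex": len(digest.hex()),                 # text form, two chars per octet
}
print(sizes)
```

The raw form is 32 octets, against 44 for base64 and 64 for hex.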
> Lastly, I know there have been some discussions around structured
> alternatives for timestamp encoding. Honestly, I've played around with a
> number of variations and kept coming to the same conclusion:
> milliseconds since the epoch encoded as a variable length integer really
> is good enough. It keeps things simple while delivering a feature many
> appdevs have been asking for for a very long time.
>
>

Received on Friday, 23 August 2013 21:37:05 UTC