#445: Transfer-codings

Here is my proposal for reintroducing transport-level compression:


6.1 DATA

New DATA Frame Payload:

```
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 | Pad High? (8) |  Pad Low? (8) |
 +---------------+---------------+
 |        Encoding? (16)         |
 +-------------------------------+-------------------------------+
 |                            Data (*)                         ...
 +---------------------------------------------------------------+
 |                           Padding (*)                       ...
 +---------------------------------------------------------------+
```

With the description:

```
Encoding: A 16-bit identifier which describes the encoding that has been
applied to the Data field. This field is optional and is only present if
the ENCODED flag is set.  A sender MUST NOT apply an encoding that the peer
has not first advertised in a SETTINGS_ACCEPT_DATA_ENCODING settings frame,
or that the peer advertised with a rank of 0. Endpoints that receive a
frame with an encoding identifier they do not support MUST treat this as
a connection error of type PROTOCOL_ERROR.
```

And a new flag:

```
ENCODED (0x20): Bit 6 being set indicates that the Encoding field is
present and describes the encoding that has been applied to the Data field.
```

This is intentionally designed to make it easy for people to, for example,
detect the flag and immediately PROTOCOL_ERROR.  Hopefully it's not too
much skin off anyone's nose.
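
To make the receive side concrete, here is a rough Python sketch of how an endpoint might pick the new payload apart. It is only an illustration under assumptions: the PAD_LOW and PAD_HIGH flag values are not given in this message and are assumed here, and the `ProtocolError` class and `parse_data_payload` function are hypothetical names. An endpoint that doesn't want to play passes an empty `supported` set, so any ENCODED frame is rejected straight away.

```python
import struct

FLAG_PAD_LOW = 0x08   # assumed value, not stated in this proposal
FLAG_PAD_HIGH = 0x10  # assumed value, not stated in this proposal
FLAG_ENCODED = 0x20   # from the proposal


class ProtocolError(Exception):
    """Stands in for a connection error of type PROTOCOL_ERROR."""


def parse_data_payload(flags, payload, supported=frozenset()):
    """Split a DATA frame payload into (encoding id or None, data bytes)."""
    offset = 0
    pad_len = 0

    # Optional padding length octets come first: Pad High, then Pad Low.
    if flags & FLAG_PAD_HIGH:
        if len(payload) < offset + 1:
            raise ProtocolError("truncated Pad High field")
        pad_len += payload[offset] * 256
        offset += 1
    if flags & FLAG_PAD_LOW:
        if len(payload) < offset + 1:
            raise ProtocolError("truncated Pad Low field")
        pad_len += payload[offset]
        offset += 1

    # The Encoding field is only present when ENCODED is set.
    encoding = None
    if flags & FLAG_ENCODED:
        if len(payload) < offset + 2:
            raise ProtocolError("truncated Encoding field")
        (encoding,) = struct.unpack_from("!H", payload, offset)
        offset += 2
        if encoding not in supported:
            raise ProtocolError("unsupported encoding %d" % encoding)

    # Whatever is left, minus trailing padding, is the Data field.
    end = len(payload) - pad_len
    if end < offset:
        raise ProtocolError("padding longer than payload")
    return encoding, payload[offset:end]
```

The returned data is still encoded; removing the encoding is a separate step.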



6.5.2 Defined SETTINGS Parameters

New parameter:

```
SETTINGS_ACCEPT_DATA_ENCODING (5): Indicates the sender's ability and
willingness to receive DATA frames that are encoded using the scheme
identified in the Value.

The Value field is further divided into two sub-fields, an unsigned 16-bit
encoding identifier and an unsigned 16-bit rank.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |         Encoding (16)         |           Rank (16)           |
 +-------------------------------+-------------------------------+

The following encodings are defined:

  ENCODING_COMPRESS (1):
   The "compress" coding is an adaptive Lempel-Ziv-Welch (LZW) coding that
   is commonly produced by the UNIX file compression program "compress".

  ENCODING_DEFLATE (2):
   The "deflate" coding is a "zlib" data format [RFC1950] containing a
   "deflate" compressed data stream [RFC1951] that uses a combination of
   the Lempel-Ziv (LZ77) compression algorithm and Huffman coding.

  ENCODING_GZIP (3):
   The "gzip" coding is an LZ77 coding with a 32-bit CRC that is commonly
   produced by the gzip file compression program [RFC1952].

An endpoint MAY ignore a SETTINGS_ACCEPT_DATA_ENCODING parameter with an
encoding identifier it does not recognise or support.

The rank fulfils the same role as in the HTTP/1.1 TE header. The rank value
is an integer in the range 0 through 65,535, where 1 is the least preferred
and 65,535 is the most preferred; a value of 0 means "not acceptable".
```
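
As a sketch of how the Value field and the rank might be used in practice, the Python below packs and unpacks the two sub-fields and picks the peer's most preferred encoding that the sender also supports. The function names are hypothetical, and it assumes that a later advertisement for the same encoding identifier replaces an earlier one, which the proposal text does not spell out.

```python
def pack_accept_encoding(encoding, rank):
    """Build the 32-bit Value: encoding in the high 16 bits, rank in the low."""
    if not (0 <= encoding <= 0xFFFF and 0 <= rank <= 0xFFFF):
        raise ValueError("encoding and rank must each fit in 16 bits")
    return (encoding << 16) | rank


def unpack_accept_encoding(value):
    """Split a 32-bit Value into (encoding, rank)."""
    return (value >> 16) & 0xFFFF, value & 0xFFFF


def choose_encoding(advertised_values, supported):
    """Return the best mutually supported encoding id, or None.

    advertised_values: Values received from the peer, in arrival order.
    supported: set of encoding ids this sender is able to apply.
    """
    ranks = {}
    for value in advertised_values:
        encoding, rank = unpack_accept_encoding(value)
        ranks[encoding] = rank  # assumption: the latest advertisement wins

    best, best_rank = None, 0
    for encoding, rank in ranks.items():
        # Rank 0 means "not acceptable"; unknown encodings are ignored.
        if rank == 0 or encoding not in supported:
            continue
        if rank > best_rank:
            best, best_rank = encoding, rank
    return best
```

A sender that gets `None` back simply sends its DATA frames without the ENCODED flag.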

The three encoding schemes are taken straight from HTTPbis-p1 section 4.2.
Again, it's designed specifically so that people can ignore it if they
don't want to play along; but it grants us adventurous folk the chance to
play.


8.1.3.5 Malformed Messages

Clarify the content-length:

```
A request or response that includes an entity body can include a
content-length header field. A request or response is also malformed if the
value of a content-length header field does not equal the sum of the DATA
frame payload lengths, after any encoding is removed, that form the body.
```
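
A rough sketch of that check in Python follows. It assumes that each DATA frame's Data field is encoded on its own rather than as one stream spanning several frames, which the text above does not settle. The helper names are hypothetical, and ENCODING_COMPRESS is left out because the Python standard library has no LZW decoder.

```python
import gzip
import zlib

ENCODING_DEFLATE = 2  # "zlib" format [RFC1950]
ENCODING_GZIP = 3     # gzip format [RFC1952]


def decode_data(encoding, data):
    """Remove the encoding from one Data field (None means not encoded)."""
    if encoding is None:
        return data
    if encoding == ENCODING_DEFLATE:
        return zlib.decompress(data)
    if encoding == ENCODING_GZIP:
        return gzip.decompress(data)
    raise ValueError("no decoder for encoding %d" % encoding)


def body_matches_content_length(frames, content_length):
    """frames: iterable of (encoding or None, data bytes), one per DATA frame.

    Returns False when the message would be malformed under the rule above.
    """
    total = sum(len(decode_data(encoding, data)) for encoding, data in frames)
    return total == content_length
```

Padding and the Encoding field itself are assumed to have been stripped before the frames reach this check, so only the decoded body octets are counted.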

Cheers
-- 
  Matthew Kerwin
  http://matthew.kerwin.net.au/

Received on Friday, 4 April 2014 02:39:40 UTC