Re: #445: Transfer-codings

My personal .02 - 

Given that HTTP/1 hasn't seen much use of transfer-codings, and that in all the years we've had them only three have been defined (discounting chunked), spending 16 bits on this in every DATA frame doesn't seem justified.

I'd much rather just have a flag that indicates 'gzip', and a corresponding setting; that doesn't require any frame format changes at all. If another encoding becomes necessary, it can get into a subsequent version (since we've repeatedly decided to favour faster versioning over broad extensibility in the non-semantic layer).

Also, how should a recipient handle a stream that has DATA frames with different values for encoding?
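
To make that question concrete, here is a rough Python sketch of the receiver side. It only illustrates the ambiguity and does not answer it. The ENCODED flag value (0x20) and the 16-bit Encoding field come from the quoted proposal; the function names are made up for illustration, and padding is ignored for brevity.

```python
import struct

ENCODED = 0x20  # flag value taken from the quoted proposal


def frame_encoding(flags, payload):
    """Return (encoding id or None, remaining payload) for one DATA frame.

    Assumption: no padding fields are present, so the optional 16-bit
    Encoding field (network byte order) is the first thing in the payload.
    """
    if flags & ENCODED:
        if len(payload) < 2:
            raise ValueError("ENCODED flag set but payload shorter than 2 bytes")
        (encoding,) = struct.unpack("!H", payload[:2])
        return encoding, payload[2:]
    return None, payload


def encodings_in_stream(frames):
    """List the distinct encoding values seen on one stream, in arrival order.

    `frames` is an iterable of (flags, payload) pairs. None stands for a
    frame that was sent without the ENCODED flag.
    """
    seen = []
    for flags, payload in frames:
        encoding, _ = frame_encoding(flags, payload)
        if encoding not in seen:
            seen.append(encoding)
    return seen


# One stream carrying a gzip (3) frame, an unencoded frame, and a deflate (2) frame.
frames = [(0x20, b"\x00\x03abc"), (0x00, b"def"), (0x20, b"\x00\x02xyz")]
print(encodings_in_stream(frames))  # [3, None, 2]
```

Nothing in the proposal says whether a result with more than one entry is an error, or whether each frame is simply decoded on its own.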

Regards,


On 4 Apr 2014, at 1:39 pm, Matthew Kerwin <matthew@kerwin.net.au> wrote:

> Here is my proposal for reintroducing transport-level compression:
> 
> 
> 6.1 DATA
> 
> New DATA Frame Payload:
> 
> ```
>  0                   1                   2                   3
>  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
>  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  | Pad High? (8) |  Pad Low? (8) |
>  +---------------+---------------+
>  |        Encoding? (16)         |
>  +-------------------------------+-------------------------------+
>  |                            Data (*)                         ...
>  +---------------------------------------------------------------+
>  |                           Padding (*)                       ...
>  +---------------------------------------------------------------+
> ```
> 
> With the description:
> 
> ```
> Encoding: A 16-bit identifier which describes the encoding that has been applied to the Data field. This field is optional and is only present if the ENCODED flag is set. A sender MUST NOT apply an encoding that has not first been advertised by the peer in a SETTINGS_ACCEPT_DATA_ENCODING settings frame, or that was advertised with a rank of 0. Endpoints that receive a frame with an encoding identifier they do not support MUST treat this as a connection error of type PROTOCOL_ERROR.
> ```
> 
> And a new flag:
> 
> ```
> ENCODED (0x20): Bit 6 being set indicates that the Encoding field is present and describes the encoding that has been applied to the Data field.
> ```
> 
> This is intentionally designed to make it easy for people to, for example, detect the flag and immediately PROTOCOL_ERROR.  Hopefully it's not too much skin off anyone's nose.
> 
> 
> 
> 6.5.2 Defined SETTINGS Parameters
> 
> New parameter:
> 
> ```
> SETTINGS_ACCEPT_DATA_ENCODING (5): Indicates the sender's ability and willingness to receive DATA frames that are encoded using the scheme identified in the Value.
> 
> The Value field is further divided into two sub-fields: an unsigned 16-bit encoding identifier and an unsigned 16-bit rank.
> 
>  0                   1                   2                   3
>  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
>  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  |         Encoding (16)         |           Rank (16)           |
>  +-------------------------------+-------------------------------+
> 
> The following encodings are defined:
> 
>   ENCODING_COMPRESS (1):
>    The "compress" coding is an adaptive Lempel-Ziv-Welch (LZW) coding that is commonly produced by the UNIX file compression program "compress".
> 
>   ENCODING_DEFLATE (2):
>    The "deflate" coding is a "zlib" data format [RFC1950] containing a "deflate" compressed data stream [RFC1951] that uses a combination of the Lempel-Ziv (LZ77) compression algorithm and Huffman coding.
> 
>   ENCODING_GZIP (3):
>    The "gzip" coding is an LZ77 coding with a 32 bit CRC that is commonly produced by the gzip file compression program [RFC1952].
> 
> An endpoint MAY ignore a SETTINGS_ACCEPT_DATA_ENCODING parameter with an encoding identifier it does not recognise or support.
> 
> The rank fulfils the same role as in the HTTP/1.1 TE header. The rank value is an integer in the range 0 through 65,535, where 1 is the least preferred and 65,535 is the most preferred; a value of 0 means "not acceptable".
> ```
> 
> The three encoding schemes are taken straight from HTTPbis-p1 section 4.2.  Again, it's designed specifically so that people can ignore it if they don't want to play along; but it grants us adventurous folk the chance to play.
> 
> 
> 8.1.3.5 Malformed Messages
> 
> Clarify the content-length:
> 
> ```
> A request or response that includes an entity body can include a content-length header field. A request or response is also malformed if the value of a content-length header field does not equal the sum of the DATA frame payload lengths, after any encoding is removed, that form the body.
> ```
> 
> Cheers
> -- 
>   Matthew Kerwin
>   http://matthew.kerwin.net.au/
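
For readers following the quoted SETTINGS_ACCEPT_DATA_ENCODING text, here is a minimal Python sketch of how a sender might act on it. It assumes the encoding identifier occupies the high-order 16 bits of the 32-bit Value and the rank the low-order 16 bits, which is how the quoted diagram reads in network byte order. The function names are hypothetical, and the tie-breaking rule (first advertised wins) is an assumption the proposal does not state.

```python
# Encoding identifiers as defined in the quoted proposal.
ENCODING_COMPRESS = 1
ENCODING_DEFLATE = 2
ENCODING_GZIP = 3


def split_value(value):
    """Split a 32-bit settings Value into (encoding, rank).

    Assumption: encoding is the high-order 16 bits, rank the low-order 16.
    """
    return (value >> 16) & 0xFFFF, value & 0xFFFF


def pick_encoding(advertised_values, supported):
    """Pick the encoding the peer ranks highest among those we support.

    A rank of 0 means "not acceptable", so such entries are never chosen.
    Returns None when no acceptable encoding is shared, in which case the
    sender would leave the ENCODED flag unset.
    """
    best, best_rank = None, 0
    for value in advertised_values:
        encoding, rank = split_value(value)
        if encoding in supported and rank > best_rank:
            best, best_rank = encoding, rank
    return best


# Peer advertises gzip at rank 10, deflate at rank 20, and rejects compress.
advertised = [
    (ENCODING_GZIP << 16) | 10,
    (ENCODING_DEFLATE << 16) | 20,
    (ENCODING_COMPRESS << 16) | 0,
]
# We only implement compress and gzip, so gzip is the only usable choice.
print(pick_encoding(advertised, {ENCODING_COMPRESS, ENCODING_GZIP}))  # 3
```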

--
Mark Nottingham   http://www.mnot.net/

Received on Friday, 4 April 2014 05:09:30 UTC