Re: #445: Transfer-codings

On 10/04/2014 5:53 a.m., K.Morgan wrote:
> See my comments in-line below...
> 
> On Friday,04 April 2014 04:39, Matthew Kerwin wrote:
>>
>> Encoding: A 16 bit identifier which describes the encoding ...
> 
> I agree with what others have said - 16 bits is overkill.  I suggest one byte.
> 
>>
>> SETTINGS_ACCEPT_DATA_ENCODING (5): ...
>>
>> The Value field is further divided into two sub-fields, an unsigned 16 bit encoding
>> identifier and an unsigned 16-bit rank.
>>
> 
> 
>> On Friday,04 April 2014 06:34, Martin Thomson wrote (emphasis added):
> 
>>>  Settings have a single value. ... you will need to explain how values are processed,
> 
>>>  and how an implementation is able to limit the storage it dedicates to storage of this new setting.
> 
> I still think you didn't catch Martin's point. Theoretically a client has to store 64K x 4B of SETTINGS_ACCEPT_DATA_ENCODING values.


Where does that idea about 64K come from?

* The sender only has to remember as many as it wants to use.

* The receiver only has to remember the *overlap* between its own set
and the one provided by the sender as usable.

* Following the SETTINGS exchange both ends should have a small
negotiated set of encodings, no larger than what each wanted to start with.

* In the realm of TE these settings should only ever need negotiating
once on a connection.

Using an encoding outside the set negotiated and known at *both* ends
should be a PROTOCOL_ERROR.
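
To make the bookkeeping above concrete, here is a rough sketch of what
each end would need to hold. It is only an illustration: the function
names, the ProtocolError class and the identifier values are all made
up here and come from neither the draft nor any implementation.

```python
class ProtocolError(Exception):
    """Hypothetical stand-in for signalling PROTOCOL_ERROR."""


def negotiate(own_supported, peer_advertised):
    """Return the overlap of both sets, kept in the peer's advertised order."""
    own = set(own_supported)
    return [enc for enc in peer_advertised if enc in own]


def check_stream_encoding(encoding, negotiated):
    """Reject any encoding that was not agreed on by both ends."""
    if encoding not in negotiated:
        raise ProtocolError(f"encoding {encoding} was not negotiated")
    return encoding


# Identifier values are invented for illustration only.
negotiated = negotiate(own_supported=[1, 2, 3], peer_advertised=[3, 5, 1])
```

The stored state is bounded by the smaller of the two lists, never by
the size of the identifier space.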


> 
> I suggest a simpler approach.  Only allow a single value sub-divided into 4 bytes to announce support for up to 4 distinct encodings.  A new value received for this setting replaces the current setting.  This would allow the setting to remain a single 32-bit value. I also suggest you ditch the rank. The endpoint generating the compression should be allowed to decide the best compression scheme, if any, given the context.
> 

This either prohibits endpoints from using a mix of >4 different
encodings chosen to match different data types or security requirements
on a per-stream basis, or forces constant re-negotiation of which
encodings can be or are being used by the active streams.

Also, in a world where >4 encodings already exist this raises the chance
of no overlap being agreed on for some RTT, causing worst-case transfer
to happen unnecessarily.

Without the ability to select a few encodings from an arbitrarily large
set and signal just one per stream, many of the supporting cases for TE
remain broken.


> In other words, I would suggest something like...
> 
> <<<<<<<
>  The Value field is further divided into four sub-fields, each representing
>  an unsigned 8 bit encoding identifier.
> 
>   0                   1                   2                   3
>   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
>   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>   | Encoding1 (8) | Encoding2 (8) | Encoding3 (8) | Encoding4 (8) |
>   +---------------+---------------+---------------+---------------+
> 
> An endpoint may advertise support for up to four encodings at any
> given time. Sending a new value for SETTINGS_ACCEPT_DATA_ENCODING
> replaces the previous value. An encoding value of 0 means "unused".
>>>>>>>>
> 
> (If you really wanted to keep the concept of rank, you could use the ordering of the four bytes as the rank.)
> 
> If an endpoint decides mid-connection they don't want to support compression any more for whatever reason (e.g. under heavy CPU load), it simply has to send a NULL value for this setting.
> 


All of the above still stands. That said, I think the proposal for
8-bit identifiers ranked by order is solid.
Also, the 4-byte field on either HEADERS or DATA frames to indicate the
coding for that message stream makes sense, as several TEs can be
layered in HTTP/1 semantics. Retaining that is cheap enough in this
format, while also limiting the amount of encapsulation to 4 wrappers
if anyone should seek to abuse it.
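
For illustration, the four-byte layout with rank-by-order could be
packed and unpacked roughly as below. This is a sketch under my own
assumptions: Encoding1 sits in the most significant byte (as in the
diagram), 0 means "unused", and the function names and identifier
values are hypothetical.

```python
def pack_encodings(encodings):
    """Pack up to four 8-bit encoding ids, highest rank first, into 32 bits."""
    if len(encodings) > 4:
        raise ValueError("at most four encodings fit in one value")
    for enc in encodings:
        if not 1 <= enc <= 255:
            raise ValueError("encoding ids must be 1..255 (0 means unused)")
    # Pad with 0 ("unused") so the value always covers four slots.
    padded = list(encodings) + [0] * (4 - len(encodings))
    value = 0
    for enc in padded:
        value = (value << 8) | enc
    return value


def unpack_encodings(value):
    """Return the non-zero encoding ids in rank order."""
    ids = [(value >> shift) & 0xFF for shift in (24, 16, 8, 0)]
    return [enc for enc in ids if enc != 0]
```

An all-zero value then naturally reads as "no encodings supported",
which covers the mid-connection opt-out case.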

Amos

Received on Thursday, 10 April 2014 04:26:49 UTC