Re: HPACK edge cases

On Fri, Mar 7, 2014 at 8:40 PM, Johnny Graettinger <
jgraettinger@chromium.org> wrote:

> <Resend from proper address>
>
> Hi Ilari,
>
> I can speak to how Chromium has implemented these. I agree that EOS leads
> to edge cases.
>
>
> On Fri, Mar 7, 2014 at 2:59 AM, Ilari Liusvaara <
> ilari.liusvaara@elisanet.fi> wrote:
>
>> Following are a few edge cases in HPACK I noticed:
>>
>> 1) EOS being decoded:
>>
>> The draft says:
>>
>> >   Given that only between 0-7 bits of the
>> >   EOS symbol is included in any Huffman-encoded string, and given that
>> >   the EOS symbol is at least 8 bits long, it is expected that it should
>> >   never be successfully decoded.
>>
>> What if it IS decoded?
>> - Error?
>> - Treat the same as 0 (i.e. break header)?
>> - Skip the symbol?
>>
>
> Chromium's current behavior is to treat it as padding: it's parsed and
> skipped.
>
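For concreteness, the "parse it, skip it" policy described above can be sketched like this (a minimal illustrative sketch; the names are assumptions, not Chromium's actual code, and `symbols` stands in for the output of a real bit-level Huffman decoder):

```python
# Illustrative only: HPACK assigns code points 0-255 to octets and 256 to EOS.
EOS = 256


def decode_skipping_eos(symbols):
    """Apply the 'treat EOS as padding' policy to a decoded symbol stream."""
    out = bytearray()
    for sym in symbols:
        if sym == EOS:
            continue  # a decoded EOS is skipped; decoding continues
        out.append(sym)
    return bytes(out)
```

The alternative policies from the list (error out, or break the header) would replace the `continue` branch.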
>
>> If one intermediary/server in the chain skips it and another treats it
>> as a header break, bad things can happen.
>>
>> Or if some client/intermediary/server crashes or behaves unpredictably
>> if one is received.
>>
>>
>> Idea: Eliminate EOS. Ensure that table has at least 8-bit all-zeroes
>> symbol.
>>
>
> EOS has value iff it may be used for padding. As HEADERS etc. now have an
> explicit padding mechanism, I'm also not sure it's needed. (Also: to pad,
> the table need only have an 8-bit code. As a canonical code, all-zeros
> would be the shortest code in the table.)
>
>
We discussed this on the mailing list a while ago:
http://lists.w3.org/Archives/Public/ietf-http-wg/2013OctDec/1866.html

In short, strictly more than 7 bits of 1s encountered while decoding must
be treated as an error.
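That rule can be sketched as a simple end-of-string check (an illustrative sketch only; the function and parameter names are assumptions, not from any real implementation):

```python
def check_padding(leftover_bits, bit_count):
    """Validate end-of-string Huffman padding per the rule above.

    leftover_bits: the bits remaining after the last fully decoded
    symbol, right-aligned in an int; bit_count: how many there are.
    """
    if bit_count > 7:
        # Strictly more than 7 bits of padding must be treated as an error.
        raise ValueError("Huffman padding longer than 7 bits")
    if leftover_bits != (1 << bit_count) - 1:
        # Valid padding is the MSB prefix of EOS, i.e. all ones.
        raise ValueError("Huffman padding is not a prefix of EOS")
```

An empty remainder (zero padding bits) passes both checks, as it should.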

Best regards,
Tatsuhiro Tsujikawa

>
> 2) Padding not all ones.
>>
>> The draft says:
>>
>> >   When padding for Huffman encoding, the bits from the EOS (end-of-
>> >   string) entry in the Huffman table are used, starting with the MSB
>> >   (most significant bit).
>>
>> What if this is violated and padding is not all ones (that's the current
>> prefix of EOS)?
>> - Error?
>> - Ignore?
>>
>
> When Chromium's decoder runs out of input, it doesn't require that the
> remainder be a prefix of EOS--any prefix in the table is valid. It will
> error, however, if it encounters a prefix which doesn't appear in the
> table (i.e., is larger than EOS or any other code).
>
>
> 3) Handling of large headers
>>
>
> I think you're talking about large in the sense of:
> "A decoder needs to ensure that larger values or encodings of integers do
> not permit exploitation."
>
> As opposed to larger than the headers table. In the former sense, I think
> "large" is up to the implementation and unknown to the remote.
>
>
>> How does an endpoint know how large a header can be sent?
>> - If Huffman-compressed, is the limit on the compressed or uncompressed
>> size?
>>
>
> This is up to the implementation, but I think it's both, and additionally
> a limit on the size of reconstructed headers (i.e., Cookie), even if
> individual HPACK-encoded crumbs are small.
>
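One way such bounds could be enforced while decoding is sketched below (the limits and names are illustrative assumptions, not from Chromium or the spec; `decode_symbols` stands in for a real bit-level Huffman decoder yielding octet values):

```python
# Illustrative limits; real implementations choose their own bounds.
MAX_COMPRESSED = 16 * 1024
MAX_DECODED = 32 * 1024


def bounded_huffman_decode(data, decode_symbols):
    """Decode a Huffman-coded string while enforcing both a
    compressed-size and an uncompressed-size bound."""
    if len(data) > MAX_COMPRESSED:
        raise ConnectionError("compressed header field too large")
    out = bytearray()
    for sym in decode_symbols(data):
        out.append(sym)
        if len(out) > MAX_DECODED:
            # Abort early rather than buffer an unbounded decoded string.
            raise ConnectionError("decoded header field too large")
    return bytes(out)
```

Raising at the connection level matches the "abort with a connection error" handling mentioned below.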
>
>> - Is there a difference if the header is to be added to the header
>> table or not?
>>
>
> If an implementation bound would be violated, I believe the only proper
> handling is to abort with a connection error.
>
>
>
>> Note that if a header is too large (but not so large that its length
>> overflows), the global effects on connection state can be computed
>> with already-committed resources, but the header can't be decoded.
>>
>> Also, headers spanning multiple HEADERS/CONTINUATION frames might cause
>> crashes or unpredictable behaviour in badly coded
>> endpoints/intermediaries...
>>
>> Idea: Disallow headers spanning frames (this would limit header size
>> to about 16kB-32kB).
>>
>
> I don't think I understand why representations spanning frames are a
> particular source of difficulty.
>
> Disallowing representations to span frames has a downside of forcing a
> tight coupling between the framing layer and HPACK encoding context.
>
>
> 4) Handling of large header sets
>>
>> Endpoints can process large header sets in a streaming manner, but are
>> not required to. How can the sender know how large a header set can be
>> sent?
>>
>
> I think this is another bound up to the implementation. I'm not sure the
> sender can, without trying it.
>
> If the sender did know how large a header set was acceptable, or how large
> a header was acceptable, what would it do with that information?
>
>> Intermediaries have a worse problem: if there are multiplexed
>> connections, any request being sent monopolizes the connection, and thus
>> overly large header sets are a DoS vector (the endpoint can just sink
>> the header set, no matter how large).
>>
>>
>> -Ilari
>>
>
> cheers,
> -johnny
>
>

Received on Friday, 7 March 2014 12:12:43 UTC