- From: Johnny Graettinger <jgraettinger@chromium.org>
- Date: Fri, 7 Mar 2014 06:40:32 -0500
- To: Ilari Liusvaara <ilari.liusvaara@elisanet.fi>
- Cc: "ietf-http-wg@w3.org" <ietf-http-wg@w3.org>
- Message-ID: <CAEn92Tpg3zsQJ3-vkMeiuq=_5B=xu3cU_0SrK3EANwkez8XTUg@mail.gmail.com>
<Resend from proper address>

Hi Ilari,

I can speak to how Chromium has implemented these. I agree that EOS
leads to edge cases.

On Fri, Mar 7, 2014 at 2:59 AM, Ilari Liusvaara
<ilari.liusvaara@elisanet.fi> wrote:

> Following are a few edge cases in HPACK I noticed:
>
> 1) EOS being decoded:
>
> The draft says:
>
> > Given that only between 0-7 bits of the
> > EOS symbol is included in any Huffman-encoded string, and given that
> > the EOS symbol is at least 8 bits long, it is expected that it should
> > never be successfully decoded.
>
> What if it IS decoded?
> - Error?
> - Treat the same as 0 (i.e. break header)?
> - Skip the symbol?

Chromium's current behavior is to treat it as padding. It's parsed, and
skipped.

> If one intermediary/server in chain skips it and another treats it as
> header break, bad things can happen. Or if some
> client/intermediary/server crashes or behaves unpredictably if one is
> received.
>
> Idea: Eliminate EOS. Ensure that table has at least 8-bit all-zeroes
> symbol.

EOS has value iff it may be used for padding. As HEADERS etc now have an
explicit padding mechanism I'm also not sure it's needed.

(Also: To pad the table need only have an 8-bit code. As a canonical
code, all-zeros would be the shortest code in the table).

> 2) Padding not all ones.
>
> The draft says:
>
> > When padding for Huffman encoding, the bits from the EOS (end-of-
> > string) entry in the Huffman table are used, starting with the MSB
> > (most significant bit).
>
> What if this is violated and padding is not all ones (that's the current
> prefix of EOS)?
> - Error?
> - Ignore?

When Chromium's decoder runs out of input, it doesn't require that the
remainder be a prefix for EOS--any prefix in the table is valid. It will
error however if it encounters a prefix which doesn't appear in the
table (ie, is larger than EOS or any other code).
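The decoding behavior described in the two replies above can be sketched roughly as follows. This is not Chromium's code: the four-entry table, the `EOS` marker, `DecodeError`, and `decode` are hypothetical names chosen for illustration, the code is deliberately tiny rather than the HPACK table, and the input is a string of '0'/'1' characters instead of bytes.

```python
# Hypothetical toy prefix code for illustration only; NOT the HPACK table.
EOS = object()
TOY_CODES = {"00": "a", "01": "b", "10": "c", "110": EOS}

# Every proper (shorter) prefix of some code in the table.
PREFIXES = {code[:i] for code in TOY_CODES for i in range(1, len(code))}


class DecodeError(ValueError):
    """Raised when the input contains a bit pattern absent from the table."""


def decode(bits: str) -> str:
    """Decode a string of '0'/'1' characters using TOY_CODES."""
    out = []
    current = ""
    for bit in bits:
        current += bit
        if current in TOY_CODES:
            symbol = TOY_CODES[current]
            # A fully decoded EOS is parsed and skipped, like padding.
            if symbol is not EOS:
                out.append(symbol)
            current = ""
        elif current not in PREFIXES:
            # The bits read so far cannot begin any code in the table.
            raise DecodeError(f"bit pattern {current!r} is not in the table")
    # Out of input: whatever is left is a proper prefix of some code
    # (not necessarily EOS), so it is accepted as padding.
    return "".join(out)
```

With this toy table, `decode("0011")` and `decode("000")` both yield `"a"`: the first is padded with a prefix of EOS, the second with a prefix of an ordinary code, and both are accepted. `decode("111")` raises `DecodeError` because no code starts with `111`.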
> 3) Handling of large headers

I think you're talking about large in the sense of: "A decoder needs to
ensure that larger values or encodings of integers do not permit
exploitation." As opposed to larger than the headers table. In the
former sense, I think "large" is up to the implementation and unknown to
the remote.

> How does endpoint know how large header can be sent?
> - If huffman-compressed, is the limit in compressed or uncompressed size?

This is up to the implementation, but I think it's both, and
additionally a limit on the size of reconstructed headers (ie Cookie),
even if individual HPACK-encoded crumbs are small.

> - Is there difference if the header is to be added to header table or not?

If an implementation bound would be violated, I believe the only proper
handling is to abort with a connection error.

> Note that if there is header too large (but not so large that length
> overflows), the global effects on connection state can be computed
> with already committed resources, but the header can't be decoded.
>
> Also, headers spanning multiple HEADER/CONTINUATION frames might cause
> crashes or unpredictable behaviour in badly coded endpoints/
> intermediarities...
>
> Idea: Disallow headers spanning frames (this would limit header size
> to about 16kB-32kB).

I don't think I understand why representations spanning frames are a
particular source of difficulty. Disallowing representations to span
frames has a downside of forcing a tight coupling between the framing
layer and HPACK encoding context.

> 4) Handling of large header sets
>
> Endpoints can process large header sets in streaming manner, but are
> not required to. How can sender know how large header set can be
> sent?

I think this is another bound up to the implementation. I'm not sure the
sender can, without trying it. If the sender did know how large a header
set was acceptable, or how large a header was acceptable, what would it
do with that information?

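As a rough illustration of the implementation-defined bounds discussed under point 3, the sketch below checks three limits per decoded header: compressed size, uncompressed size, and the running size of a header reassembled from several crumbs. Every name and number here (`HeaderLimits`, `HeaderBudget`, `HpackConnectionError`, the default limits) is an assumption made for the example, not something taken from the draft or from any implementation, and separator bytes added when crumbs are joined are ignored.

```python
from dataclasses import dataclass, field


class HpackConnectionError(Exception):
    """Hypothetical signal that the connection should be aborted."""


@dataclass
class HeaderLimits:
    # Illustrative values only; the message says these are up to the
    # implementation and unknown to the remote.
    max_encoded_len: int = 16 * 1024       # size as sent (e.g. Huffman-coded)
    max_decoded_len: int = 64 * 1024       # size after decompression
    max_reassembled_len: int = 256 * 1024  # total across crumbs of one name


@dataclass
class HeaderBudget:
    limits: HeaderLimits
    reassembled: dict = field(default_factory=dict)

    def check(self, name: str, encoded_len: int, decoded_value: bytes) -> None:
        """Raise HpackConnectionError if any bound would be violated."""
        if encoded_len > self.limits.max_encoded_len:
            raise HpackConnectionError("encoded header too large")
        if len(decoded_value) > self.limits.max_decoded_len:
            raise HpackConnectionError("decoded header too large")
        # Track the combined size of values sharing a name, such as
        # Cookie crumbs that are individually small.
        total = self.reassembled.get(name, 0) + len(decoded_value)
        if total > self.limits.max_reassembled_len:
            raise HpackConnectionError("reassembled header too large")
        self.reassembled[name] = total
```

A decoder following this pattern would call `check` once per decoded header field and treat `HpackConnectionError` as fatal for the connection, since the sender has no way to learn the limits in advance.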
> Intermediaries have worse problem: If there are multiplexed connections,
> any request being sent monopolizes the connection, and thus overly
> large header sets are a DOS vector (the endpoint can just sink the
> header set, no matter how large).
>
> -Ilari

cheers,
-johnny
Received on Friday, 7 March 2014 11:41:00 UTC