
Fwd: HPACK edge cases

From: Johnny Graettinger <jgraettinger@chromium.org>
Date: Fri, 7 Mar 2014 06:40:32 -0500
Message-ID: <CAEn92Tpg3zsQJ3-vkMeiuq=_5B=xu3cU_0SrK3EANwkez8XTUg@mail.gmail.com>
To: Ilari Liusvaara <ilari.liusvaara@elisanet.fi>
Cc: "ietf-http-wg@w3.org" <ietf-http-wg@w3.org>
<Resend from proper address>

Hi Ilari,

I can speak to how Chromium has implemented these. I agree that EOS leads
to edge cases.

On Fri, Mar 7, 2014 at 2:59 AM, Ilari Liusvaara <ilari.liusvaara@elisanet.fi> wrote:

> Following are a few edge cases in HPACK I noticed:
> 1) EOS being decoded:
> The draft says:
> >   Given that only between 0-7 bits of the
> >   EOS symbol is included in any Huffman-encoded string, and given that
> >   the EOS symbol is at least 8 bits long, it is expected that it should
> >   never be successfully decoded.
> What if it IS decoded?
> - Error?
> - Treat the same as 0 (i.e. break header)?
> - Skip the symbol?

Chromium's current behavior is to treat it as padding: it's parsed, and then
ignored rather than emitted.
> If one intermediary/server in the chain skips it and another treats it as
> header break, bad things can happen.

> Or if some client/intermediary/server crashes or behaves unpredictably if
> one is received.
> Idea: Eliminate EOS. Ensure that table has at least 8-bit all-zeroes
> symbol.

EOS has value iff it may be used for padding. As HEADERS etc. now have an
explicit padding mechanism, I'm also not sure it's needed. (Also: to pad, the
table need only have a code of at most 8 bits. As a canonical code, all-zeros
would be the shortest code in the table.)
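To illustrate the parenthetical above, here is a toy sketch (using a made-up
4-symbol alphabet, not the real HPACK table) of canonical code assignment:
symbols are sorted by code length, codes are assigned counting up, and the
first (shortest) code always comes out as all zeros.

```python
def canonical_codes(lengths):
    """Assign canonical Huffman codes from a {symbol: bit-length} map.

    Symbols are processed in (length, symbol) order; the running code is
    left-shifted whenever the length grows, so the shortest code is all
    zeros.
    """
    codes = {}
    code = 0
    prev_len = 0
    for sym, length in sorted(lengths.items(), key=lambda kv: (kv[1], kv[0])):
        code <<= (length - prev_len)  # pad with zeros as lengths grow
        codes[sym] = format(code, '0{}b'.format(length))
        code += 1
        prev_len = length
    return codes

# Hypothetical alphabet with an 8-bit padding symbol.
codes = canonical_codes({'a': 2, 'b': 2, 'c': 3, 'pad': 8})
```

With these lengths, `a` gets the all-zeros code `00`, so zero-bit padding
would decode (or fail to decode) predictably.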

> 2) Padding not all ones.
> The draft says:
> >   When padding for Huffman encoding, the bits from the EOS (end-of-
> >   string) entry in the Huffman table are used, starting with the MSB
> >   (most significant bit).
> What if this is violated and padding is not all ones (that's the current
> prefix of EOS)?
> - Error?
> - Ignore?

When Chromium's decoder runs out of input, it doesn't require that the
remainder be a prefix of EOS; a prefix of any code in the table is valid. It
will error, however, if it encounters a bit sequence which isn't a prefix of
any code in the table (i.e., is larger than EOS or any other code).
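A minimal sketch of that end-of-input behavior, using a toy prefix code
rather than the real HPACK table (the symbol names and table here are
illustrative only): a fully decoded EOS is dropped as padding, trailing bits
are accepted when they are a prefix of some code, and anything else errors.

```python
# Toy code table; 'EOS' stands in for the padding symbol.
CODES = {'00': 'a', '01': 'b', '100': 'c', '1111': 'EOS'}

def decode(bits):
    """Decode a string of '0'/'1' characters against the toy table."""
    out, acc = [], ''
    for b in bits:
        acc += b
        if acc in CODES:
            sym = CODES[acc]
            if sym != 'EOS':  # a decoded EOS is treated as padding
                out.append(sym)
            acc = ''
    # End of input: leftover bits must be a prefix of some code in the
    # table; otherwise the padding is invalid.
    if acc and not any(c.startswith(acc) for c in CODES):
        raise ValueError('invalid padding: %r' % acc)
    return ''.join(out)
```

(A real decoder would also reject an accumulator longer than the longest
code mid-stream; this sketch only checks at end of input.)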

> 3) Handling of large headers

I think you're talking about large in the sense of:
"A decoder needs to ensure that larger values or encodings of integers do
not permit exploitation."

As opposed to larger than the headers table. In the former sense, I think
"large" is up to the implementation and unknown to the remote.

> How does endpoint know how large header can be sent?
> - If huffman-compressed, is the limit in compressed or uncompressed size?

This is up to the implementation, but I think it's both, and additionally a
limit on the size of reconstructed headers (i.e., Cookie), even if individual
HPACK-encoded crumbs are small.
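A sketch of those three bounds, with arbitrary illustrative limit values (not
taken from any spec or implementation): one on the compressed string, one on
the decoded string, and one on a header reconstructed from crumbs.

```python
# Illustrative limits only; real implementations pick their own.
MAX_COMPRESSED = 16 * 1024
MAX_DECODED = 64 * 1024
MAX_RECONSTRUCTED = 256 * 1024

def check_string(compressed_len, decoded_len):
    """Bound both the wire size and the Huffman-decoded size of a string."""
    if compressed_len > MAX_COMPRESSED:
        raise ValueError('compressed string too large')
    if decoded_len > MAX_DECODED:
        raise ValueError('decoded string too large')

def check_reconstructed(crumb_lens):
    """Bound a header reassembled from crumbs (e.g. a crumbled Cookie).

    Each crumb may individually pass check_string, yet the joined value
    ('; '-separated) can still exceed the reconstruction bound.
    """
    total = sum(crumb_lens) + 2 * max(len(crumb_lens) - 1, 0)
    if total > MAX_RECONSTRUCTED:
        raise ValueError('reconstructed header too large')
    return total
```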

> - Is there difference if the header is to be added to header table or not?

If an implementation bound would be violated, I believe the only proper
handling is to abort with a connection error.

> Note that if there is a header too large (but not so large that its length
> overflows), the global effects on connection state can be computed
> with already committed resources, but the header can't be decoded.
> Also, headers spanning multiple HEADERS/CONTINUATION frames might cause
> crashes or unpredictable behaviour in badly coded endpoints/
> intermediaries...
> Idea: Disallow headers spanning frames (this would limit header size
> to about 16kB-32kB).

I don't think I understand why representations spanning frames are a
particular source of difficulty.

Disallowing representations to span frames has the downside of forcing a
tight coupling between the framing layer and the HPACK encoding context.
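A minimal sketch of why spanning isn't hard on the receiving side (the frame
names, flag set, and function here are illustrative, not a real API): the
framing layer just concatenates HEADERS/CONTINUATION payloads and hands the
complete block to HPACK once END_HEADERS is seen, so HPACK never observes
frame boundaries.

```python
def collect_header_block(frames):
    """Accumulate a header block from (type, flags, payload) tuples.

    The complete block would be passed to the HPACK decoder only after
    the END_HEADERS flag; representations may freely span frames.
    """
    buf = bytearray()
    for ftype, flags, payload in frames:
        assert ftype in ('HEADERS', 'CONTINUATION')
        buf.extend(payload)
        if 'END_HEADERS' in flags:
            return bytes(buf)
    raise ValueError('header block not terminated')

block = collect_header_block([
    ('HEADERS', set(), b'\x82\x86'),
    ('CONTINUATION', {'END_HEADERS'}, b'\x84'),
])
```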

> 4) Handling of large header sets
> Endpoints can process large header sets in streaming manner, but are
> not required to. How can sender know how large header set can be
> sent?

I think this is another bound that's up to the implementation. I'm not sure
the sender can know, without trying it.

If the sender did know how large a header set was acceptable, or how large
a header was acceptable, what would it do with that information?

> Intermediaries have a worse problem: if there are multiplexed connections,
> any request being sent monopolizes the connection, and thus overly
> large header sets are a DOS vector (the endpoint can just sink the
> header set, no matter how large).
> -Ilari

Received on Friday, 7 March 2014 11:41:00 UTC
