Re: Sections 3.3.2 and 3.3.3 allow bogus Content-Length?

> On 15 Feb 2017, at 07:23, Poul-Henning Kamp <phk@phk.freebsd.dk> wrote:
> 
> My 0.02 DKK:
> 
> In H1 Content-Length is part of the "framing layer", and therefore
> we should not be tolerant of senders transmitting bogus values,
> just like we don't allow H2 senders the option of bogus frame length.

My 0.02 GBP is in agreement with PHK here. Content-Length is not only metadata, it is also data about the framing. A Content-Length header defines where the request body ends (apart from the few specific exceptions defined in the RFCs, which are well-known and understood). Any appeal to the platonic ideal of what a resource should look like is nonsensical.

> This is where skipping this discussion during the H2 phase have
> created a new class of problems of us:  In H2 the body has a true
> length, as delinieated by the framing, in H1 that is only the
> case with chunked.

So, as a data point, the Python HTTP/2 implementations all strictly enforce content-length. This is an unappreciated benefit of H2. In HTTP/1.1 a message that does not match the content length is moderately tricky to detect: messages that are too short are indistinguishable from timeouts or other network weather, and messages that are too long require that the recipient deploy heuristics to check whether the next bytes are actually a new response or appear to be part of the previous representation. Most Python implementations don’t bother trying to detect that because it’s not worth spending the CPU time, so in practice messages that are too long are just truncated and a parsing error occurs if the connection is re-used.
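To illustrate why this is awkward in HTTP/1.1, here is a minimal sketch of a Content-Length body read. It is not taken from any real library: read_h1_body and IncompleteBody are made-up names, and a BytesIO stands in for the socket.

```python
import io


class IncompleteBody(Exception):
    """Hypothetical error: the stream ended before Content-Length bytes arrived."""


def read_h1_body(stream, content_length):
    # Read exactly the number of bytes the sender declared.
    body = stream.read(content_length)
    if len(body) < content_length:
        # Too short. On a real socket this looks the same as a timeout or a
        # dropped connection, so the reader cannot tell who is at fault.
        raise IncompleteBody(
            "expected %d bytes, got %d" % (content_length, len(body))
        )
    return body


# Too long: the sender declared 5 bytes but wrote 12. The read itself succeeds
# and the surplus stays in the stream, where it will be mis-parsed as the
# start of the next response if the connection is re-used.
conn = io.BytesIO(b"hello, world")
body = read_h1_body(conn, 5)
leftover = conn.read()
```

Note that nothing in the too-long case raises: the only sign of trouble is that the leftover bytes do not look like a status line.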

In H2, however, there is a very clear way to police content-length: count the data bytes as they come in until END_STREAM. If they don’t match the content-length header that was provided, that’s an error: GOAWAY(PROTOCOL_ERROR). There is no excuse for H2 implementations to get request/response sizes wrong, especially as PHK points out that implementations that are not certain of body size can simply strip the C-L header.
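As a rough sketch of that counting check (again with made-up names, not the code of any particular Python H2 implementation), assuming the caller passes in each DATA frame's payload and whether END_STREAM was set on it:

```python
class ContentLengthMismatch(Exception):
    """Hypothetical error that a caller would map to PROTOCOL_ERROR."""


class StreamLengthChecker:
    """Counts DATA bytes on one stream and compares them to content-length."""

    def __init__(self, declared_length=None):
        # None means no content-length header was sent (or it was stripped),
        # in which case there is nothing to police.
        self.declared_length = declared_length
        self.received = 0

    def on_data(self, payload, end_stream=False):
        self.received += len(payload)
        if self.declared_length is None:
            return
        # More bytes than declared is already an error, even before END_STREAM.
        if self.received > self.declared_length:
            raise ContentLengthMismatch(
                "received %d bytes, declared %d"
                % (self.received, self.declared_length)
            )
        # At END_STREAM the counts must match exactly.
        if end_stream and self.received != self.declared_length:
            raise ContentLengthMismatch(
                "stream ended after %d bytes, declared %d"
                % (self.received, self.declared_length)
            )


# Usage: a 10-byte body delivered in two DATA frames.
checker = StreamLengthChecker(declared_length=10)
checker.on_data(b"01234")
checker.on_data(b"56789", end_stream=True)
```

Unlike the H1 case, both the too-short and the too-long body are detected deterministically, because the framing says exactly where the stream ends.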

So I don’t really consider this a new class of problems, so much as H2 giving us the solution to this problem. In my view, H2 implementations should either treat the content-length as an absolute right-or-wrong, or just ignore it completely. Either works, and implementations may decide which they prefer. I suspect most will go for the former (except browser UAs, who are generally speaking inclined to tolerate most misbehaviour if possible).

Received on Wednesday, 15 February 2017 10:38:04 UTC