Re: what constitutes an "invalid" content-length

Looks like this turned out to be a red herring, sorry all.

Improper header wrapping / a bare linefeed in a response header value was
pushing our header byte count out by 2 bytes.
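
For anyone who hits the same symptom, here is a minimal sketch of why a header byte count that is out by 2 makes a correctly framed response look like it carries more payload than Content-Length. The response bytes, the function name, and the direction of the miscount (2 bytes short) are all assumptions for illustration, not our actual code or traffic.

```python
# Hypothetical response: the X-Note value is wrapped with a bare LF,
# but the Content-Length (5) matches the body exactly.
RAW = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Length: 5\r\n"
    b"X-Note: wrapped\n value\r\n"
    b"\r\n"
    b"hello"
)


def apparent_body_len(raw: bytes, header_len: int) -> int:
    """Body size as seen by a receiver that believes the header block
    (including the terminating blank line) is header_len bytes long."""
    return len(raw) - header_len


# True size of the header block, up to and including the CRLF CRLF.
true_header_len = RAW.index(b"\r\n\r\n") + 4

# Assumption: the miscount came out 2 bytes short.
miscounted_header_len = true_header_len - 2

print(apparent_body_len(RAW, true_header_len))        # matches Content-Length
print(apparent_body_len(RAW, miscounted_header_len))  # looks like 2 extra bytes
```

With the correct count the body is exactly as long as promised; with the short count the same bytes look like an over-long payload.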



------ Original Message ------
From: "Alex Rousskov" <rousskov@measurement-factory.com>
To: "ietf-http-wg@w3.org" <ietf-http-wg@w3.org>
Cc: "Adrien de Croy" <adrien@qbik.com>
Sent: 13/07/2016 4:20:29 a.m.
Subject: Re: what constitutes an "invalid" content-length

>On 07/12/2016 07:31 AM, Adrien de Croy wrote:
>
>>  just dealing with a site that sends more payload data than is indicated
>>  in the Content-Length header.
>
>From the standards point of view, that is _not_ what you are dealing
>with. You are dealing with a site that sends two responses, the first
>response is proper HTTP. The second response is garbage.
>
>
>>  RFC7230 sections 3.3.2 (Content-Length), 3.3.3 (Message body length),
>>  and 3.3.4 (Handling incomplete messages) only contemplate issues around
>>  Content-Length specifying more bytes than are received, not fewer.
>
>From the standards point of view, it is impossible for the
>Content-Length to specify fewer bytes than the message has. Cases
>irrelevant to this discussion aside, the message end is defined by the
>Content-Length header value. One cannot have more than what was promised
>because one stops assembling the message [body] after the promised
>number of bytes has been added. Any "leftovers" are another message or
>garbage, depending on Connection:close, pipelining, and similar factors.
>
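A minimal sketch of the framing rule described above, assuming a response that has been fully buffered and is framed by Content-Length only (no Transfer-Encoding, no other special cases). The function name and the sample bytes are hypothetical; the point is only that the body stops at the promised byte count and everything after it is handed back separately.

```python
def split_at_content_length(buffered: bytes):
    """Return (body, leftovers) for a buffered HTTP/1 response.

    The body is exactly Content-Length bytes long; anything after it is
    "leftovers" (another message or garbage), never part of the body.
    """
    head, sep, rest = buffered.partition(b"\r\n\r\n")
    if not sep:
        raise ValueError("header block is not complete")

    content_length = None
    for line in head.split(b"\r\n")[1:]:  # skip the status line
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            value = value.strip()
            if not value.isdigit():
                raise ValueError("Content-Length does not parse")
            content_length = int(value.decode("ascii"))

    if content_length is None:
        raise ValueError("sketch only handles Content-Length framing")
    if len(rest) < content_length:
        raise ValueError("incomplete message body")

    return rest[:content_length], rest[content_length:]


body, leftovers = split_at_content_length(
    b"HTTP/1.1 200 OK\r\nContent-Length: 5\r\n\r\nhelloXY"
)
print(body, leftovers)
```

Here the two trailing bytes are not "extra payload": the message ended after `hello`, and `XY` is whatever comes next on the connection.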
>
>>  I guess one could argue that a wrong C-L value is "invalid", but it's
>>  not clear that invalid in this context simply means it doesn't parse, or
>>  is otherwise non-compliant with the ABNF.
>
>It is valid from the protocol point of view. You know it is "wrong" only
>because you can (or you think you can) distinguish garbage from the end
>of the content.
>
>
>>  So, it's not clear what the browser and/or proxy response should be.
>
>There is no single right answer to that. A compliant client (including
>proxies) ought to treat leftovers as post-message garbage or another
>message. A real-world client may identify specific cases where leftovers
>are likely to be the end of the message content and ignore
>Content-Length in those cases. The cases where such behavior would be a
>good idea would vary from agent to agent, from one deployment to another.
>
>
>>  I would expect it's in everyone's best interest if sites that have
>>  broken framing are forced to be fixed.  This won't happen if browsers
>>  "just work" for the site.
>
>The ever-popular "force sites to be fixed" approach rarely fixes enough
>real-world sites to remove special treatment code. See Patrick's response
>for a good illustration.
>
>
>>  Is there a special behaviour we should agree on for such cases?
>
>We could agree to violate the standard in one or two special cases, but
>any formal agreement would probably result in a few more broken sites
>because more folks will tolerate them, decreasing the probability that
>they will be fixed.
>
>I can think of one special case where it is more-or-less safe to ignore
>response Content-Length:
>
>* the HTTP/1 connection is not persistent,
>* no additional outstanding pipelined requests on that connection,
>* the unique Content-Length header field is syntactically valid, and
>* more bytes were read during the last network read than C-L promises.
>
>The combination of these conditions can trigger [optional] "robustness"
>code that reads until connection closure and re-sends leftovers/garbage
>to the next hop (or displays it to the user), opening a message
>smuggling attack vector.
>
>Needless to say, there are benign leftover cases that the above
>conditions do not cover.
>
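The four conditions listed above can be sketched as a single predicate. This is only an illustration: the `ResponseState` fields and the function name are hypothetical, and the last condition is read here as "the body bytes received so far exceed what Content-Length promised", which is one interpretation of the wording.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ResponseState:
    """Hypothetical per-response bookkeeping kept by a client or proxy."""
    persistent: bool                      # is the HTTP/1 connection persistent?
    outstanding_pipelined_requests: int   # other requests still awaiting a response
    content_length_values: List[str]      # raw Content-Length field values seen
    body_bytes_received: int              # body bytes read so far, last read included


def may_ignore_content_length(state: ResponseState) -> bool:
    """True only when all four conditions from the list hold."""
    if state.persistent:
        return False
    if state.outstanding_pipelined_requests > 0:
        return False
    if len(state.content_length_values) != 1:
        return False  # the header field must be unique
    value = state.content_length_values[0].strip()
    if not (value.isascii() and value.isdigit()):
        return False  # must be syntactically valid (digits only)
    return state.body_bytes_received > int(value)


state = ResponseState(
    persistent=False,
    outstanding_pipelined_requests=0,
    content_length_values=["5"],
    body_bytes_received=7,
)
print(may_ignore_content_length(state))
```

Even when the predicate holds, reading to connection close and forwarding the leftovers is the optional "robustness" path that opens the smuggling vector mentioned above, so a real agent would want this behind an explicit opt-in.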
>
>Cheers,
>
>Alex.
>

Received on Tuesday, 12 July 2016 22:39:29 UTC