- From: Adrien de Croy <adrien@qbik.com>
- Date: Tue, 12 Jul 2016 22:38:47 +0000
- To: "Alex Rousskov" <rousskov@measurement-factory.com>, "ietf-http-wg@w3.org" <ietf-http-wg@w3.org>
Looks like this turns out to be a red herring, sorry all.

Improper header wrapping / a bare linefeed in a response header value was pushing our header byte count out by 2 bytes.

------ Original Message ------
From: "Alex Rousskov" <rousskov@measurement-factory.com>
To: "ietf-http-wg@w3.org" <ietf-http-wg@w3.org>
Cc: "Adrien de Croy" <adrien@qbik.com>
Sent: 13/07/2016 4:20:29 a.m.
Subject: Re: what constitutes an "invalid" content-length

>On 07/12/2016 07:31 AM, Adrien de Croy wrote:
>
>> just dealing with a site that sends more payload data than is indicated
>> in the Content-Length header.
>
>From the standards point of view, that is _not_ what you are dealing
>with. You are dealing with a site that sends two responses, the first
>response is proper HTTP. The second response is garbage.
>
>
>> RFC7230 sections 3.3.2 (Content-Length), 3.3.3 (Message body length),
>> and 3.3.4 (Handling incomplete messages) only contemplate issues around
>> Content-Length specifying more bytes than are received, not fewer.
>
>From the standards point of view, it is impossible for the
>Content-Length to specify fewer bytes than the message has. Irrelevant
>for this discussion cases aside, the message end is defined by the
>Content-Length header value. One cannot have more than what was promised
>because one stops assembling the message [body] after the promised
>number of bytes were added. Any "leftovers" are another message or
>garbage, depending on Connection:close, pipelining, and similar
>factors.
>
>
>> I guess one could argue that a wrong C-L value is "invalid", but it's
>> not clear that invalid in this context simply means it doesn't parse, or
>> is otherwise non-compliant with the ABNF.
>
>It is valid from protocol point of view. You know it is "wrong" only
>because you can (or you think you can) distinguish garbage from the end
>of the content.
>
>
>> So, it's not clear what the browser and/or proxy response should be.
>
>There is no single right answer to that. A compliant client (including
>proxies) ought to treat leftovers as post-message garbage or another
>message. A real-world client may identify specific cases where leftovers
>are likely to be the end of the message content and ignore
>Content-Length in those cases. The cases where such behavior would be a
>good idea would vary from agent to agent, from one deployment to
>another.
>
>
>> I would expect it's in everyone's best interest if sites that have
>> broken framing are forced to be fixed. This won't happen if browsers
>> "just work" for the site.
>
>The ever-popular "force sites to be fixed" approach rarely fixes enough
>real-world sites to remove special treatment code. See Patrick's response
>for a good illustration.
>
>
>> Is there a special behaviour we should agree on for such cases?
>
>We could agree to violate the standard in one or two special cases, but
>any formal agreement would probably result in a few more broken sites
>because more folks will tolerate them, decreasing the probability that
>they will be fixed.
>
>I can think of one special case where it is more-or-less safe to ignore
>response Content-Length:
>
>* the HTTP/1 connection is not persistent,
>* no additional outstanding pipelined requests on that connection,
>* the unique Content-Length header field is syntactically valid, and
>* more bytes were read during the last network read than C-L promises.
>
>The combination of these conditions can trigger [optional] "robustness"
>code that reads until connection closure and re-sends leftovers/garbage
>to the next hop (or displays it to the user), opening a message
>smuggling attack vector.
>
>Needless to say, there are benign leftover cases that the above
>conditions do not cover.
>
>
>Cheers,
>
>Alex.
>
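
The framing rule and the four-condition special case quoted above can be sketched roughly as follows. This is only an illustration in Python, not code from any real proxy or browser: the names `split_by_content_length`, `ResponseState` and `may_ignore_content_length` are hypothetical, header values are assumed to be already trimmed of whitespace, and modelling "more bytes were read during the last network read than C-L promises" as a simple count of leftover bytes is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import List, Tuple


def split_by_content_length(received: bytes, content_length: int) -> Tuple[bytes, bytes]:
    """Frame a body strictly by Content-Length.

    The body is exactly `content_length` bytes; anything after it is
    "leftovers" (another message or garbage), never part of this body.
    """
    if content_length < 0:
        raise ValueError("Content-Length must be non-negative")
    return received[:content_length], received[content_length:]


@dataclass
class ResponseState:
    """Hypothetical per-response bookkeeping for the special-case check."""
    persistent: bool                  # will the HTTP/1 connection be reused?
    outstanding_pipelined: int        # other requests still awaiting a response
    content_length_values: List[str]  # raw Content-Length field values seen
    leftover_bytes: int               # bytes received beyond the promised body


def may_ignore_content_length(state: ResponseState) -> bool:
    """Return True only when all four quoted conditions hold."""
    values = state.content_length_values
    # "Unique and syntactically valid": exactly one field, ASCII digits only.
    unique_and_valid = (
        len(values) == 1 and values[0].isascii() and values[0].isdigit()
    )
    return (
        not state.persistent
        and state.outstanding_pipelined == 0
        and unique_and_valid
        and state.leftover_bytes > 0
    )


if __name__ == "__main__":
    body, leftovers = split_by_content_length(b"hello!!", 5)
    print(body, leftovers)
    state = ResponseState(False, 0, ["5"], len(leftovers))
    print(may_ignore_content_length(state))
```

A True result only says the response is eligible for the optional "robustness" path. As the quoted text notes, acting on it by forwarding the leftovers opens a message smuggling attack vector, so a strict client would still stop at the promised byte count.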
Received on Tuesday, 12 July 2016 22:39:29 UTC