- From: Amos Jeffries <squid3@treenet.co.nz>
- Date: Mon, 30 Jan 2012 13:17:32 +1300
- To: ietf-http-wg@w3.org
On 27/01/2012 8:36 a.m., Willy Tarreau wrote:
>
> (...)
>>>> When a server listening only for HTTP request messages, or processing
>>>> what appears from the start-line to be an HTTP request message,
>>>> receives a sequence of octets that does not match the HTTP-message
>>> Wouldn't "does not *exactly* match" be better ? I'm used to find
>>> crappy requests in my logs which are blocked but which some not-so-lazy
>>> implementations would let pass (eg: multiple SP).
>> "match" means "match"; I don't think there's any ambiguity here...
> There's no ambiguity, it's just to emphasize on the need to perform
> strict matching. A large number of HTTP parsers are much too lazy,
> causing nightmares when trying to filter undesired communications,
> or even to define new protocol extensions. For instance on my old
> Apache 1.3 here :
>
> $ telnet www 60080
> Connected to www.
> Escape character is '^]'.
> HEAD / HTTP/1.1 ergeargoaejgoiejgaoeg
> Host: ,,,,
> Invalid/header name: blah
>
> HTTP/1.1 200 OK
> Date: Thu, 26 Jan 2012 19:07:02 GMT
> Server: Apache
> Last-Modified: Mon, 01 Jun 2009 16:47:12 GMT
> ETag: "47038-3ad7-46b4c2d81a400"
> Accept-Ranges: bytes
> Content-Length: 15063
> Connection: close
> Content-Type: text/html
>
> Connection closed by foreign host.
>
> "SP" is *one* SP, still multiple SPs are accepted in the request
> line. Same for forbidden chars in the header name. And I'm not
> specifically targeting Apache here, I just took the first example
> I had handy, it's far from being alone. It looks like strchr(),
> strtok(), sscanf() or split() depending on the language and
> implementation are common ways to parse requests. This is part
> of what caused all the mess in the hybi WG, delaying it by one
> year trying to find solutions against various implementations.

FWIW: we argued this out in Squid a while back. The conclusion was to accept any series of non-wrapping BWS before/after the method and URL. Ignoring the BWS.

All other formats and garbage to be treated as HTTP/0.9 mess and 400 the result if the suspected URL(+garbage) fails to parse as a usable URI in its entirety.

A few vendors have hit it with their SP padding practices so far. But by and large it works.

AYJ
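For illustration only, the policy described above could be sketched roughly as follows. This is not Squid's actual code: the names `parse_request_line` and `usable_uri` are hypothetical, and the "usable URI" test is a simplifying assumption (printable ASCII only, and either an origin-form path, `*`, or an absolute URI with scheme and authority).

```python
import re
from urllib.parse import urlsplit

BWS = r"[ \t]"  # SP / HTAB only, i.e. no line folding ("non-wrapping")
TOKEN = r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+"  # HTTP token characters for the method

# Method and URL followed by an HTTP-version; BWS padding is tolerated and ignored.
VERSIONED = re.compile(
    rf"^{BWS}*({TOKEN}){BWS}+(\S+){BWS}+(HTTP/\d\.\d){BWS}*$"
)
# Anything else: a method, then "URL + garbage" taken as one candidate URI.
FALLBACK = re.compile(rf"^{BWS}*({TOKEN}){BWS}+(.*?){BWS}*$")


def usable_uri(candidate: str) -> bool:
    """Assumed check: printable ASCII with no whitespace, and a plausible shape."""
    if not candidate or any(not 0x21 <= ord(c) <= 0x7E for c in candidate):
        return False
    if candidate == "*" or candidate.startswith("/"):
        return True
    try:
        parts = urlsplit(candidate)
    except ValueError:
        return False
    return bool(parts.scheme and parts.netloc)


def parse_request_line(line: str):
    """Return (method, url, version), or None where the caller would send 400."""
    line = line.rstrip("\r\n")
    m = VERSIONED.match(line)
    if m:
        method, url, version = m.groups()
        return (method, url, version) if usable_uri(url) else None
    m = FALLBACK.match(line)
    if m:
        # Treated as an HTTP/0.9-style request: the whole remainder must be a URI.
        method, url = m.groups()
        return (method, url, "HTTP/0.9") if usable_uri(url) else None
    return None
```

Under these assumptions, a padded line such as `GET  /index.html \t HTTP/1.1` is accepted with the padding dropped, while the `HEAD / HTTP/1.1 ergeargoaejgoiejgaoeg` line from the telnet session falls through to the HTTP/0.9 branch and is rejected, because `/ HTTP/1.1 ergeargoaejgoiejgaoeg` is not a usable URI in its entirety.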
Received on Monday, 30 January 2012 00:18:06 UTC