- From: Amos Jeffries <squid3@treenet.co.nz>
- Date: Sat, 23 Feb 2013 23:50:40 +1300
- To: ietf-http-wg@w3.org
On 23/02/2013 7:49 a.m., Nicolas Mailhot wrote:
> Amos Jeffries <squid3@...> writes:
>
>> Client, middleware, and routing infrastructure do not need to care about
>> the path+query portion for their operations other than as an opaque
>> blob.
> Unfortunately not true. We had cases where misbehaving users (that *knew* they
> were misbehaving) changed dynamically the name of the accessed host, and the
> only way to stop the damage was a path match (which fortunately was
> discriminating).

Please explain in more detail. How did they dynamically change the
accessed host? And why did your HTTP middleware allow the change?

If you are talking about fiddling the Host: header versus absolute-URL
versus TCP destination, and similar vulnerabilities in the middleware,
there is no excuse for it being vulnerable. I point you at Squid-3.2 and
the way we prohibit Host and TCP address differing - to the point of
marking players like Google and Akamai regularly as "forgers".

> And a lot of botnet attacks can be identified by the access to a special path,
> which is the same on all infected servers users access to.

You seem to be misunderstanding the meaning of "opaque". It has nothing
to do with obscuring anything. Botnet requests with a consistent ETag
prefix, for example, would be equally detectable and preventable using
the same method: a pattern match against the relevant field.

I posit that what you are doing there is that _you_ (the human) are
reading the blob following a URL hostname, _you_ are understanding it,
and writing a tool that detects a pattern in that field-value. The tool
itself only needs to determine if the field as a whole matches the
pattern you gave it.

If those same botnets were sending urn: with hostname and a path segment
you would just as easily identify the pattern and have tools matching it
- even though the "path" segment of URN is an opaque blob everywhere
except the origin.

There are cases where middleware does need to manipulate the path.
But these are also the cases where you would be parsing it completely
anyway, to gain full understanding of all the pieces inside it right
down to the byte level. That would always be done with a parser which
re-assembled the pieces and assigned specific meaning to each byte -
including the query portion.

> In all those cases the query portion is just garbage to be ignored, the path –
> not.

This tells me you have not encountered (or noticed) the spam attacks
involving query-injection last decade. Lucky you.

/history/
Now-dead versions of Outlook used to make magic hyperlinks out of any
http:// text they detected in plain text, hiding all the text they
decided was URL and showing only the domain name. The attacker would
carefully craft login credentials containing encoded @, / and ? in ways
which Outlook would mistake as delimiters but the browser would decode
before parsing the URL. The user, and anyone not clued up enough to
notice, would see a link to a victim website, example.com (or in some
cases localhost!), which when clicked would go completely sideways to a
phishing or virus-infected URL on a host somewhere else.

> 'Do not need to care' is another word for 'no creative users'

No. 'Do not need to care' is another word for 'I already have a better
way to detect those creative users'.

In particular, as has been mentioned already, splitting those two fields
will simply give those creative users another tool to play with while
making the middleware do more work to prevent them using it, um,
creatively.

Amos
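
The "opaque blob" argument in this message can be illustrated with a
minimal sketch. It assumes the operator has already spotted a pattern by
eye; the function name and both signatures are hypothetical, not taken
from Squid or from any real botnet. The tool applies the same
whole-field match to a path+query blob and to an ETag value without
assigning meaning to any piece of either:

```python
import re


def field_matches(field_value: str, pattern: str) -> bool:
    """Return True if the field value, taken as a whole, matches the pattern.

    The field is treated as an opaque string: it is never split into
    path segments, query parameters, or any other structure.
    """
    return re.fullmatch(pattern, field_value) is not None


# Hypothetical signatures written by a human who read the traffic.
BOTNET_PATH_QUERY = r"/cgi-bin/x9k\.php\?.*"
BOTNET_ETAG = r'"bot-[0-9a-f]+"'

# The same function serves both fields; only the pattern differs.
print(field_matches("/cgi-bin/x9k.php?id=1", BOTNET_PATH_QUERY))
print(field_matches("/index.html?q=x9k", BOTNET_PATH_QUERY))
print(field_matches('"bot-1a2f"', BOTNET_ETAG))
```

Nothing in the matcher depends on whether path and query arrive as one
field or two; the understanding lives in the pattern, not in the tool.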
Received on Saturday, 23 February 2013 10:51:20 UTC