- From: Hallvord R. M. Steen <hallvord@opera.com>
- Date: Fri, 12 Oct 2012 11:12:07 +0000
- To: Julian Aubourg <j@ubourg.net>
- Cc: public-webapps@w3.org
[Editorial note: I respond to two E-mails by Julian at once, and I have done some re-ordering/interleaving to keep discussion of related points together. This editing was in no way meant to reduce the coherence of Julian's arguments - I hope it doesn't. The discussion of rationale is "hoisted" to the start of this E-mail, threat and abuse risk later.]

>>> If it's a mistake on the backend's side (they filter out while
>>> they didn't intend to)
>>> just contact the backend's maintainer and have them fix this server-side
>>> problem... well... server-side.

To add to what Mike said - I've been doing what you propose as part of my job for about ten years now. You might think that sniffing is done on purpose to serve different browsers suitable content, and you might think broken sniffing on the backend is mainly a problem with un-maintained legacy sites. In my experience, I'd say both assumptions are mistaken.

Consider the household name that is one of the web's major E-commerce destinations. I've seen it remove random chunks of required markup because of obscure backend browser sniffing. I've seen it remove *the contents of JavaScript variables* inside inline SCRIPT tags - instead of <script>var obj={ /*several lines of methods and properties*/};</script> it sent <script>var obj={};</script>! I've seen it drop random stylesheets and jumble the layout.

The best part? For nearly 10 years, said site has been promising to fix the problems! They always plan to get it done in the next quarter or something. And this is clearly *not* an unmaintained backend - it's a site that must be spending hundreds of thousands of dollars on backend maintenance and development per year. It seems there just isn't any single developer with sufficient overview of, and responsibility for, their browser detection policies to actually do something about it.

Given such experiences, I don't consider it unlikely that a site would intend to enable CORS, make the required backend changes to share some of its content, but fail entirely to "go that extra mile" and fix the sniffing. In fact, I think it is *more* likely that CORS will be bolted on without any browser detection-related changes or testing.

And you expect that when you contact a site saying "Hello, I made a neat CORS news aggregator, it's used by 10 000 people, but the 250 Opera users among them have trouble because your site detects 'Opera' and gzips content twice, making my app show them binary garbage on screen" - they would care? You really think so? Great, we're hiring browser evangelists (and I'm pretty sure Mozilla is too), so we need crazy optimists to apply :-)

> The problem is that the same reasoning can be made regarding CORS.

This is not in fact a problem ;-). Both general CORS security policy and this question are judgement calls. It's not a problem if we resolve them differently - we just need to weigh the pros and cons to see which side of the argument is stronger.

> If we had a mechanism to do the same thing for the fact of modifying the
> UserAgent header, I wouldn't even discuss the issue.

For CORS usage we have that already: the Access-Control-Allow-Headers response header. So, as Anne pointed out, this is already opt-in for CORS. Per your comment above we should hence focus our discussion on potential threat scenarios for same-domain usage (or agree to allow setting User-Agent for CORS but not local requests?)
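For completeness, here is a rough sketch of what that opt-in would look like on the wire, assuming User-Agent is taken off the XHR blacklist. The origin, URL and header values below are made up for illustration:

  // Hypothetical cross-origin request that sets User-Agent (only possible
  // once the header is no longer blacklisted):
  var xhr = new XMLHttpRequest();
  xhr.open("GET", "https://api.example.org/feed", true);
  xhr.setRequestHeader("User-Agent", "MyAggregator/1.0");
  xhr.onload = function () { /* use xhr.responseText */ };
  xhr.send();

  // Since User-Agent is not a "simple" header, the browser first sends a
  // preflight:
  //   OPTIONS /feed HTTP/1.1
  //   Origin: https://aggregator.example.com
  //   Access-Control-Request-Method: GET
  //   Access-Control-Request-Headers: user-agent
  //
  // ...and only makes the actual request if the server opts in with:
  //   Access-Control-Allow-Origin: https://aggregator.example.com
  //   Access-Control-Allow-Headers: User-Agent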
Moving on to discussion of proposed threat scenarios, first DOS:

Glenn Maynard <glenn@zewt.org> wrote:

>> Are you really saying that backend developers want to use User-Agent to
>> limit the number of requests accepted from Firefox?
>> (Not one user's Firefox, but all Firefox users, at least of a
>> particular version, combined.)

To which Julian responded:

> A more likely scenario

So we are in agreement that the "use the User-Agent header's value to prevent DOS" scenario is unlikely? :-) (More on the alternate scenario later).

>>> Now, read back your example but suppose the attack is to be pulled
>>> against cnn.com. At a given time (say cnn.com's peak usage time), the
>>> script issues a gazillion requests. Bye-bye server.

This is already possible, since XMLHttpRequest (and even script-generated IMG or script-based form submits) exist. The question we're trying to figure out is whether being able to change User-Agent causes *greater* risk. I hope we can agree that the server's measures against DOS are unlikely to be based on User-Agent, and that if they were, a script changing User-Agent would make those requests *simpler* to reject, not harder (i.e. through a "User-Agent of incoming request does not match User-Agent associated with cookie session - reject!" logic).
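To make that last point slightly more concrete, here is a minimal sketch of what such a check could look like on the server. This is Node.js with an entirely hypothetical in-memory session store and "sid" cookie name - not taken from any real backend:

  var http = require("http");
  var sessions = {}; // hypothetical store: session id -> { userAgent: "..." }

  http.createServer(function (req, res) {
    var cookie = req.headers["cookie"] || "";
    var match = /(?:^|;\s*)sid=([^;]+)/.exec(cookie);
    var session = match && sessions[match[1]];
    // Reject requests whose User-Agent no longer matches the one recorded
    // when the session was created:
    if (session && session.userAgent !== req.headers["user-agent"]) {
      res.writeHead(403);
      res.end("User-Agent does not match this session");
      return;
    }
    // ...normal request handling...
    res.writeHead(200);
    res.end("OK");
  }).listen(8080);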
>> I'm confused. What does this have to do with unblacklisting the
>> User-Agent header?
>>
>>> That's why I took the ad example. Hack a single point of failure (the ad
>>> server, a CDN) and you can DOS a site using the resource from network
>>> points all over the net. While the frontend dev is free to use scripts
>>> hosted on third-parties, the backend dev is free to add a (silly but
>>> effective) means to limit the number of requests accepted from a browser.
>>> Simple problem, simple solution and the spec makes it possible.

Only if you assume that a web site will say "We're being DOS-attacked - quick, stop accepting requests from MSIE!". This would certainly be even more of a nuisance to their visitors than the attack itself, so as a strategy against DOS it would make little sense.

> A more likely scenario is a URL that only accepts a specific user agent
> that is not a browser (backend). If user script can change the UserAgent,
> it can request this URL repeatedly. Given it's in the browser, a shared
> resource (like an ad provider or a CDN) becomes a very tempting point of
> failure.
>
> AFAIK, you don't have the same problem with PHP libs for instance (you
> don't request same from a third-party server, making it a potential vector
> of attack).

It seems we're to some extent mixing two threat scenarios here: the "DOS" and the "unauthorized access" ones. Repeated requests are more of a DOS-type problem; secret backend URLs with User-Agent filtering are an unauthorized access problem. Let's discuss them separately.

If the threat is DOS-type attacks, using a secret URL (thereby giving away your knowledge of that secret URL and the token to access it to the technical staff analysing and stopping the attack) would only make sense, compared to accessing a public one, if the secret URL did a lot more heavy lifting, so that the site could be taken down with fewer requests. If I were a hacker, I would rather use a botnet or something similar for this purpose, because it would make it harder to detect that I knew the secret URL and its token.

Unauthorized access might be worse from a browser than from a shell script if the JS could make use of information in the browser (e.g. session cookies) to run an effective attack against the secret URL. I would however assume that a part of the backend dealing with session cookies would not generally be limiting itself to requests from other *servers* - server-to-server requests would presumably not have sessions created. At this point we're making a lot of hypothetical assumptions, though.

To make things a bit more real I've tried to find examples of real user-agent detection and filtering in backend scripts. I've checked PayPal's Instant Payment Notification backend (a PHP script PayPal provides that will live in a secret location on your site and receive a POST from PayPal with information when a payment is successfully made). It is an example of a secret backend script a hacker would have financial/practical motivations for attacking. It does not make any attempt at checking the User-Agent string to see if it is indeed being contacted by PayPal's server.

There are several examples of filtering scripts for blacklisting User-Agents, for example this Bad Behavior WordPress plugin:

http://code.ohloh.net/project?pid=f1ZpDuUCZw8&browser=Default&did=bad_behavior%2Fpublic_html&cid=kcopgYkVDm4

This does not, however, have any implications for our reasoning or decision: it would make no sense for a script to set User-Agent in order to opt into being blacklisted.

Does anyone on public-webapps know of an example of a backend that uses white-listing of UAs as a security measure? If anyone has actually seen or implemented this I'd love to hear about it... provided anyone reads this far. I didn't actually expect to generate so much discussion here ;-)

-Hallvord
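P.S. To be explicit about the sort of white-listing I'm asking about, here is a purely hypothetical sketch (again Node.js; the "trusted" User-Agent value is made up, and nothing here is taken from PayPal's IPN code or any other real backend):

  var http = require("http");
  // Made-up "trusted" client - the kind of check I'm asking whether anyone
  // actually relies on:
  var ALLOWED_UA = /^ExamplePaymentGateway\//;

  http.createServer(function (req, res) {
    if (!ALLOWED_UA.test(req.headers["user-agent"] || "")) {
      res.writeHead(403);
      res.end("Unexpected client");
      return;
    }
    // ...process the notification...
    res.writeHead(200);
    res.end("OK");
  }).listen(8080);

Note that a plain curl -H "User-Agent: ExamplePaymentGateway/1.0" defeats this already, without any browser involved - which is part of why I doubt it is relied on as a real security measure.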
Received on Friday, 12 October 2012 11:12:39 UTC