- From: Hallvord R. M. Steen <hallvord@opera.com>
- Date: Fri, 12 Oct 2012 11:12:07 +0000
- To: Julian Aubourg <j@ubourg.net>
- Cc: public-webapps@w3.org
[Editorial note: I respond to two E-mails by Julian at once, and I
have done some re-ordering/interleaving to keep discussion of related
points together. This editing was in no way meant to reduce the
coherence of Julian's arguments, and I hope it doesn't. The discussion
of rationale is "hoisted" to the start of this E-mail, with the threat
and abuse risks discussed later.]
>>> If it's a mistake on the backend's side (they filter out while
>>> they didn't intend to)
>>> just contact the backend's maintainer and have them fix this server-side
>>> problem... well... server-side.
To add to what Mike said - I've been doing what you propose as part of
my job for about ten years now. You might think that sniffing is done
on purpose to serve different browsers suitable content, and you might
think broken sniffing on the backend is mainly a problem with
unmaintained legacy sites. In my experience, both assumptions are
mistaken.
Consider a household name that is one of the web's major E-commerce
destinations. I've seen it remove random chunks of required markup
because of obscure backend browser sniffing. I've seen it remove *the
contents of JavaScript variables* inside inline SCRIPT tags - instead
of <script>var obj={ /*several lines of methods and
properties*/};</script> it sent <script>var obj={};</script>! I've
seen it drop random stylesheets and jumble the layout.
The best part? For nearly 10 years, said site has been promising to
fix the problems! They always plan to get it done in the next quarter
or something. And this is clearly *not* an unmaintained backend - it's
a site that must be spending hundreds of thousands of dollars on
backend maintenance and development per year. It seems there just
isn't any single developer with sufficient overview of, and
responsibility for, their browser detection policies to actually do
something about it.
Given such experiences, I don't consider it unlikely that a site would
intend to enable CORS, make the required backend changes to share some
of its content, but fail entirely to "go that extra mile" and fix the
sniffing. In fact, I think it is *more* likely that CORS gets bolted
on without any browser detection-related changes or testing.
And do you expect that when you contact a site saying "Hello, I made a
neat CORS news aggregator, it's used by 10 000 people, but the 250
Opera users among them have trouble because your site detects 'Opera'
and gzips content twice, making my app show them binary garbage on
screen" - they would care? You really think so? Great, we're hiring
browser evangelists (and I'm pretty sure Mozilla is too), so we need
crazy optimists to apply :-)
> The problem is that the same reasoning can be made regarding CORS.
This is not in fact a problem ;-). Both the general CORS security
policy and this issue are judgement calls. It's not a problem if we
resolve them differently - we just need to weigh the pros and cons to
see which side of the argument is stronger.
> If we had a mechanism to do the same thing for the fact of modifying the
> UserAgent header, I wouldn't even discuss the issue.
For CORS usage we have that already: the Access-Control-Allow-Headers
response header. So, as Anne pointed out, this is already opt-in for
CORS. Per your comment above, we should hence focus our discussion on
potential threat scenarios for same-domain usage (or agree to allow
setting User-Agent for CORS but not for local requests?).
Moving on to discussion of proposed threat scenarios, first DOS:
Glenn Maynard <glenn@zewt.org> wrote:
>> Are you really saying that backend developers want to use User-Agent to
>> limit the number of requests accepted from Firefox?
>> (Not one user's Firefox, but all Firefox
>> users, at least of a particular version, combined.)
To which Julian responded:
> A more likely scenario
So we are in agreement that the "use User-Agent header's value to
prevent DOS" scenario is unlikely? :-) (More on the alternate scenario
later).
>>> Now, read back your example but suppose the attack is to be pulled
>>> against cnn.com. At a given time (say cnn.com's peek usage time), the
>>> script issues a gazillions requests. Bye-bye server.
This is already possible, since XMLHttpRequest (and even
script-generated IMG requests or script-based form submits) exist. The
question we're trying to figure out is whether being able to change
User-Agent causes *greater* risk. I hope we can agree that a server's
measures against DOS are unlikely to be based on User-Agent and that,
if they were, a script changing User-Agent would make those requests
*simpler* to reject, not harder (i.e. through "User-Agent of the
incoming request does not match the User-Agent associated with the
cookie session - reject!" logic).
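For illustration, that logic could be as small as the following sketch
(plain Node.js with a made-up in-memory session store - a hypothetical
example, not taken from any real backend):

    // Hypothetical sketch of "reject if User-Agent does not match the
    // User-Agent recorded for the session". Session store is made up.
    var http = require('http');

    var sessions = {};   // session id -> { userAgent: "..." }, filled at login

    http.createServer(function (req, res) {
      var match = /(?:^|;\s*)sid=([^;]+)/.exec(req.headers.cookie || '');
      var session = match && sessions[match[1]];
      if (session && session.userAgent !== req.headers['user-agent']) {
        // A script that changed User-Agent mid-session trips this check.
        res.writeHead(403);
        return res.end('User-Agent does not match session');
      }
      // ... normal request handling would go here ...
      res.end('ok');
    }).listen(8080);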
>> I'm confused. What does this have to do with unblacklisting the
>> User-Agent header?
>>
>>> That's why I took the ad example. Hack a single point of failure (the ad
>>> server, a CDN) and you can DOS a site using the resource from network
>>> points all over the net. While the frontend dev is free to use scripts
>>> hosted on third-parties, the backend dev is free to add a (silly but
>>> effective) means to limit the number of requests accepted from a browser.
>>> Simple problem, simple solution and the spec makes it possible.
Only if you assume that a web site will say "We're being DOS-attacked
- quick, stop accepting requests from MSIE!". This would certainly be
even more of a nuisance to their visitors than the attack itself, so
as a strategy against DOS it would make little sense.
> A more likely scenario is a URL that only accepts a specific user agent
> that is not a browser (backend). If user script can change the UserAgent,
> it can request this URL repeatedly. Given it's in the browser, a shared
> resource (like an ad provider or a CDN) becomes a very tempting point of
> failure.
>
> AFAIK, you don't have the same problem with PHP libs for instance (you
> don't request same from a third-party server, making it a potential vector
> of attack).
It seems we're to some extent mixing two threat scenarios here: the
"DOS" one and the "unauthorized access" one. Repeated requests are
more of a DOS-type problem; secret backend URLs with User-Agent
filtering are an unauthorized-access problem. Let's discuss them
separately.
If the threat is DOS-type attacks, using a secret URL (and thereby
giving away your knowledge of that secret URL and the token to access
it to the technical staff analysing and stopping the attack) would
only make sense, compared to accessing a public one, if the secret URL
did a lot more heavy lifting so that the site could be taken down with
fewer requests. If I were a hacker, I would rather use a botnet or
something similar for this purpose, because it would make it harder to
detect that I knew the secret URL and its token.
Unauthorized access might be worse from a browser than from a shell
script if the JS could make use of information in the browser (e.g.
session cookies) to run an effective attack against the secret URL. I
would, however, assume that a part of the backend dealing with session
cookies would not generally limit itself to requests from other
*servers* - those servers would presumably not have sessions created
for them in the first place.
At this point we're making a lot of hypothetical assumptions, though.
To make things a bit more real, I've tried to find examples of real
user-agent detection and filtering in backend scripts. I've checked
PayPal's Instant Payment Notification backend (a PHP script PayPal
provides that lives at a secret location on your site and receives a
POST from PayPal with information when a payment is successfully
made). It is an example of a secret backend script a hacker would have
financial/practical motivations for attacking. It does not make any
attempt at checking the User-Agent string to see whether it is indeed
being contacted by PayPal's server.
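Just to be concrete, such a check would only amount to a few lines -
something like this entirely hypothetical sketch (plain Node.js, with
a made-up agent string; PayPal's real script contains nothing of the
sort):

    // Entirely hypothetical white-listing check - the agent string is
    // invented for illustration only.
    var ALLOWED_AGENTS = ['PartnerBackend/1.0'];   // made-up whitelist entry

    function isAllowedAgent(req) {
      // req is a Node.js http.IncomingMessage; header names are lower-cased
      return ALLOWED_AGENTS.indexOf(req.headers['user-agent']) !== -1;
    }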
There are several examples of filtering scripts that blacklist
User-Agents, for example this Bad Behaviour WordPress plugin:
http://code.ohloh.net/project?pid=f1ZpDuUCZw8&browser=Default&did=bad_behavior%2Fpublic_html&cid=kcopgYkVDm4
This does not, however, have any implications for our reasoning or
decision: it would make no sense for a script to set User-Agent in
order to opt into being blacklisted.
Does anyone on public-webapps know of an example of a backend that
uses UA white-listing as a security measure? If anyone has actually
seen or implemented this, I'd love to hear about it... provided anyone
reads this far. I didn't actually expect to generate so much
discussion here ;-)
-Hallvord
Received on Friday, 12 October 2012 11:12:39 UTC