Fwd: [XHR] Open issue: allow setting User-Agent? from Julian Aubourg on 2012-10-11 (public-webapps@w3.org from October to December 2012)

From: Julian Aubourg <j@ubourg.net>
Date: Thu, 11 Oct 2012 15:09:07 +0200
To: public-webapps@w3.org
Message-ID: <CANUEoeuVFTs1Gg0Dx2u2miRPryFdGspxCfqeVG++vp5uKtfu9A@mail.gmail.com>
Sorry, I got cut off by keyboard shortcuts :P

... so the burden of proof is on *you*. *You* have to establish the
consequences of making a backward incompatible change, not brush away
arguments, pro or con, to advance your agenda. Did you ask backend devs
why they white-listed browsers? Did you try and educate them? Did you ever
encounter any sensible use-case for this? Do you really want to break a lot
of backends' expectations because you "don't see the reason"?

You have to be very careful with breaking backward compatibility. Just look
at jQuery's bug tracker for a prime example of what happens when you do.

We don't have to prove it is useful. We just have to prove it is used and
*you* brought this up yourself. Now you want to bypass this by pretty much
hacking client-side. Please make a compelling case for it.

> I still don't fully understand the scenario(s) you have in mind.

You're confusing the script's origin with the site's origin. XHR requests
from within a script are issued with the origin of the page that the script
is included into.

Now, read back your example but suppose the attack is to be pulled against
cnn.com. At a given time (say cnn.com's peak usage time), the script issues
a gazillion requests. Bye-bye server.

That's why I took the ad example. Hack a single point of failure (the ad
server, a CDN) and you can DOS a site using the resource from network
points all over the net. While the frontend dev is free to use scripts
hosted on third-parties, the backend dev is free to add a (silly but
effective) means to limit the number of requests accepted from a browser.
Simple problem, simple solution and the spec makes it possible.

Note that this use-case has nothing to do with filtering out a specific
browser btw. Yet you would break this with the change you propose.
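
A minimal sketch of the kind of server-side check described above, assuming
the backend only accepts requests whose User-Agent exactly matches a
whitelisted token. The token, the function name and the sample header values
are all hypothetical, chosen for illustration only:

```python
# Hypothetical whitelist; the token below is illustrative, not from the thread.
ALLOWED_USER_AGENTS = frozenset({"internal-backend/1.0"})


def is_request_allowed(headers, allowed=ALLOWED_USER_AGENTS):
    """Return True only when the User-Agent exactly matches a whitelisted token."""
    # HTTP header names are case-insensitive, so normalise before the lookup.
    normalised = {name.lower(): value for name, value in headers.items()}
    return normalised.get("user-agent", "") in allowed


# A browser that cannot override User-Agent is turned away...
browser_request = {"User-Agent": "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0"}
# ...but a script allowed to set the header could send the whitelisted token.
spoofed_request = {"User-Agent": "internal-backend/1.0"}

print(is_request_allowed(browser_request))
print(is_request_allowed(spoofed_request))
```

The check only holds as long as scripts running in the browser cannot choose
the header value themselves, which is the point under discussion.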

Maybe it's not the best of examples. But I came up with this in something
like 5 minutes. I can't imagine there are no other ways to abuse this.

> This is a way more interesting (ab)use case. You're presuming that there
> are web-exposed backend services that are configured to only talk to other
> backend servers, and use a particular magic token in User-Agent as
> authentication? If such services exist, does being able to send a
> "server-like" UA from a web browser make them significantly more vulnerable
> than being able to send the same string from a shell script?

Same as above: single point of failure. You hack into a server delivering a
shared resource and you have as many unwilling "agents" participating in
your attack.

So far I see that only Jaredd seems to like the idea (in this thread
anyway):

> I agree with Hallvord, I cannot think of any additional *real* security
> risk involved with setting the User-Agent header.  Particularly in a CORS
> situation, the server-side will (should) already be authenticating the
> origin and request headers accordingly.  If there truly is a compelling
> case for a server to only serve to Browser XYZ that is within scope of the
> open web platform, I'd really like to hear tha

By that line of reasoning, I don't see why we need preflight in CORS and
specific authorisation from the server-side for content to be delivered
cross-domain. It is not *open*. After all since any backend could request
the resource without problem, why should browsers be limited?

But then again, the problem has nothing to do with CORS but with
third-party scripts that effectively steal the origin of the page that
includes them and the single point of failure problem that arises. That's
why JavaScript is as sandboxed as it is.

In all honesty, I'd love to be convinced that the change is without
consequences, but the more I think about it, the less likely it seems.

---------- Forwarded message ----------
From: Julian Aubourg <j@ubourg.net>
Date: 11 October 2012 14:47
Subject: Re: [XHR] Open issue: allow setting User-Agent?
To: "Hallvord R. M. Steen" <hallvord@opera.com>



> We end up in a philosophical disagreement here :-) I'd say that whatever
> browser the user decides to use is the user's choice and the server should
> respect that.


I'm sorry but that's complete nonsense. The backend is the provider of the
data and has every right when it comes to its distribution. If it's a
mistake on the backend's side (they filter out while they didn't intend to)
just contact the backend's maintainer and have them fix this server-side
problem... well... server-side.

You're trying to circumvent a faulty server-side implementation by breaking
the backward compatibility of a client-side spec. If you can't see how
wrong the whole idea is, I'm afraid you didn't have to suffer the
consequences of such drastic changes in the past (I had to with script tag
injection and it was just a purely client-side issue, nothing close to what
you're suggesting in terms of repercussions).


>
> One word: legacy. For example Amazon.com might want to enable CORS for
> some of its content. The team that will do that won't necessarily have any
> intention of blocking browsers, but will very likely be unaware of the
> widespread browser sniffing in other parts of the Amazon backend. (With
> sites of Amazon's or eBay's scale, there is in my experience simply no
> single person who is aware of all browser detection and policies). Hence,
> there is IMO non-negligible risk that a large web service will be
> "cooperative" on CORS but still shoot itself in the foot with browser
> sniffing.
>
> If I write, say, a CORS content aggregator, I would want it to run in all
> browsers, not only those allowed by the content providers. And I'd want to
> be in control of that. Hence, in my view this issue is mostly a trade-off
> between something script authors may need and more theoretical purity
> concerns.
>
>
First of all, my point was that the backend should be in control of its
content distribution, not some client-side javascript, and that it was
exactly in that spirit that the CORS spec had been written.

That being said...

You're shooting yourself in the foot here. It's because of legacy, hardly
(if ever) maintained backends that we shouldn't change this point of the
spec. What you're saying here is that people on the backend will adapt a
part of their site for CORS, realise there is filtering some place else yet
not try and fix the issue? And a client-side spec should be changed as a
consequence of this unbelievably lazy behaviour because...? They have to
make changes because of CORS anyway, let them go the extra mile.

Please leave unmaintained backends alone. You can expect your server's
behaviour to change if you change, say, the version of PHP you're using.
But to see it change because some client-side spec broke backward
compatibility? Did you write server-side code in the past?

I don't like browser sniffing any more than you do but I acknowledge that
it exists and is supported by the spec as a viable means to filter in/out
some or all browsers. From what I can tell from your use cases, you're
confronted with white listing, which means the server intentionally guards
against unknown sources (stupidly but that's hardly the point).

As long as the spec doesn't change (and why should it?), the backend cannot
be tricked into delivering content by some javascript code in the browser
and I'm quite fine with it.
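
The guarantee relied on here can be modelled roughly as follows. This is a
simplified sketch, not browser code: the function name is hypothetical, and
the set lists only a few of the header names that scripts could not set
through XHR at the time of this thread (the list in the spec is longer).

```python
# Subset of the header names scripts could not set through XHR at the time;
# the full list in the specification is longer.
SCRIPT_FORBIDDEN_HEADERS = frozenset({"user-agent", "referer", "host", "origin"})


def build_request_headers(browser_headers, script_headers):
    """Merge script-supplied headers into the browser's, ignoring forbidden ones."""
    merged = dict(browser_headers)
    for name, value in script_headers.items():
        if name.lower() in SCRIPT_FORBIDDEN_HEADERS:
            continue  # dropped, so the browser's own value is kept
        merged[name] = value
    return merged


headers = build_request_headers(
    {"User-Agent": "ExampleBrowser/1.0"},
    {"User-Agent": "internal-backend/1.0", "X-Requested-With": "XMLHttpRequest"},
)
print(headers)
```

Under this model the backend always sees the browser's own User-Agent, no
matter what a script asks for. Allowing scripts to set the header amounts to
removing "user-agent" from the set.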


> Yes, this could give a generic library like jQuery less control of the
> contents of *its* request. However, there will still be plenty of requests
> not sent through XHR - the browser's main GET or POST for the actual page
> contents, all external files loaded with SCRIPT, LINK, IMG, IFRAME, EMBED
> or OBJECT, all images from CSS styling etc. Hence I still believe the
> information loss and effect on stats will be minimal.
>
> Also, the above could be a feature if I'm working on extending a site
> where I don't actually fully control the backend - think a CMS I'm forced
> to use and have to work around bugs in even if that means messing with how
> jQuery sends its requests ;-).
>
>
Point is you make it possible to rig the game pretty badly. It's not about
jQuery losing control (in fact the code snippet will set the user agent
string for all script-initiated requests, no matter the lib, even in pure
javascript): good luck with your usage analytics on single page apps,


>
>> Oh, I agree entirely. Except checking User-Agent is a quick and painless
>> means to protect against malicious JavaScript scripts. I don't like the
>> approach more than you do, but we both know it's used in the wild.
>>
>
> I'm afraid I don't know how this is used in the wild and don't fully
> understand your concerns. Unless you mean we should protect dodgy SEO
> tactics sending full site contents to Google bot UAs but a paywall block to
> anyone else from user-applied scripts trying to work around that?


The burden of proof is on you. *You* ha


>
>
>  A malicious ad script would presumably currently have the user's web
>>> browser's User-Agent sent with any requests it would make
>>>
>>
>  The malicious script can trick the server into accepting a request the
>> backend expects to be able to filter out by checking a header which the
>> standard says is set by the browser and cannot be changed by user scripts.
>> Think painless DOS with a simple piece of javascript.
>>
>
> I still don't fully understand the scenario(s) you have in mind.
>
> For a DOS attack you'd be launching it against some third-party site (it
> doesn't make sense for a site to DOS itself, right?). Trying to understand
> this, here are my assumptions:
>
> * The threat scenario is trying to DOS victim.example.com by getting a
> malicious javascript targeting this site to run on cnn.com or some
> similar high-volume site. (The attacker presumably needs to run the script
> on a high-volume site to be able to generate enough bogus requests for a
> successful DOS attack). This can be achieved for example by hacking into
> some service that delivers ads to cnn.com or in-transit modification of
> scripts requested by cnn.com (only for end users downstream of your
> network location).
>
> * The malicious script will be using XHR in an attempt to DOS
> victim.example.com (if it uses other ways to do it, it's outside the
> scope of what we're trying to decide)
>
> * The concern is whether allowing a custom User-Agent for XHR requests
> makes this scenario harder to defend against.
>
> * You're saying that victim.example.com may have a white-list of
> User-Agent strings as a security measure to avoid serving content in
> response to requests presumed to be malicious, and that this helps them
> avoid XHR-based DOS attempts.
>
> First observation is that victim.example.com needs to enable CORS for
> this attack venue to be possible in the first place. This to some extent
> limits the feasibility of the whole exercise (sites that are commonly
> targeted for DOS attacks are perhaps not likely to enable CORS - partly
> because it may make them more vulnerable to malice).
>
> Secondly, this attempted DOS attack uses in-browser JavaScript (again, if
> it uses any other method it's outside of our scope). Out of the box, all
> the requests will be sent with the browser's original User-Agent string. As
> we're launching our attack from end users' regular web browsers, there is a
> very high chance that the User-Agent string is already on
> victim.example.com's whitelist. Hence, the DOS script will probably be
> more successful if it does *not* set User-Agent.
>
> Why would setting User-Agent make the malicious script more effective at
> DOSing victim.example.com?
>
>
>  but the use-case is really to prevent browsers from masquerading as
>> servers).
>>
>
> This is a way more interesting (ab)use case. You're presuming that there
> are web-exposed backend services that are configured to only talk to other
> backend servers, and use a particular magic token in User-Agent as
> authentication? If such services exist, does being able to send a
> "server-like" UA from a web browser make them significantly more vulnerable
> than being able to send the same string from a shell script?
>
>
> --
> Hallvord R. M. Steen
> Core tester, Opera Software
>
Received on Thursday, 11 October 2012 13:09:36 UTC