Re: Javascript Security for Dummies from Mark Birbeck on 2010-03-04 (public-rdfa-wg@w3.org from March 2010)

From: Mark Birbeck <mark.birbeck@webbackplane.com>
Date: Thu, 4 Mar 2010 20:55:25 +0000
To: Toby Inkster <tai@g5n.co.uk>
Cc: RDFa WG <public-rdfa-wg@w3.org>
Message-ID: <640dd5061003041255n5b4c8141gb05708259768c780@mail.gmail.com>

Hi Toby,

Whilst nothing you say is untrue, it would be remiss of us not to
balance your points.

First, we should point out that many services that document themselves
as returning JSON to scripts on other sites are actually responding to
JSONP requests (for the reasons you describe). So the risks that you're
presenting also apply to the use of anything from the Flickr API through
to the Twitter API, as well as the myriad SPARQL end-points that return
'JSON' (i.e., JSONP).

Second, we should also explain that the risk you outline applies
equally to the use of JS libraries such as Yahoo!'s YUI, which is
hosted on their servers. Likewise, anyone using the Google Maps API in
their blog posts runs exactly the same risk that you refer to.

None of which is to say that there is no risk -- it's simply that in
the interests of balance I'm pointing out that the risk people run if
we implement profiles and/or vocabularies using JSONP is as high or
low as the risk people run from using a ton of other services and
techniques on the modern internet.

Regards,

Mark

On Thu, Mar 4, 2010 at 7:37 PM, Toby Inkster <tai@g5n.co.uk> wrote:
> Apologies for cutting out of the telecon a few minutes early. As I left
> I volunteered to write a quick summary of how Javascript/JSON/JSONP/CORS
> relate to each other and the various security issues involved.
>
> Let's start with the basics: Javascript (more properly known as
> ECMAScript these days) is a scripting language with various
> implementations, the best known being the ones that are embedded in most
> modern, graphical browsers. Javascript, as implemented in browsers, runs
> in a sandbox to prevent maliciously crafted web pages from doing any
> damage to the visitor's machine.
>
> A good number of years ago, Microsoft implemented a proprietary
> extension to Javascript which allowed scripts to perform HTTP requests
> and make use of the responses. Mozilla implemented some fairly similar
> functionality, and eventually other browsers followed (using the Mozilla
> syntax). This feature is called XMLHttpRequest or XHR - it's a bit of a
> misnomer as it's not restricted to retrieving XML.
>
> For security reasons, XHR requests are only allowed to be performed to
> URLs on the same domain as the page running the script. This is a
> good thing because you don't want http://evil.example/badpage.html to be
> able to perform an XHR to <http://bank.example/account-statement.html>
> especially given that the XHR would be sent with all the applicable
> cookies in the browser's cookie jar.
>
> (Aside: technically it's not cross-domain requests that are disallowed,
> but cross-origin requests. An origin is the combination of scheme, host
> and port, so foo.example.com and bar.example.com are different origins
> as far as XHR is concerned, even though cookies can be shared between
> them. That sharing follows a woollier rule: foo.co.uk and bar.co.uk
> cannot share cookies, despite the fact that from a DNS viewpoint they
> too are third-level domain names. This is all being standardised
> currently.)
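
To make the origin check concrete, here is a minimal sketch. It assumes
the scheme/host/port comparison that browsers apply and uses the
standard URL class; the helper name sameOrigin is hypothetical, not
something from Toby's message.

```javascript
// Hypothetical helper: two URLs share an origin only when scheme, host
// and port all match. The standard URL class exposes that triple as the
// `origin` property.
function sameOrigin(a, b) {
  return new URL(a).origin === new URL(b).origin;
}

// Same scheme, host and (default) port: only the path differs.
console.log(sameOrigin("http://bank.example/account-statement.html",
                       "http://bank.example/homepage.html")); // true

// Different hosts.
console.log(sameOrigin("http://evil.example/badpage.html",
                       "http://bank.example/account-statement.html")); // false

// Same host, different scheme.
console.log(sameOrigin("http://bank.example/", "https://bank.example/")); // false
```
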
>
> Douglas Crockford "discovered" JSON. He maintains that he didn't invent
> it, just realised that Javascript contained a useful bit of syntax that
> could be standardised on. JSON is a restricted subset of Javascript's
> notation for objects and arrays. (JSON = Javascript Object Notation.)
> JSON is a data format that allows strings, numbers, booleans, arrays and
> associative arrays to be represented. In many ways it can be considered
> a competitor to XML. It's also pretty similar to YAML (though the
> oft-quoted statement that it's a subset of YAML is an urban myth).
> Here's how a person might be represented in JSON:
>
>        {
>                "name"     : "Toby Inkster" ,
>                "homepage" : "http://tobyinkster.co.uk/" ,
>                "mbox"     : "mailto:mail@tobyinkster.co.uk"
>        }
>
> Getting back to XHR, often people want to be able to request data from
> other origins, circumventing the same-origin policy enforced by
> browsers. With a little bit of extra syntax, JSON can be useful for
> this. This extra syntax is called JSONP. (JSONP = JSON with Padding.)
>
> The way that JSONP works is that instead of supplying a JSON response,
> the server responding with the data sends a Javascript response, like
> this (usually the name of the callback function is configurable via a
> query string parameter):
>
>        callback_function({
>                "name"     : "Toby Inkster" ,
>                "homepage" : "http://tobyinkster.co.uk/" ,
>                "mbox"     : "mailto:mail@tobyinkster.co.uk"
>        });
>
> How does this circumvent XHR's same-origin policy? Answer: it doesn't.
> But it eliminates the need to use XHR at all. The page requesting the
> data doesn't need to perform an XHR request; it just defines a function
> called callback_function to deal with the data, then it loads the JSONP
> file using a standard HTML <script src> element. The browser downloads
> and executes the script, and calls the function with the data as a
> parameter.
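
The round trip described above can be sketched roughly as follows. This
is an illustration under assumptions, not any particular service's
implementation: toJsonp and handlePerson are hypothetical names, the
callback-name check is an added precaution, and the Function
constructor stands in for the browser executing a <script src> response.

```javascript
// Server side (hypothetical helper): wrap a JSON document in a call to
// the callback named by the client, e.g. via ?callback=handlePerson.
function toJsonp(callbackName, data) {
  // Added precaution: accept only a plain identifier, so the query
  // string cannot smuggle extra script into the response.
  if (!/^[A-Za-z_$][A-Za-z0-9_$]*$/.test(callbackName)) {
    throw new Error("invalid callback name");
  }
  return callbackName + "(" + JSON.stringify(data) + ");";
}

// Client side: the page defines the callback before loading the script.
let received = null;
function handlePerson(person) {
  received = person;
}

const body = toJsonp("handlePerson", {
  name: "Toby Inkster",
  homepage: "http://tobyinkster.co.uk/"
});

// Stand-in for <script src="...?callback=handlePerson">: a browser would
// fetch the response and run it as ordinary script. Here the same text
// is run with the callback passed in as a parameter.
new Function("handlePerson", body)(handlePerson);

console.log(received.name); // Toby Inkster
```
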
>
> However, this opens up a big security hole. Suppose that the server
> supplying a JSONP response is compromised, or its owner just decides to
> turn to the dark side. The server can send arbitrary Javascript (i.e.
> not JSONP) and the browser will execute it unquestioningly. This
> Javascript could be used to steal cookies, passwords and other
> privileged information from the page it was included in. Not nice.
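
A small sketch of why this matters, with everything in it hypothetical:
the response text, the secret and the stolen array are stand-ins, and
the Function constructor again plays the part of the browser running a
<script src> response. The point it illustrates is that script loading
executes whatever arrives, whereas a JSON parser would reject it.

```javascript
// What a compromised server might send in place of the expected
// callback_function({...}) response.
const maliciousResponse =
  'stolen.push(secret); callback_function({"name":"ok"});';

// Stand-ins for data the page can reach (cookies, form fields) and for
// wherever an attacker would send it.
const stolen = [];
const secret = "session=abc123";

let delivered = null;
function callback_function(data) {
  delivered = data;
}

// Loading via <script src> executes the whole response, not just the
// callback call.
new Function("stolen", "secret", "callback_function", maliciousResponse)(
  stolen, secret, callback_function
);
console.log(stolen);    // the secret has been copied out
console.log(delivered); // and the page still receives plausible data

// By contrast, a JSON parser treats the same text purely as data and
// refuses it.
let rejected = false;
try {
  JSON.parse(maliciousResponse);
} catch (e) {
  rejected = e instanceof SyntaxError;
}
console.log(rejected); // true
```
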
>
> CORS is another way around the same-origin policy, but this time it's
> not a hack. It's a set of HTTP headers that a URL can respond with to
> indicate that it's safe to be retrieved in cross-origin requests. So if,
> say, http://bank.example/homepage.html contains no private data and is
> perfectly safe for other sites to have access to, then it could set a
> CORS header to allow http://evil.example/badpage.html to try its worst.
> http://bank.example/account-statement.html wouldn't send the CORS header
> so would be protected by the default same-origin policy.
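
As a rough sketch of the server's side of that decision: the function
and the path list below are hypothetical, and only the
Access-Control-Allow-Origin header name comes from CORS itself. A real
server would attach the returned headers to its HTTP response.

```javascript
// Hypothetical policy: only paths listed as public get a CORS header;
// everything else stays behind the default same-origin policy.
const PUBLIC_PATHS = new Set(["/homepage.html"]);

function corsHeadersFor(path) {
  if (PUBLIC_PATHS.has(path)) {
    // "*" tells the browser that any origin may read this response.
    return { "Access-Control-Allow-Origin": "*" };
  }
  // No header: cross-origin scripts cannot read the response.
  return {};
}

console.log(corsHeadersFor("/homepage.html"));
console.log(corsHeadersFor("/account-statement.html"));
```
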
>
> So how does this apply to RDFa vocabularies/profiles?
>
> If vocabularies are hosted on a separate server to the pages making use
> of them, then Javascript implementations of RDFa would need to make a
> cross-origin request to read them. (Actually they could make a
> same-origin request to a proxying script, but that's not an especially
> elegant solution.)
>
> If we want such Javascript implementations of RDFa to be possible, that
> leaves two solutions:
>
>        1. Serve up the vocabulary document as JSONP; or
>        2. Serve it up as something else plus CORS headers.
>
> #1 is problematic because as I said, JSONP is not nice, safe JSON,
> despite the similar names. JSONP is Javascript.
>
> #2 is problematic because CORS is a very new feature. Many of the newest
> browsers support it (including IE8), but if you want your script to work
> in downlevel browsers, this is not your solution.
>
> In my next message I'll outline how my RDFa vocab proposal (which is
> slightly different to Manu's) makes this a moot point by saying that
> retrieval of the vocab document is optional - a SHOULD requirement
> rather than a MUST - and provides a fallback for cases such as
> browsers which don't implement CORS.
>
> --
> Toby A Inkster
> <mailto:mail@tobyinkster.co.uk>
> <http://tobyinkster.co.uk>
>
Received on Thursday, 4 March 2010 20:56:04 UTC