Re: Javascript Security for Dummies

From: Ivan Herman <ivan@w3.org>
Date: Fri, 05 Mar 2010 09:20:03 +0100
Message-ID: <4B90BEB3.3060808@w3.org>
To: Toby Inkster <tai@g5n.co.uk>
CC: RDFa WG <public-rdfa-wg@w3.org>

Toby,

first of all, thanks.

I do have a technical question, though, for further clarification. You
say that if data is served as JSONP, the HTML file that starts the
whole process needs an additional <script...> element to fetch it. How
would that apply to an RDFa file? After all, what we are discussing here
is a Javascript implementation that interprets RDFa content; would that
mean that the RDFa _author_ is supposed to add a <script...> element to
his/her file instead of using @profile (or @vocab)? And what happens
then with XML formats that do not have such a <script> element in the
first place?

Ivan

On 2010-3-4 20:37 , Toby Inkster wrote:
> Apologies for cutting out of the telecon a few minutes early. As I left
> I volunteered to write a quick summary of how Javascript/JSON/JSONP/CORS
> relate to each other and the various security issues involved.
> 
> Let's start with the basics: Javascript (more properly known as
> ECMAScript these days) is a scripting language with various
> implementations, the best known being the ones that are embedded in most
> modern, graphical browsers. Javascript, as implemented in browsers, runs
> in a sandbox to prevent maliciously crafted web pages from doing any
> damage to the visitor's machine.
> 
> A good number of years ago, Microsoft implemented a proprietary
> extension to Javascript which allowed scripts to perform HTTP requests
> and make use of the responses. Mozilla implemented some fairly similar
> functionality, and eventually other browsers followed (using the Mozilla
> syntax). This feature is called XMLHttpRequest or XHR - it's a bit of a
> misnomer as it's not restricted to retrieving XML.
> 
> For security reasons, XHR requests are only allowed to be performed to
> URLs on the same domain as the script itself was loaded from. This is a
> good thing because you don't want http://evil.example/badpage.html to be
> able to perform an XHR to <http://bank.example/account-statement.html>
> especially given that the XHR would be sent with all the applicable
> cookies in the browser's cookie jar.
> 
> (Aside: technically it's not cross-domain requests that are disallowed,
> but cross-origin requests. An origin is a slightly woolly concept, but
> roughly it is the combination of scheme, host and port:
> http://foo.example.com and http://bar.example.com are different origins,
> and so are http://foo.example.com and https://foo.example.com. This
> is all being standardised currently.)
> 
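As a rough illustration of the same-origin check, here is a minimal sketch that assumes an origin is the scheme, host and port of a URL taken together. The helper name sameOrigin is made up for this example, and the URL class it relies on is built into current browsers and Node.js.

```javascript
// Two URLs share an origin only if scheme, host and port all match.
// sameOrigin is a hypothetical helper, not part of any browser API.
function sameOrigin(a, b) {
  return new URL(a).origin === new URL(b).origin;
}

console.log(sameOrigin("http://bank.example/a.html", "http://bank.example/b.html")); // true
console.log(sameOrigin("http://bank.example/", "https://bank.example/"));            // false
console.log(sameOrigin("http://bank.example/", "http://bank.example:8080/"));        // false
console.log(sameOrigin("http://evil.example/badpage.html",
                       "http://bank.example/account-statement.html"));              // false
```
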
> Douglas Crockford "discovered" JSON. He maintains that he didn't invent
> it, just realised that Javascript contained a useful bit of syntax that
> could be standardised on. JSON is a restricted subset of Javascript's
> notation for objects and arrays. (JSON = Javascript Object Notation.)
> JSON is a data format that allows strings, numbers, booleans, arrays and
> associative arrays to be represented. In many ways it can be considered
> a competitor to XML. It's also pretty similar to YAML (though the
> oft-quoted statement that it's a subset of YAML is an urban myth).
> Here's how a person might be represented in JSON:
> 
> 	{
> 		"name"     : "Toby Inkster" ,
> 		"homepage" : "http://tobyinkster.co.uk/" ,
> 		"mbox"     : "mailto:mail@tobyinkster.co.uk"
> 	}
> 
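To show what treating this purely as data looks like, here is a small sketch using the built-in JSON.parse. The isValidJson helper is invented for the example; the point is only that a JSON parser accepts data and refuses anything that is code.

```javascript
const text = '{ "name": "Toby Inkster", ' +
             '"homepage": "http://tobyinkster.co.uk/", ' +
             '"mbox": "mailto:mail@tobyinkster.co.uk" }';

// Parsing yields an ordinary object; nothing in the text is executed.
const person = JSON.parse(text);
console.log(person.name); // Toby Inkster

// Hypothetical helper: true if the text is well-formed JSON.
function isValidJson(s) {
  try {
    JSON.parse(s);
    return true;
  } catch (e) {
    return false;
  }
}

console.log(isValidJson(text));          // true
console.log(isValidJson('alert("hi")')); // false: code, not data
```
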
> Getting back to XHR, often people want to be able to request data from
> other origins, circumventing the same-origin policy enforced by
> browsers. With a little bit of extra syntax, JSON can be useful for
> this. This extra syntax is called JSONP. (JSONP = JSON with Padding.)
> 
> The way that JSONP works is that instead of supplying a JSON response,
> the server responding with the data sends a Javascript response, like
> this (usually the name of the callback function is configurable via a
> query string parameter):
> 
> 	callback_function({
> 		"name"     : "Toby Inkster" ,
> 		"homepage" : "http://tobyinkster.co.uk/" ,
> 		"mbox"     : "mailto:mail@tobyinkster.co.uk"
> 	});
> 
> How does this circumvent XHR's same-origin policy? Answer: it doesn't.
> But it eliminates the need to use XHR at all. The page requesting the
> data doesn't need to perform an XHR request, it just defines a function
> called callback_function to deal with the data, then it loads the JSONP
> file using a standard HTML <script src> element. The browser downloads
> and executes the script, and calls the function with the data as a
> parameter.
> 
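The requesting side of that exchange might look roughly like the sketch below. The loadJsonp helper, the data.example URL and the "callback" query parameter name are all assumptions made for illustration; real services choose their own parameter name.

```javascript
// Hypothetical helper: requests JSONP by appending a <script> element.
// The "callback" parameter name is an assumption; services differ.
function loadJsonp(doc, url, callbackName) {
  const script = doc.createElement("script");
  const separator = url.indexOf("?") === -1 ? "?" : "&";
  script.src = url + separator + "callback=" + encodeURIComponent(callbackName);
  doc.head.appendChild(script);
  return script;
}

// In a browser: define the callback globally, then request the data.
// The guard keeps the sketch harmless when run outside a browser.
if (typeof document !== "undefined") {
  window.callback_function = function (person) {
    console.log(person.name + " <" + person.mbox + ">");
  };
  loadJsonp(document, "http://data.example/person", "callback_function");
}
```

Note that no XHR is involved anywhere: the browser fetches and runs the response simply because it is the src of a script element.
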
> However, this opens up a big security hole. Suppose that the server
> supplying a JSONP response is compromised, or its owner just decides to
> turn to the dark side. The server can send arbitrary Javascript (i.e.
> not JSONP) and the browser will execute it unquestioningly. This
> Javascript could be used to steal cookies, passwords and other
> privileged information from the page it was included in. Not nice.
> 
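The risk can be simulated outside a browser. In the sketch below, runScript is a made-up stand-in for the browser executing a script response with access to the including page, and the page object with its cookie value is invented for the example.

```javascript
// Simulation only: runScript stands in for the browser executing a
// <script src> response. All names here are hypothetical.
function runScript(responseText, page) {
  new Function("callback_function", "page", responseText)(
    page.callback_function, page);
}

const page = {
  cookie: "session=abc123",
  received: null,
  leaked: null,
  callback_function: function (data) { page.received = data; }
};

// What the page expects: a single call to its callback.
runScript('callback_function({ "name": "Toby Inkster" });', page);

// What a compromised server could send instead: any Javascript at all.
runScript('page.leaked = page.cookie;', page);

console.log(page.received.name); // Toby Inkster
console.log(page.leaked);        // session=abc123
```

Both responses are executed the same way; the page has no chance to inspect the second one before it runs.
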
> CORS is another way around the same-origin policy, but this time it's
> not a hack. It's a set of HTTP headers that a URL can respond with to
> indicate that it's safe to be retrieved in cross-origin requests. So if,
> say, http://bank.example/homepage.html contains no private data and is
> perfectly safe for other sites to have access to, then it could set a
> CORS header to allow http://evil.example/badpage.html to try its worst.
> http://bank.example/account-statement.html wouldn't send the CORS header
> so would be protected by the default same-origin policy.
> 
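On the server side, the decision described above could be sketched as follows. Access-Control-Allow-Origin is the response header CORS defines for this purpose; the corsHeadersFor helper and the list of public paths are assumptions for illustration.

```javascript
// Hypothetical server-side helper for bank.example: choose the extra
// response headers for a path. The public path list is an assumption.
const PUBLIC_PATHS = ["/homepage.html"];

function corsHeadersFor(path) {
  if (PUBLIC_PATHS.indexOf(path) !== -1) {
    // "*" lets scripts from any origin read the response.
    return { "Access-Control-Allow-Origin": "*" };
  }
  // No CORS header: browsers keep applying the same-origin policy.
  return {};
}

console.log(corsHeadersFor("/homepage.html"));
console.log(corsHeadersFor("/account-statement.html"));
```
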
> So how does this apply to RDFa vocabularies/profiles?
> 
> If vocabularies are hosted on a separate server to the pages making use
> of them, then Javascript implementations of RDFa would need to make a
> cross-origin request to read them. (Actually they could make a
> same-origin request to a proxying script, but that's not an especially
> elegant solution.)
> 
> If we want such Javascript implementations of RDFa to be possible, this
> leaves two solutions:
> 
> 	1. Serve up the vocabulary document as JSONP; or
> 	2. Serve it up as something else plus CORS headers.
> 
> #1 is problematic because as I said, JSONP is not nice, safe JSON,
> despite the similar names. JSONP is Javascript.
> 
> #2 is problematic because CORS is a very new feature. Many of the newest
> browsers support it (including IE8), but if you want your script to work
> in downlevel browsers, this is not your solution.
> 
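A script that wants to take the CORS route can at least detect whether the browser offers it. The sketch below uses the common check for a withCredentials property on XMLHttpRequest, and also looks for XDomainRequest, the separate object through which IE8 exposes cross-origin requests. The function name and its return values are made up for this example.

```javascript
// Returns "xhr" if XMLHttpRequest can make cross-origin requests, "xdr"
// if only IE8's XDomainRequest is available, or null for downlevel
// browsers. crossOriginTransport is a hypothetical helper.
function crossOriginTransport(env) {
  if (env.XMLHttpRequest && "withCredentials" in new env.XMLHttpRequest()) {
    return "xhr";
  }
  if (typeof env.XDomainRequest !== "undefined") {
    return "xdr";
  }
  return null;
}

// In a browser this would be called as crossOriginTransport(window).
```

A null result is the case where a script has to fall back to something other than a direct cross-origin request.
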
> In my next message I'll outline how my RDFa vocab proposal (which is
> slightly different to Manu's) makes this a moot point by saying that
> retrieval of the vocab document is optional - a SHOULD requirement
> rather than a MUST - and provides a fallback in the case, e.g. of
> browsers which don't implement CORS.
> 

-- 

Ivan Herman, W3C Semantic Web Activity Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
PGP Key: http://www.ivan-herman.net/pgpkey.html
FOAF   : http://www.ivan-herman.net/foaf.rdf
vCard  : http://www.ivan-herman.net/HermanIvan.vcf



Received on Friday, 5 March 2010 08:19:52 GMT
