- From: Benjamin Adrian <benjamin.adrian@dfki.de>
- Date: Fri, 05 Mar 2010 08:54:38 +0100
- To: Toby Inkster <tai@g5n.co.uk>
- CC: RDFa WG <public-rdfa-wg@w3.org>
Hi Toby,

Thanks a lot for writing this summary. It helped me a lot.

Best regards,
Benjamin

Toby Inkster wrote:
> Apologies for cutting out of the telecon a few minutes early. As I left
> I volunteered to write a quick summary of how Javascript/JSON/JSONP/CORS
> relate to each other and the various security issues involved.
>
> Let's start with the basics: Javascript (more properly known as
> ECMAScript these days) is a scripting language with various
> implementations, the best known being the ones that are embedded in most
> modern, graphical browsers. Javascript, as implemented in browsers, runs
> in a sandbox to prevent maliciously crafted web pages from doing any
> damage to the visitor's machine.
>
> A good number of years ago, Microsoft implemented a proprietary
> extension to Javascript which allowed scripts to perform HTTP requests
> and make use of the responses. Mozilla implemented some fairly similar
> functionality, and eventually other browsers followed (using the Mozilla
> syntax). This feature is called XMLHttpRequest or XHR - it's a bit of a
> misnomer as it's not restricted to retrieving XML.
>
> For security reasons, XHR requests are only allowed to be performed to
> URLs on the same domain as the page running the script was loaded from.
> This is a good thing because you don't want
> http://evil.example/badpage.html to be able to perform an XHR to
> <http://bank.example/account-statement.html>, especially given that the
> XHR would be sent with all the applicable cookies in the browser's
> cookie jar.
>
> (Aside: technically it's not cross-domain requests that are disallowed,
> but cross-origin requests. An origin is the combination of scheme, host
> and port, so http://foo.example.com and http://bar.example.com are
> different origins even though they share a parent domain, and
> http://foo.example.com and https://foo.example.com are different origins
> too. This is all being standardised currently.)
>
> Douglas Crockford "discovered" JSON.
> He maintains that he didn't invent
> it, just realised that Javascript contained a useful bit of syntax that
> could be standardised on. JSON is a restricted subset of Javascript's
> notation for objects and arrays. (JSON = Javascript Object Notation.)
> JSON is a data format that allows strings, numbers, booleans, arrays and
> associative arrays to be represented. In many ways it can be considered
> a competitor to XML. It's also pretty similar to YAML (though the
> oft-quoted statement that it's a subset of YAML is an urban myth).
> Here's how a person might be represented in JSON:
>
>   {
>     "name"     : "Toby Inkster" ,
>     "homepage" : "http://tobyinkster.co.uk/" ,
>     "mbox"     : "mailto:mail@tobyinkster.co.uk"
>   }
>
> Getting back to XHR, often people want to be able to request data from
> other origins, circumventing the same-origin policy enforced by
> browsers. With a little bit of extra syntax, JSON can be useful for
> this. This extra syntax is called JSONP. (JSONP = JSON with Padding.)
>
> The way that JSONP works is that instead of supplying a JSON response,
> the server responding with the data sends a Javascript response, like
> this (usually the name of the callback function is configurable via a
> query string parameter):
>
>   callback_function({
>     "name"     : "Toby Inkster" ,
>     "homepage" : "http://tobyinkster.co.uk/" ,
>     "mbox"     : "mailto:mail@tobyinkster.co.uk"
>   });
>
> How does this circumvent XHR's same-origin policy? Answer: it doesn't.
> But it eliminates the need to use XHR at all. The page requesting the
> data doesn't need to perform an XHR request; it just defines a function
> called callback_function to deal with the data, then it loads the JSONP
> file using a standard HTML <script src> element. The browser downloads
> and executes the script, and calls the function with the data as a
> parameter.
>
> However, this opens up a big security hole.
> Suppose that the server
> supplying a JSONP response is compromised, or its owner just decides to
> turn to the dark side. The server can send arbitrary Javascript (i.e.
> not JSONP) and the browser will execute it unquestioningly. This
> Javascript could be used to steal cookies, passwords and other
> privileged information from the page it was included in. Not nice.
>
> CORS is another way around the same-origin policy, but this time it's
> not a hack. It's a set of HTTP headers that a URL can respond with to
> indicate that it's safe to be retrieved in cross-origin requests. So if,
> say, http://bank.example/homepage.html contains no private data and is
> perfectly safe for other sites to have access to, then it could set a
> CORS header to allow http://evil.example/badpage.html to try its worst.
> http://bank.example/account-statement.html wouldn't send the CORS header
> so would be protected by the default same-origin policy.
>
> So how does this apply to RDFa vocabularies/profiles?
>
> If vocabularies are hosted on a separate server to the pages making use
> of them, then Javascript implementations of RDFa would need to make a
> cross-origin request to read them. (Actually they could make a
> same-origin request to a proxying script, but that's not an especially
> elegant solution.)
>
> If we want such Javascript implementations of RDFa to be possible, this
> allows two solutions:
>
> 1. Serve up the vocabulary document as JSONP; or
> 2. Serve it up as something else plus CORS headers.
>
> #1 is problematic because as I said, JSONP is not nice, safe JSON,
> despite the similar names. JSONP is Javascript.
>
> #2 is problematic because CORS is a very new feature. Many of the newest
> browsers support it (including IE8), but if you want your script to work
> in downlevel browsers, this is not your solution.
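
[Illustration] The two options can be sketched roughly as below. This is a minimal sketch under stated assumptions, not anything taken from the proposals under discussion: the helper names (toJsonp, corsResponse, runAsScript), the sample data and the __leaked property are all hypothetical, and runAsScript only imitates, outside a browser, what a <script src> element does with a response.

```javascript
// Hypothetical vocabulary-like data; the shape is illustrative only.
const vocab = { name: "Toby Inkster", homepage: "http://tobyinkster.co.uk/" };

// Option 1: JSONP. The server wraps the JSON in a call to a function whose
// name the client chose (typically passed as a query string parameter).
function toJsonp(callbackName, data) {
  // Accept only a plain identifier so the name cannot smuggle in code.
  if (!/^[A-Za-z_$][A-Za-z0-9_$]*$/.test(callbackName)) {
    throw new Error("invalid callback name");
  }
  return callbackName + "(" + JSON.stringify(data) + ");";
}

// Option 2: CORS. The server sends plain JSON plus a header naming the
// origins allowed to read it ("*" means any origin).
function corsResponse(data, allowedOrigin = "*") {
  return {
    headers: {
      "Content-Type": "application/json",
      "Access-Control-Allow-Origin": allowedOrigin,
    },
    body: JSON.stringify(data),
  };
}

// Stand-in for <script src="...">: the response text is executed as code,
// with the page's callback visible under the agreed name.
function runAsScript(scriptText, callbackName, callback) {
  new Function(callbackName, scriptText)(callback);
}

// Honest JSONP server: the callback simply receives the data.
let received = null;
runAsScript(toJsonp("callback_function", vocab), "callback_function",
  (data) => { received = data; });

// Compromised JSONP server: the response is arbitrary code, and it runs.
const evilResponse =
  'globalThis.__leaked = "session cookie"; callback_function({});';
runAsScript(evilResponse, "callback_function", () => {});

// CORS client: the body is only ever parsed as data, never executed.
const parsed = JSON.parse(corsResponse(vocab).body);
```

The difference in risk shows up in the last two steps: the JSONP consumer has to execute whatever the server sends, whereas the CORS consumer hands the body to JSON.parse, which rejects anything that is not JSON.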
>
> In my next message I'll outline how my RDFa vocab proposal (which is
> slightly different to Manu's) makes this a moot point by saying that
> retrieval of the vocab document is optional - a SHOULD requirement
> rather than a MUST - and provides a fallback in the case of, e.g.,
> browsers which don't implement CORS.
>
>

--
__________________________________________
Benjamin Adrian
Email : benjamin.adrian@dfki.de
WWW : http://www.dfki.uni-kl.de/~adrian/
Tel.: +49631 20575 145
__________________________________________
Deutsches Forschungszentrum für Künstliche Intelligenz GmbH
Registered office: Trippstadter Straße 122, D-67663 Kaiserslautern
Management: Prof. Dr. Dr. h.c. mult. Wolfgang Wahlster (Chairman), Dr. Walter Olthoff
Chairman of the Supervisory Board: Prof. Dr. h.c. Hans A. Aukes
Amtsgericht Kaiserslautern (district court), HRB 2313
__________________________________________
Received on Friday, 5 March 2010 07:55:32 UTC