- From: Ivan Herman <ivan@w3.org>
- Date: Fri, 05 Mar 2010 09:20:03 +0100
- To: Toby Inkster <tai@g5n.co.uk>
- CC: RDFa WG <public-rdfa-wg@w3.org>
- Message-ID: <4B90BEB3.3060808@w3.org>
Toby,

first of all, thanks. I do have a technical question, though, for further clarification.

You say that, if data is served as JSONP, the HTML file that starts up the whole process needs an additional <script...> element to fetch it. How would that apply to an RDFa file? After all, what we are discussing here is a Javascript implementation that interprets RDFa content; would that mean that the RDFa _author_ is supposed to add a <script...> element to his/her file instead of using @profile (or @vocab)? And what happens then with XML formats that do not have such a <script> element in the first place?

Ivan

On 2010-3-4 20:37, Toby Inkster wrote:
> Apologies for cutting out of the telecon a few minutes early. As I left
> I volunteered to write a quick summary of how Javascript/JSON/JSONP/CORS
> relate to each other and the various security issues involved.
>
> Let's start with the basics: Javascript (more properly known as
> ECMAScript these days) is a scripting language with various
> implementations, the best known being the ones that are embedded in most
> modern, graphical browsers. Javascript, as implemented in browsers, runs
> in a sandbox to prevent maliciously crafted web pages from doing any
> damage to the visitor's machine.
>
> A good number of years ago, Microsoft implemented a proprietary
> extension to Javascript which allowed scripts to perform HTTP requests
> and make use of the responses. Mozilla implemented some fairly similar
> functionality, and eventually other browsers followed (using the Mozilla
> syntax). This feature is called XMLHttpRequest or XHR - it's a bit of a
> misnomer as it's not restricted to retrieving XML.
>
> For security reasons, XHR requests are only allowed to be performed to
> URLs on the same domain as the script itself was loaded from.
> This is a good thing because you don't want
> http://evil.example/badpage.html to be able to perform an XHR to
> <http://bank.example/account-statement.html> especially given that the
> XHR would be sent with all the applicable cookies in the browser's
> cookie jar.
>
> (Aside: technically it's not cross-domain requests that are disallowed,
> but cross-origin requests. An origin is the combination of scheme, host
> and port, so foo.example.com and bar.example.com are different origins,
> just as foo.co.uk and bar.co.uk are, and http://foo.example and
> https://foo.example differ too. This is all being standardised
> currently.)
>
> Douglas Crockford "discovered" JSON. He maintains that he didn't invent
> it, just realised that Javascript contained a useful bit of syntax that
> could be standardised on. JSON is a restricted subset of Javascript's
> notation for objects and arrays. (JSON = Javascript Object Notation.)
> JSON is a data format that allows strings, numbers, booleans, arrays and
> associative arrays to be represented. In many ways it can be considered
> a competitor to XML. It's also pretty similar to YAML (though the
> oft-quoted statement that it's a subset of YAML is an urban myth).
> Here's how a person might be represented in JSON:
>
> {
>   "name" : "Toby Inkster" ,
>   "homepage" : "http://tobyinkster.co.uk/" ,
>   "mbox" : "mailto:mail@tobyinkster.co.uk"
> }
>
> Getting back to XHR, often people want to be able to request data from
> other origins, circumventing the same-origin policy enforced by
> browsers. With a little bit of extra syntax, JSON can be useful for
> this. This extra syntax is called JSONP. (JSONP = JSON with Padding.)
>
> The way that JSONP works is that instead of supplying a JSON response,
> the server responding with the data sends a Javascript response, like
> this (usually the name of the callback function is configurable as a
> query string):
>
> callback_function({
>   "name" : "Toby Inkster" ,
>   "homepage" : "http://tobyinkster.co.uk/" ,
>   "mbox" : "mailto:mail@tobyinkster.co.uk"
> });
>
> How does this circumvent XHR's same-origin policy? Answer: it doesn't.
> But it eliminates the need to use XHR at all. The page requesting the
> data doesn't need to perform an XHR request, it just defines a function
> called callback_function to deal with the data, then it loads the JSONP
> file using a standard HTML <script src> element. The browser downloads
> and executes the script, and calls the function with the data as a
> parameter.
>
> However, this opens up a big security hole. Suppose that the server
> supplying a JSONP response is compromised, or its owner just decides to
> turn to the dark side. The server can send arbitrary Javascript (i.e.
> not JSONP) and the browser will execute it unquestioningly. This
> Javascript could be used to steal cookies, passwords and other
> privileged information from the page it was included in. Not nice.
>
> CORS is another way around the same-origin policy, but this time it's
> not a hack. It's a set of HTTP headers that a URL can respond with to
> indicate that it's safe to be retrieved in cross-origin requests. So if,
> say, http://bank.example/homepage.html contains no private data and is
> perfectly safe for other sites to have access to, then it could set a
> CORS header to allow http://evil.example/badpage.html to try its worst.
> http://bank.example/account-statement.html wouldn't send the CORS header
> so would be protected by the default same-origin policy.
>
> So how does this apply to RDFa vocabularies/profiles?
>
> If vocabularies are hosted on a separate server to the pages making use
> of them, then Javascript implementations of RDFa would need to make a
> cross-origin request to read them. (Actually they could make a
> same-origin request to a proxying script, but that's not an especially
> elegant solution.)
>
> If we want such Javascript implementations of RDFa to be possible, this
> allows two solutions:
>
> 1. Serve up the vocabulary document as JSONP; or
> 2. Serve it up as something else plus CORS headers.
>
> #1 is problematic because as I said, JSONP is not nice, safe JSON,
> despite the similar names. JSONP is Javascript.
>
> #2 is problematic because CORS is a very new feature. Many of the newest
> browsers support it (including IE8), but if you want your script to work
> in downlevel browsers, this is not your solution.
>
> In my next message I'll outline how my RDFa vocab proposal (which is
> slightly different to Manu's) makes this a moot point by saying that
> retrieval of the vocab document is optional - a SHOULD requirement
> rather than a MUST - and provides a fallback in the case, e.g. of
> browsers which don't implement CORS.
>

-- 
Ivan Herman, W3C Semantic Web Activity Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
PGP Key: http://www.ivan-herman.net/pgpkey.html
FOAF : http://www.ivan-herman.net/foaf.rdf
vCard : http://www.ivan-herman.net/HermanIvan.vcf
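
For readers who want the mechanisms in the quoted summary made concrete, here is a minimal Javascript sketch. It is not code from either message, and it runs outside a browser (e.g. under Node.js). The helper names sameOrigin, corsAllows, runAsScript and steal are hypothetical. The Function constructor stands in for a <script src> element. The CORS check is a deliberately simplified model of what a browser does for a request without credentials, not the full algorithm.

```javascript
// Simplified same-origin test: scheme, host and port must all match.
function sameOrigin(a, b) {
  return new URL(a).origin === new URL(b).origin;
}

// Simplified model of the CORS check for a request without credentials:
// the response is readable if Access-Control-Allow-Origin is "*" or
// exactly matches the origin of the requesting page.
function corsAllows(pageUrl, responseHeaders) {
  const allowed = responseHeaders["access-control-allow-origin"];
  return allowed === "*" || allowed === new URL(pageUrl).origin;
}

// Stand-in for <script src>: the body is executed as code, not parsed
// as data. `scope` holds the functions the page has defined.
function runAsScript(body, scope) {
  const names = Object.keys(scope);
  const values = names.map((n) => scope[n]);
  new Function(...names, body)(...values);
}

const page = "http://evil.example/badpage.html";
const received = []; // data handed to the callback
const leaked = [];   // anything the "remote" code managed to grab

const scope = {
  callback_function: (data) => received.push(data),
  // Represents anything sensitive reachable from the page, e.g. cookies.
  steal: (what) => leaked.push(what),
};

// A well-behaved JSONP response just calls the callback with data.
runAsScript('callback_function({"name": "Toby Inkster"});', scope);

// A compromised server can send any code; it runs with the page's privileges.
runAsScript('steal("session cookie"); callback_function({});', scope);

console.log(sameOrigin(page, "http://bank.example/account-statement.html")); // false
console.log(corsAllows(page, { "access-control-allow-origin": "*" }));       // true
console.log(corsAllows(page, {}));                                            // false
console.log(received[0].name, leaked);
```

The second runAsScript call is the point of the JSONP warning: the page cannot inspect or validate the response before it executes. With CORS the response arrives as data that the script can parse (for example with JSON.parse) without running it.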
Attachments
- application/pkcs7-signature attachment: S/MIME Cryptographic Signature
Received on Friday, 5 March 2010 08:19:52 UTC