Re: Javascript Security for Dummies from Mark Birbeck on 2010-03-05 (public-rdfa-wg@w3.org from March 2010)

From: Mark Birbeck <mark.birbeck@webbackplane.com>
Date: Fri, 5 Mar 2010 09:50:09 +0000
To: Ivan Herman <ivan@w3.org>
Cc: Toby Inkster <tai@g5n.co.uk>, RDFa WG <public-rdfa-wg@w3.org>
Message-ID: <640dd5061003050150l4215798flee9d6952a92d445b@mail.gmail.com>

Hi Ivan,

On Fri, Mar 5, 2010 at 8:20 AM, Ivan Herman <ivan@w3.org> wrote:
> Toby,
>
> first of all, thanks.
>
> I do have a technical question, though, for further clarification. You
> say that, if a data is defined through JSONP, what this means is that
> the HTML file that starts up the whole process should have an additional
> <script...> element to get that. How would that apply to an RDFa file?
> After all, what we are discussing here is a Javascript implementation
> that interprets an RDFa content; would that mean that the RDFa _author_
> is supposed to add a <script...> element into his/her file instead of
> using @profile (or @vocab)? And what happens then with XML formats that
> do not have such <script> element in the first place?

This is almost always handled by the JavaScript library, automatically.

To illustrate, let's forget about RDFa for the moment, and look at
a fairly typical use of JSONP -- requesting someone's Tweets. (We
could equally have chosen the Flickr API, YouTube...blah blah...JSONP
is everywhere.)


TWITTER AND JSON

So if I have a JavaScript application, and I want to show your tweets
in that app, all I need to do is use this URL to get your latest 20
tweets:

  <http://api.twitter.com/1/statuses/user_timeline/Ivan_Herman.json>

If you look closely, you'll see that what is returned is an array of data:

  [
    {
      "in_reply_to_user_id":null,
      "created_at":"Tue Mar 02 09:09:05 +0000 2010",
      "source":"<a href=\"http://bit.ly\" rel=\"nofollow\">bit.ly</a>",
      "geo":null,
      .
      .
      .
  ]

As Toby has already explained, retrieving this data via XHR is not
going to help us: the browser's same-origin policy stops a script from
reading a response that comes from another domain, so the array never
reaches our application. Of course, other applications like your
Python parser could process this data by retrieving it whilst running
on a server, but browsers cannot.

But even if we ignore XHR, there's no point in putting this URL into a
script tag, because the object returned has no name -- the JavaScript
will be processed but our application can't reach it.


TWITTER AND JSONP

If you tweak the URL to this:

  <http://api.twitter.com/1/statuses/user_timeline/Ivan_Herman.json?callback=addTweetsToTS>

you'll see that the returned JavaScript now provides the same array to
us but this time as a parameter to a function call:

  addTweetsToTS([
    {
      "in_reply_to_user_id":null,
      "created_at":"Tue Mar 02 09:09:05 +0000 2010",
      "source":"<a href=\"http://bit.ly\" rel=\"nofollow\">bit.ly</a>",
      "geo":null,
      .
      .
      .
  ]);

It's still no good to use with XHR, because of the security
considerations. But it is now much more useful when the URL is placed
in a script tag, because the JavaScript application will get a chance
to do something with the data when it's returned.
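
To see the difference concretely, here is a rough sketch that imitates
what the browser does when it runs each of the two responses as a
script. The body of addTweetsToTS and the shortened sample data are my
own stand-ins, and new Function is used only to mimic a script tag
being executed, not as something you would do in a real application.

```javascript
// Hypothetical callback: collects whatever tweets it is handed.
var received = [];
function addTweetsToTS(tweets) {
  received = received.concat(tweets);
}

// Shortened stand-ins for the two response bodies shown above.
var plainJson = '[{"in_reply_to_user_id": null, "geo": null}]';
var jsonp = 'addTweetsToTS(' + plainJson + ');';

// Imitate a script tag: run the response text as JavaScript statements.
function runAsScript(source) {
  new Function("addTweetsToTS", source)(addTweetsToTS);
}

runAsScript(plainJson);   // the array is evaluated, then thrown away
var afterPlain = received.length;

runAsScript(jsonp);       // the array is handed to our callback
var afterJsonp = received.length;
```

After the plain JSON run nothing has been collected; only the JSONP
run delivers the array to the application.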

To make use of this, we'd insert something like this:

  <script
    src="http://api.twitter.com/1/statuses/user_timeline/Ivan_Herman.json?callback=addTweetsToTS"
    type="text/javascript"
    >/* */</script>

But of course this is hard-coded to Ivan's tweets.

Since browsers will run the script, even if it's added after the
document has loaded, we can just insert the script tag later in the
process. And since we can construct the script tag in any way we like,
we can insert your name as a parameter.

In other words we can write a function that would be invoked like this:

  getTweets( "Ivan_Herman" );

This simple function would then create a script tag with the correct
URL for the right tweets, then insert that tag into the document, and
that would in turn cause the script to run.
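
A minimal sketch of such a function might look like the following.
The helper buildTweetsUrl and the optional second parameter (a
document object, so the function can be tried outside a browser) are
my own assumptions for illustration; the callback name addTweetsToTS
is the one used in the URL above.

```javascript
// Hypothetical helper: build the JSONP URL for one user's timeline.
function buildTweetsUrl(screenName, callbackName) {
  return "http://api.twitter.com/1/statuses/user_timeline/" +
    encodeURIComponent(screenName) +
    ".json?callback=" + encodeURIComponent(callbackName);
}

// Create a script tag for the given user and insert it into the
// document, which causes the browser to fetch and run the script.
// The optional doc parameter is an assumption; it defaults to the
// browser's own document.
function getTweets(screenName, doc) {
  doc = doc || document;
  var script = doc.createElement("script");
  script.type = "text/javascript";
  script.src = buildTweetsUrl(screenName, "addTweetsToTS");
  doc.getElementsByTagName("head")[0].appendChild(script);
  return script;
}
```

In a browser, a call to getTweets with a user name is all that is
needed; when the response arrives the browser runs it, and that in
turn calls addTweetsToTS with the array of tweets.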

We would now have a way to dynamically retrieve anyone's tweets and do
something with them.

An example of this technique, but using RDFa to provide the names for
the tweets to retrieve, is here:

  <http://backplanejs.appspot.com/samples/rdfa/blog-twitter.html>


RDFA, PROFILES AND JSONP

So, we've seen that JSONP is generally used programmatically, rather
than by authors adding the script tags themselves. How does this
affect the RDFa profiles proposal?

The first thing to stress is that in my proposal [1] I suggested that
the URL for a profile doesn't have any file extension -- i.e., it's a
key. For example:

  <head profile="http://rdf.data-vocabulary.org/profile/2010">

However, the browser or the JavaScript RDFa parser can then take this
URL and turn it into this:

  <script
    src="http://rdf.data-vocabulary.org/profile/2010.json?callback=document.meta.addMappings"
    type="text/javascript"
    >/* */</script>

This would result in the browser receiving something like this:

  document.meta.addMappings(
    {
      "Person": "http://rdf.data-vocabulary.org/#Person",
      "name": "http://rdf.data-vocabulary.org/#name",
      .
      .
      .
    }
  );
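
As a rough sketch of both halves of that exchange: the first function
turns a profile key into the JSONP URL, and the object below it stands
in for document.meta on the receiving side. The names
profileToScriptUrl, meta, mappings and resolve are hypothetical and
only for illustration; the proposal itself only names the
document.meta.addMappings callback.

```javascript
// Hypothetical: turn a profile key into the JSONP URL shown above.
function profileToScriptUrl(profileUrl) {
  return profileUrl + ".json?callback=document.meta.addMappings";
}

// Hypothetical stand-in for document.meta: keeps a term-to-URI table.
var meta = {
  mappings: {},

  // Called by the returned script; merges the new terms into the table.
  addMappings: function (newMappings) {
    for (var term in newMappings) {
      if (Object.prototype.hasOwnProperty.call(newMappings, term)) {
        this.mappings[term] = newMappings[term];
      }
    }
  },

  // Look up a term, returning null when no profile has defined it.
  resolve: function (term) {
    return Object.prototype.hasOwnProperty.call(this.mappings, term)
      ? this.mappings[term]
      : null;
  }
};

// What the returned script would effectively do once it runs.
meta.addMappings({
  "Person": "http://rdf.data-vocabulary.org/#Person",
  "name": "http://rdf.data-vocabulary.org/#name"
});
```

A parser could then expand a term such as "Person" by calling
meta.resolve, and treat a null result as an unknown term.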

Regards,

Mark

[1] <http://webbackplane.com/mark-birbeck/blog/2010/02/vocabularies-token-bundles-profiles-rdfa>

--
Mark Birbeck, webBackplane

mark.birbeck@webBackplane.com

http://webBackplane.com/mark-birbeck

webBackplane is a trading name of Backplane Ltd. (company number
05972288, registered office: 2nd Floor, 69/85 Tabernacle Street,
London, EC2A 4RR)
Received on Friday, 5 March 2010 09:50:45 UTC