ISSUE-28: Nathan's RDFa API Questions and Comments (e-mail 1) from Manu Sporny on 2010-08-01 (public-rdfa-wg@w3.org from August 2010)

From: Manu Sporny <msporny@digitalbazaar.com>
Date: Sun, 01 Aug 2010 14:14:23 -0400
To: RDFa Working Group <public-rdfa-wg@w3.org>
Message-ID: <4C55B97F.3030504@digitalbazaar.com>

Hi Nathan,

Apologies for the late reply to your input on the RDFa API document. We
have been very busy with the RDFa Core document and now that we've
approved publication of an RDFa Core and XHTML+RDFa heartbeat document,
we are going to focus on the RDFa API document.

As you may have noticed, I wrapped all of your feedback into an RDFa WG
ISSUE so that we may address all of your concerns:

http://www.w3.org/2010/02/rdfa/track/issues/28

More feedback below:

On 06/08/2010 10:24 PM, Nathan wrote:
> First, to perhaps contribute something (if it hasn't already been
> suggested in the archives).
> 
> I noted under future discussion, the following point:
>   'A mechanism to load and process triples from remote documents.'
> 
> The RDFa API currently provides the following method:
>   parser.parse( document );
> 
> And XHR [1] has the following attribute:
>   xhr.responseXML
> 
> which returns a 'Document'
> 
> So this may already be covered for any mediatype which is text/xml,
> application/xml or ends in +xml.
> 
> Outside of this there is the DOMImplementation.createDocument method,
> but I'm unsure how you could turn the XHR.responseText in to a Document
> (surely there must be a way??)

The mechanism to load and process triples from remote documents is a bit
more involved than that. We could do what you say and depend on XHR, but
we were wondering if we could enable something like this:

parser.parse( url );

That is, could we enable the RDFa DOM API to extract the triples from a
remote document while ensuring that CORS and XSS issues are mitigated?
One way that we could approach this is to perform the request for the
remote URL without sending any cookies or other identifying information.
That is, parser.parse(url) would use clean headers, with no cookies or
credentials, when accessing a cross-site resource. This would apply only
to the browser environment, so non-browser environments that want to
send cookies to cross-site resources would still have full access when
extracting triples.

I think that this would solve any XSS problems while simultaneously
allowing stuff like this:

// discover movie information for "Inception"

parser.parse("http://www.freebase.com/view/m/0661ql3");
parser.parse("http://www.imdb.com/title/tt1375666/");
parser.parse("http://www.rottentomatoes.com/m/inception/");
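
As a rough sketch of the credential-free request idea, and only under
some assumptions: parseRemote, fetchImpl and toDocument are hypothetical
names that are not in the RDFa API draft, and the only draft method used
is the existing parser.parse( document ). In a browser, fetchImpl could
be fetch() and toDocument could wrap DOMParser, which is also one way to
turn response text into a Document. The target server would still have
to allow the cross-origin read via its CORS response headers.

```javascript
// Sketch only: parseRemote, fetchImpl and toDocument are hypothetical
// helpers, not part of the RDFa API draft.
async function parseRemote(parser, url, fetchImpl, toDocument) {
  // Ask for the remote document without cookies, HTTP auth or a Referer.
  const response = await fetchImpl(url, {
    method: "GET",
    credentials: "omit",
    referrerPolicy: "no-referrer",
  });
  if (!response.ok) {
    throw new Error("Request for " + url + " failed: " + response.status);
  }
  const markup = await response.text();
  // Build a Document from the text and reuse parser.parse( document ).
  return parser.parse(toDocument(markup));
}

// Possible browser usage (assumes fetch and DOMParser are available):
//   parseRemote(parser, "http://www.imdb.com/title/tt1375666/", fetch,
//     (text) => new DOMParser().parseFromString(text, "text/html"));
```

Passing the fetch and document-building functions in as arguments keeps
the sketch usable outside the browser, where a caller might choose to
send cookies after all.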

> 1: how do you get all data (triples)?
> ( read this as, please consider adding a DataStore.getAll() method )

// calling filter() with no arguments (equivalent to passing null for
// everything) retrieves all triples
var allTriples = document.data.store.filter();

The language was incorrect and didn't allow the subject to be optional
in the filter() method. I've made the change to allow a zero-argument
filter() method. I also added an example of how one can retrieve all
triples.
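
To illustrate the intended behaviour, here is a minimal in-memory
sketch in which every filter() argument is optional and an omitted or
null argument matches anything. SketchStore and its add() method are
hypothetical and exist only for this example; the (subject, predicate,
object) argument order is an assumption, and the real DataStore
interface is whatever the draft defines.

```javascript
// Hypothetical in-memory store, not the DataStore interface from the draft.
class SketchStore {
  constructor() {
    this.triples = [];
  }

  add(subject, predicate, object) {
    this.triples.push({ subject, predicate, object });
  }

  // Omitted (undefined) arguments default to null; null is a wildcard.
  filter(subject = null, predicate = null, object = null) {
    const matches = (pattern, value) => pattern === null || pattern === value;
    return this.triples.filter(
      (t) =>
        matches(subject, t.subject) &&
        matches(predicate, t.predicate) &&
        matches(object, t.object)
    );
  }
}

const movies = new SketchStore();
movies.add("#inception", "dc:title", "Inception");
movies.add("#inception", "dc:date", "2010");

const everything = movies.filter();            // all triples
const titles = movies.filter(null, "dc:title"); // only dc:title triples
```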

> 2: merging stores?
> given the following example:
> 
> var rdfa = document.data.createParser("rdfa",
> document.data.createStore() );
> rdfa.parse();
> var hcard = document.data.createParser("hCard",
> document.data.createStore() );
> hcard.parse();
> 
> then rdfa.store will hold all the rdfa data, and hcard.store will hold
> all the hcard data. (?) how would one merge all the data from the two
> stores in to a single new one?

Good point. We had previously thought that developers would write their
own function to merge two stores, as there are many ways to do this.
However, we might as well provide a utility function. I've added the
following method to the API:

DataStore.merge( store )
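
One of the many ways a merge could behave is a set union that drops
duplicate triples. The sketch below shows only that one option; it is
not a statement of how DataStore.merge( store ) is specified.
MergeSketchStore, add() and size are hypothetical names for this
example.

```javascript
// Hypothetical store used only to illustrate a union-style merge.
class MergeSketchStore {
  constructor() {
    this.triples = new Map();
  }

  add(subject, predicate, object) {
    // Key on all three terms so identical triples are stored once.
    const key = JSON.stringify([subject, predicate, object]);
    this.triples.set(key, { subject, predicate, object });
  }

  // Copy every triple from the other store into this one.
  merge(store) {
    for (const t of store.triples.values()) {
      this.add(t.subject, t.predicate, t.object);
    }
    return this;
  }

  get size() {
    return this.triples.size;
  }
}

const rdfaData = new MergeSketchStore();
rdfaData.add("#manu", "foaf:name", "Manu Sporny");

const hcardData = new MergeSketchStore();
hcardData.add("#manu", "foaf:name", "Manu Sporny");
hcardData.add("#manu", "foaf:nick", "msporny");

// Merge both into a single new store, leaving the originals untouched.
const combined = new MergeSketchStore().merge(rdfaData).merge(hcardData);
```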

-- manu

-- 
Manu Sporny (skype: msporny, twitter: manusporny)
President/CEO - Digital Bazaar, Inc.
blog: WebApp Security - A jQuery Javascript-native SSL/TLS library
http://blog.digitalbazaar.com/2010/07/20/javascript-tls-1/
http://blog.digitalbazaar.com/2010/07/20/javascript-tls-2/
Received on Sunday, 1 August 2010 18:14:54 UTC