Re: RDFa + RDF/XML Considered Harmful? (was RE: Ordnance Survey data as Linked Data)

Hi folks

(sorry for not chiming in on LOD list before btw; I tried to join 
sometime back but it got wedged, but I think that was the old MIT-hosted 
list anyways)

Richard Cyganiak wrote:

> I don't think this is a problem. For provenance purposes, whatever works 
> for RDF/XML documents will also work for HTML+RDFa documents. Just think 
> of RDFa as a very verbose RDF syntax that contains a lot of “comments” 
> (the non-RDF, pure-HTML parts of the document). In the end, an RDF agent 
> just sees triples, no matter if they are parsed out of an HTML+RDFa 
> document or an RDF/XML document.

It's a very reasonable question.

On my homepage I have some small pieces of experimental RDFa, in 
addition to my FOAF files. If you ever see a foaf:name for me that is 
"Dan A. Brickley" you've probably found something whose provenance 
traces to that homepage markup.

<div typeof="foaf:Person" property="foaf:name" content="Dan A. 
Brickley"> ...etc

We had an interesting chat in the #laconica IRC channel last week (I'll 
wikify the notes soon I hope; not publicly logged yet) with Brad 
Fitzpatrick, about how Laconica and the Google Social Graph API might 
relate to each other. The key idea here is discovering which profile 
(etc.) pages have rel=me (or equivalent FOAF) "claims" of which other pages. When 
Google SGAPI

Brad Fitzpatrick: "just because a person tells friendfeed that they're 
Bob on Twitter and Bob2 on Jaiku, that doesn't make it true."

But if the Twitter account 'bob' has rel=me pointing to the Jaiku 
account, and jaiku's bob2 page has the same pointing back, that suggests 
they're at least telling a mutually consistent story.
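
A minimal sketch of that reciprocity check, in Python. The profile URLs and the REL_ME table are made up for illustration (example.org stand-ins, not real Twitter, Jaiku or Friendfeed data), and the function names are hypothetical; this is not how SGAPI itself is implemented.

```python
# Hypothetical crawl result: each profile page maps to the set of pages
# it points at with rel=me. All URLs here are illustrative stand-ins.
REL_ME = {
    "http://twitter.example.org/bob": {"http://jaiku.example.org/bob2"},
    "http://jaiku.example.org/bob2": {"http://twitter.example.org/bob"},
    # A third party asserting both accounts; neither account points back.
    "http://friendfeed.example.org/bob": {
        "http://twitter.example.org/bob",
        "http://jaiku.example.org/bob2",
    },
}


def is_reciprocated(claims, a, b):
    """True when page a claims page b and page b claims page a."""
    return b in claims.get(a, set()) and a in claims.get(b, set())


def reciprocated_pairs(claims):
    """Return every unordered pair of distinct pages that claim each other."""
    pairs = set()
    for source, targets in claims.items():
        for target in targets:
            if source != target and is_reciprocated(claims, source, target):
                pairs.add(frozenset((source, target)))
    return pairs
```

Here the two account pages form a reciprocated pair, while the one-way claims from the third page do not count on their own. A mutual link only shows a consistent story, not a true one.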



How does all this relate to the original question? :)

Humm, well the way I see it, we'll want SPARQL / named graph conventions 
for taking all the chunks of RDF data that are reasonably ascribed to 
me, and making them queryable as a unit. I can see two styles here: put 
them all in a single named graph (maybe with a tag: or uuid: URI to 
avoid confusion), or else put claims in another graph (maybe default 
graph, or a 'table of contents' graph).

Sure there are contexts in which it is important to know that a certain 
triple came from the RDFa in my homepage, versus in my RSS feed, versus 
in my foaf.rdf or one extracted from Flickr or loaded off my Friendfeed, 
MyBlogLog, or Pownce or Ecademy accounts.

But it would be even more powerful if we could also allow them to be 
grouped together, so people could ask the SPARQL store, "what values are 
there for foaf:name of the person whose :openid is <http://danbri.org/> ?".
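
To make the grouping idea concrete, here is a rough Python sketch using plain dictionaries in place of a real SPARQL store. Everything in it is an assumption for illustration: the graph URIs, the tag: group URI, the "Dan Brickley" value in the second graph, the reading of :openid as foaf:openid, and the names_for_openid helper. It follows the second style (keep one named graph per source, plus a 'table of contents' that groups them).

```python
FOAF = "http://xmlns.com/foaf/0.1/"

# Toy "store": named graph URI -> set of (subject, predicate, object) triples.
# The graph URIs and most of the values are illustrative assumptions.
STORE = {
    "http://danbri.org/": {
        ("_:a", FOAF + "openid", "http://danbri.org/"),
        ("_:a", FOAF + "name", "Dan A. Brickley"),
    },
    "http://danbri.org/foaf.rdf": {
        ("_:b", FOAF + "openid", "http://danbri.org/"),
        ("_:b", FOAF + "name", "Dan Brickley"),
    },
    "http://example.org/other.rdf": {
        ("_:c", FOAF + "openid", "http://example.org/alice"),
        ("_:c", FOAF + "name", "Alice Example"),
    },
}

# 'Table of contents': a hypothetical tag: URI naming the group of graphs
# that are reasonably ascribed to one publisher.
GROUPS = {
    "tag:danbri.org,2008:me": [
        "http://danbri.org/",
        "http://danbri.org/foaf.rdf",
    ],
}


def names_for_openid(store, graph_uris, openid):
    """Map each foaf:name found for the given openid to its source graphs.

    Subjects are matched per graph, since blank node labels are only
    meaningful inside the graph they came from.
    """
    found = {}
    for graph_uri in graph_uris:
        triples = store.get(graph_uri, set())
        subjects = {
            s for (s, p, o) in triples
            if p == FOAF + "openid" and o == openid
        }
        for (s, p, o) in triples:
            if p == FOAF + "name" and s in subjects:
                found.setdefault(o, set()).add(graph_uri)
    return found
```

Calling names_for_openid(STORE, GROUPS["tag:danbri.org,2008:me"], "http://danbri.org/") answers the grouped question while still recording which graph each name came from, so the provenance distinction is not lost.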

Now SGAPI has some things that help here. And a similar API could be 
exposed by SINDICE or Garlik/qdos too. You can say, "What are the IDs of 
the 'other mes' on the Web?".

Here's mine:

http://socialgraph.apis.google.com/otherme?pretty=1&q=danbri.org
(yep there are probably bugs). This is sugar around the existing SGAPI 
calls, and is based on the notion of reciprocated claims between profile 
pages.

(btw, some of these URIs are indirected identifiers (i.e. of documents); 
but let's not go off on one of those people-versus-doc URI threads right 
now? :)

So.... conclusions?

Knowing where/when to split data into multiple chunks versus serve from 
a single (possibly aggregated) source is a timeless problem. Different 
people and parties will split their data up in multiple ways; that is 
inevitable. We have in SPARQL and the RDF environment a glimmer of hope: 
data split up into multiple files can still be tracked down as having 
the same real-world source or publisher. And we have a couple of ways we 
can represent this: put them in one named graph in a SPARQL store, or 
put them in multiple named graphs, but use the URIs of those graphs to 
group them.

Plausible?

cheers,

Dan

--
http://danbri.org/

Received on Monday, 14 July 2008 15:23:01 UTC