Getting Freebase onto the Semantic Web

Hi John,

John Giannandrea wrote:
> Chris Bizer wrote:
>> See: 
>> http://blog.freebase.com/2008/03/28/full-data-dumps-are-now-available/
>> I think that it would really be exciting to turn these dumps into 
>> RDF,
>> publish them on the Web as Linked Data and interlink them with data 
>> sets
>> from the LOD cloud. For instance, interlinking them with DBpedia 
>> should be
>> very easy as both datasets contain Wikipedia article identifiers.
>
> We would be happy to help support this effort to make our data more 
> LOD friendly.

This would be great.

Getting the data out on the Semantic Web as Linked Data also doesn't 
have to be a big effort, as you already have everything that is 
needed in place.

> One reason we did not yet emit simple RDF ourselves was potential 
> confusion about mapping specific freebase properties to the larger 
> range of possible ontologies.  It would be simple to declare a new 
> set  of URIs for our schema, much harder to pick and choose from the 
> large  array of available ontologies for the range of our data.

I think for the first iteration it is completely OK if you define a 
new set of URIs for your schema. In a second iteration you could 
replace terms from your schema with terms from well-known vocabularies 
like FOAF or SKOS.

From the LOD perspective a lot would already be won if:

1. there were a URI for each topic in Freebase, and dereferencing 
this URI over the Web returned an RDF description of the concept 
using a Freebase-specific schema.
2. this URI were interlinked with other data sources in the LOD 
cloud, so that people could use Semantic Web browsers to navigate 
from these data sources into the Freebase data and Semantic 
Web crawlers could find and index the data.
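
As a rough illustration of point 1, here is a minimal Python sketch of 
what a wrapper might return when a topic URI is dereferenced. The 
resource prefix follows the example URI given further down in this 
mail; the schema namespace, the property names and the values are pure 
assumptions for illustration, not an existing Freebase vocabulary.

```python
# Hypothetical namespaces: the resource prefix follows the example URI in
# this mail; the schema namespace is an assumption made for illustration.
RESOURCE_NS = "http://www.freebase.com/rdf/resource/"
SCHEMA_NS = "http://www.freebase.com/rdf/schema/"


def escape_literal(value: str) -> str:
    """Escape a string for use inside an N-Triples literal."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))


def describe_topic(guid: str, properties: dict) -> str:
    """Return an N-Triples description of one topic.

    Each key in `properties` becomes a predicate in the (assumed)
    Freebase-specific schema namespace; each value becomes a plain
    string literal.
    """
    subject = f"<{RESOURCE_NS}{guid}>"
    lines = []
    for name, value in sorted(properties.items()):
        predicate = f"<{SCHEMA_NS}{name}>"
        lines.append(f'{subject} {predicate} "{escape_literal(value)}" .')
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Illustrative property names and values only.
    print(describe_topic(
        "9202a8c04000641f800000000016a1a7",
        {"name": "2046", "type": "film"},
    ), end="")
```

N-Triples is used here only because it can be produced without any 
library; a real wrapper would more likely serve RDF/XML as well.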

So, a minimal effort approach to getting Freebase onto the Semantic 
Web could look like this:

1. Define URIs for all your concepts, something like 
http://www.freebase.com/rdf/resource/9202a8c04000641f800000000016a1a7

2. Deploy a Linked Data wrapper around your API that returns an RDF 
description of (in the example above) the film when somebody 
dereferences the URI above. A very easy way to implement such a 
wrapper would be to just tweak the PHP script that we are using for 
the RDF Book Mashup. The script is available at 
http://www4.wiwiss.fu-berlin.de/bizer/bookmashup/index.html

3. Interlink this RDF version of Freebase with other data sources. The 
simplest option here would be to interlink Freebase with DBpedia, as 
both datasets contain Wikipedia article IDs. To do this, you would add 
an RDF link to the RDF you return when one of your URIs gets 
dereferenced, stating that a specific concept in Freebase is the same 
as a concept in DBpedia. For instance:

http://www.freebase.com/rdf/resource/9202a8c04000641f800000000016a1a7 
owl:sameAs http://dbpedia.org/resource/2046_(film)

4. You would send us an RDF file containing these RDF links for all 
Freebase concepts and we would load it into DBpedia and also serve 
these links.
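
Steps 3 and 4 could be sketched roughly as follows in Python. The 
sketch assumes that you can export (Freebase ID, Wikipedia article 
title) pairs from the dump and that the DBpedia URI is the article 
title with spaces replaced by underscores; the percent-encoding shown 
is a simplification and the function names are made up for this 
example.

```python
from urllib.parse import quote

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
FREEBASE_NS = "http://www.freebase.com/rdf/resource/"
DBPEDIA_NS = "http://dbpedia.org/resource/"


def dbpedia_uri(wikipedia_title: str) -> str:
    """Build a DBpedia resource URI from a Wikipedia article title.

    Assumption: spaces become underscores and parentheses stay as they
    are; all other reserved characters are percent-encoded.
    """
    local_name = wikipedia_title.strip().replace(" ", "_")
    return DBPEDIA_NS + quote(local_name, safe="()")


def same_as_links(topics):
    """Yield one N-Triples owl:sameAs statement per (guid, title) pair."""
    for guid, title in topics:
        yield (f"<{FREEBASE_NS}{guid}> <{OWL_SAME_AS}> "
               f"<{dbpedia_uri(title)}> .")


if __name__ == "__main__":
    pairs = [("9202a8c04000641f800000000016a1a7", "2046 (film)")]
    for line in same_as_links(pairs):
        print(line)
```

Writing the yielded lines to a file would give the link set mentioned 
in step 4, and the same statements could be appended to the RDF 
returned by the wrapper in step 2.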

I think all this could be done within three days' work and would allow 
Linked Data browsers, like the ones listed here 
http://esw.w3.org/topic/TaskForces/CommunityProjects/LinkingOpenData/SemWebClients, 
to access and navigate between both datasets and would allow crawlers, 
like the ones listed here 
http://esw.w3.org/topic/TaskForces/CommunityProjects/LinkingOpenData/SemanticWebSearchEngines, 
to index both datasets.

What do you think?

Technical background information about the whole process is available at 
http://www4.wiwiss.fu-berlin.de/bizer/pub/LinkedDataTutorial/

After this, one could start thinking about also providing RDF dumps, so 
that people could load Freebase and DBpedia together into an RDF store 
and do whatever they want with the data, or about using well-known 
terms from other vocabularies and ontologies.

> We have been experimenting with using freebase itself to help 
> catalog  compatible ontologies for specific freebase properties.
> For example 
> http://www.freebase.com/view/user/jamie/web_ontology/property_mapping
>
> If folks want to help with this, then it should be possible to use 
> our  open API to generate RDF of whatever 'flavor' you happen to be 
> working  with, by specifying a preferred set of ontologies at query 
> time.

Using terms from well-known vocabularies and serving the data using 
different vocabularies are both important, but in my opinion 
something for the second step. First step: Publish Linked Data. See 
what people do with it.
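
For that second step, the mapping idea could look roughly like the 
Python sketch below: a lookup table from Freebase property IDs to terms 
from well-known vocabularies, with a fallback to a Freebase-specific 
URI when no mapping exists. The schema namespace and the two mappings 
are assumptions chosen for illustration, not an agreed alignment.

```python
# Hypothetical Freebase-specific schema namespace (no trailing slash,
# because Freebase property IDs already start with "/").
FREEBASE_SCHEMA_NS = "http://www.freebase.com/rdf/schema"

# Illustrative mapping only; the choice of target terms is an assumption.
PROPERTY_MAPPING = {
    "/type/object/name": "http://xmlns.com/foaf/0.1/name",
    "/common/topic/alias": "http://www.w3.org/2004/02/skos/core#altLabel",
}


def predicate_uri(freebase_property: str, mapping=None) -> str:
    """Return the predicate URI to emit for a Freebase property.

    Uses the preferred vocabulary term if one is mapped, otherwise
    falls back to a URI in the Freebase-specific schema namespace.
    """
    if mapping is None:
        mapping = PROPERTY_MAPPING
    return mapping.get(freebase_property,
                       FREEBASE_SCHEMA_NS + freebase_property)


if __name__ == "__main__":
    print(predicate_uri("/type/object/name"))
    print(predicate_uri("/film/film/directed_by"))
```

Passing a different `mapping` per request would correspond to choosing 
a preferred set of ontologies at query time.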

Cheers

Chris


> -jg
>
>
> 

Received on Monday, 31 March 2008 10:32:36 UTC