2.4 billion triples of bioinformatics RAW DATA NOW

In his recent TED talk, Tim Berners-Lee invited data providers to publish their data in RDF format to help build the linked data web.  He asked them to offer RAW DATA NOW.



We fully share this approach in the Bio2RDF community.  Our goal is to make public datasets from the bioinformatics domain available in RDF format through standard SPARQL endpoints (we use Virtuoso servers for this).  We strongly believe in the semantic web approach to solving scientific problems, but we do not want to wait for data providers to do the RAW DATA conversion themselves.  Converting data to RDF is not fun; we have done a lot of this dirty work, and here are the results for 34 datasets.
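For readers who have not used a SPARQL endpoint before, here is a minimal sketch of how a query is sent to one over HTTP.  It relies only on the standard SPARQL protocol convention of passing the query text in a "query" parameter.  The endpoint URL and the helper name build_sparql_url are placeholders chosen for illustration; they are not actual Bio2RDF addresses or code.

```python
from urllib.parse import urlencode

# Hypothetical endpoint URL, for illustration only; substitute a real endpoint.
ENDPOINT = "http://example.org/sparql"


def build_sparql_url(endpoint: str, query: str) -> str:
    """Return a GET URL that carries the query in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query})


# Ask for any ten triples, just to look at the shape of the data.
SAMPLE_QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"

url = build_sparql_url(ENDPOINT, SAMPLE_QUERY)
print(url)
```

Fetching the resulting URL with any HTTP client should return the query results, in a format that depends on the endpoint and on the Accept header sent with the request.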

Our current datasets in N3 format are available here:

http://quebec.bio2rdf.org/download/n3/

We invite semantic search engine providers to index these files.

The way we produce them is documented in the Cookbook section of our wiki at SourceForge:

http://bio2rdf.wiki.sourceforge.net/Namespace%27s+update

The current list of SPARQL endpoints in the linked data cloud is hosted here:

http://esw.w3.org/topic/TaskForces/CommunityProjects/LinkingOpenData/DataSets/Statistics

The Bio2RDF graph of 2.4 billion triples of linked data represents 51% of the current size of the global linked data graph.
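As a rough check on what that share implies, the two figures quoted here can be turned into an estimate of the size of the whole linked data graph.  This is only a back-of-the-envelope sketch derived from the numbers in this post, not a value read from the statistics page, and the variable names are chosen for illustration.

```python
# Figures quoted in this post.
bio2rdf_triples = 2.4e9   # Bio2RDF graph size, in triples
bio2rdf_share = 0.51      # Bio2RDF fraction of the global linked data graph

# Implied size of the global graph: part divided by its share of the whole.
implied_global_triples = bio2rdf_triples / bio2rdf_share

print(f"{implied_global_triples / 1e9:.1f} billion triples")  # 4.7 billion triples
```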

Finally, this is what this highly connected world of knowledge looks like.



I would like to take this occasion to thank all the enthusiastic biologists and researchers who invest themselves in annotating articles, proteins, and gene products.  Without this essential work of connecting documents and concepts together, this project would not have been possible.

For the 20th anniversary of the web, I would also like to thank Tim Berners-Lee for his inspiring vision.  Bio2RDF may not be the awaited killer app of the life sciences that demonstrates the potential of the semantic web, but let's say that it is only the beginning.

The WWW2009 workshop Linked Data on the Web (LDOW2009) was held today, and I would like to say how important the work of this community is.  Finally, a last word to congratulate the Virtuoso team, and especially Orri Erling, for their fantastic work on the new Virtuoso 6.0 server, soon to be released.  I cannot wait to see Bio2RDF data in this amazing engine.


--

Posted By  bio2rdf  to  Bio2RDF atlas of post genomic knowledge  at  4/21/2009 11:50:00 PM



Received on Wednesday, 22 April 2009 04:50:27 UTC