RE: Bio2RDF vs Linked Life Data

Thanks, that is useful! I’ve already taken a look at Bio2RDF Release 3, and it seems promising.

I think keeping the data up to date is another challenge, at least for some of the datasets. And the problem of broken links – when I first started looking at this, I ran into a LOT of broken links.

Bonnie MacKellar
mackellb@stjohns.edu

From: Muntazir Mehdi [mailto:muntazir.75@gmail.com]
Sent: Wednesday, June 18, 2014 9:18 AM
To: Bonnie MacKellar
Cc: Kingsley Idehen; public-lod@w3.org
Subject: Re: Bio2RDF vs Linked Life Data

Hi,

We at INSIGHT, Ireland are working on the discovery of relevant datasets, specifically for the Life Sciences use case. The primary algorithm proposed for this purpose is already published: it extracts keywords (from local datasets), based on which single atomic lookups (no keyword search or any proprietary tool) can be performed on LOD datasets. Another algorithm, which uses the extracted keywords and queries the LOD datasets, was also accepted recently (not in proceedings yet). In our case, we use the datasets listed on Bio2RDF Release 2 & 3.
While working on extending the algorithm, I observed that the data in Bio2RDF has a high overlap (among datasets) and some data is also illegible.
Bonnie, the recent Bio2RDF Release 3 initiative may be useful for you. In my opinion, most of the datasets listed there have live SPARQL endpoints, and a clean set of stats is also available. If you are interested in RDF dumps, you can find them there as well.

Bottom line, IMO: current Life Sciences LOD is clearly not in very good shape. Especially considering the nature of the domain, the reliability factor is very low. However, many fellows are working hard to improve it.

Cheers,
Mehdi

On Wednesday, June 18, 2014, Bonnie MacKellar <mackellb@stjohns.edu> wrote:
Hi,
No, I don't know this one. Is there any more information? What is the purpose of this repository? What datasets are cached? I tried clicking on the datasets link in the About tab, but got an error message: "Resource /void/Dataset not found."

So this adds to my confusion. What are the differences between Bio2RDF, Linked Life Data, and OpenLink? The included datasets, obviously. I have compared Bio2RDF and Linked Life Data on this dimension (and will soon do the same for OpenLink, if I can get a list), but there is a lot of overlap. Other people must also be facing this choice, no? Are all of these sites stable? Up to date? How well do they work with tools like Silk? What if I eventually want to use a crawler like LSSpider instead of dumps, so that my results stay up to date? I would assume that these are all questions that application developers who want to use Linked Open Data would be asking.

Thanks,
Bonnie MacKellar
mackellb@stjohns.edu


-----Original Message-----
From: Kingsley Idehen [mailto:kidehen@openlinksw.com]
Sent: Wednesday, June 18, 2014 7:52 AM
To: public-lod@w3.org
Subject: Re: Bio2RDF vs Linked Life Data

On 6/17/14 5:27 PM, Bonnie MacKellar wrote:
> Yes, in fact, I have been using a dump from that site for most of my preliminary work. But there is no working SPARQL endpoint, and there is often a big gap between dumps. Plus, there are other datasets I want to use as well. I am trying to understand the benefits of using these platforms that bring everything together.
>
> Bonnie MacKellar
> mackellb@stjohns.edu

Have you looked at our live LOD Cloud Cache [1] of 50 billion+ triples?
It includes data loaded from these projects, wherever the data is
available as an RDF dump. You can start via a simple keyword search.

Links:

[1] http://lod.openlinksw.com -- LOD Cloud Cache
[2] http://lod.openlinksw.com/c/IJ3UOS4 -- Default results page for
pattern "Protein"
[3] http://lod.openlinksw.com/c/GYIJAVW -- Entity Types associated with
pattern "Protein"
[4] http://lod.openlinksw.com/c/GYZPJFS -- Entity Relationship Types
(Relations) in which an Entity associated with the pattern "Protein"
plays the role of Subject
[5] http://lod.openlinksw.com/c/F734UKK -- Entity Relationship Types
(Relations) in which an Entity associated with the pattern "Protein"
plays the role of Object.

--

Regards,

Kingsley Idehen
Founder & CEO
OpenLink Software
Company Web: http://www.openlinksw.com

Personal Weblog: http://www.openlinksw.com/blog/~kidehen

Twitter Profile: https://twitter.com/kidehen

Google+ Profile: https://plus.google.com/+KingsleyIdehen/about

LinkedIn Profile: http://www.linkedin.com/in/kidehen







--

Muntazir Mehdi
Research Intern | Healthcare and Life Sciences Unit
INSIGHT @ NUIG (Formerly DERI), Ireland
Student | Department of Computer Science
Technical University Kaiserslautern, Germany
https://sites.google.com/site/muntazir75/

Received on Wednesday, 18 June 2014 15:00:29 UTC