- From: Muntazir Mehdi <muntazir.75@gmail.com>
- Date: Wed, 18 Jun 2014 15:17:51 +0200
- To: Bonnie MacKellar <mackellb@stjohns.edu>
- Cc: Kingsley Idehen <kidehen@openlinksw.com>, "public-lod@w3.org" <public-lod@w3.org>
- Message-ID: <CANh6hxpC6rt83w3yqSfVvAFrD_T2hUgjDh8nDc83-wEcCaOyDw@mail.gmail.com>
Hi,

We at INSIGHT, Ireland are working on the discovery of relevant datasets, specifically for the Life Sciences use case. The primary algorithm proposed for this purpose, which extracts keywords from local datasets so that single atomic lookups (no keyword search or any proprietary tool) can be performed on LOD datasets, is already published. Another algorithm, which uses the extracted keywords to query the LOD datasets, was also accepted recently (not in proceedings yet). In our case, we use the datasets listed in Bio2RDF Release 2 & 3.

While working on extending the algorithm, I observed that the data in Bio2RDF has a high overlap (among datasets) and that some data is also illegible.

Bonnie, a recent initiative about Bio2RDF Release 3 can be useful for you. In my opinion, most of the datasets listed there have live SPARQL endpoints, and a clean set of stats is also available. If you are interested in RDF dumps, you can find them there as well.

Bottom line, IMO, the current Life Sciences LOD is clearly not in very good shape; especially considering the nature of the domain, the reliability factor is very low. However, many fellows are working hard to improve it.

Cheers,
Mehdi

On Wednesday, June 18, 2014, Bonnie MacKellar <mackellb@stjohns.edu> wrote:

> Hi,
> No I don't know this one. Is there any more information? What is the
> purpose of this repository? What datasets are cached? I tried clicking on
> the datasets link in the About tab, but get an error message " Resource
> /void/Dataset not found.".
>
> So this adds to my confusion. What are the differences between Bio2RDF,
> Linked Life Data, and OpenLink? Obviously, included datasets, but I have
> compared Bio2RDF and Linked Life Data on this dimension (and will soon, if
> I can get a list from OpenLink), but there is a lot of overlap. Other
> people must be also facing this choice, no? Are all of these sites stable?
> Up to date? How well do they work with tools like Silk?
> What if I eventually want to use a crawler like LSSpider instead of dumps,
> so that my results stay up to date? I would assume that these are all
> questions that application developers who want to use Linked Open Data
> would be asking.
>
> Thanks,
> Bonnie MacKellar
> mackellb@stjohns.edu
>
> -----Original Message-----
> From: Kingsley Idehen [mailto:kidehen@openlinksw.com]
> Sent: Wednesday, June 18, 2014 7:52 AM
> To: public-lod@w3.org
> Subject: Re: Bio2RDF vs Linked Life Data
>
> On 6/17/14 5:27 PM, Bonnie MacKellar wrote:
> > Yes, in fact, I have been using a dump from that site for most of my
> > preliminary work. But there is no working SPARQL endpoint, and there is
> > often a big gap between dumps. Plus, there are other datasets I want to
> > use as well. I am trying to understand the benefits of using these
> > platforms that bring everything together.
> >
> > Bonnie MacKellar
> > mackellb@stjohns.edu
>
> Have you looked at our live 50 Billion+ triples based LOD Cloud Cache
> [1] which does include data loaded from these projects, where the data
> is available as an RDF dump. You can start via a simple keyword search.
>
> Links:
>
> [1] http://lod.openlinksw.com -- LOD Cloud Cache
> [2] http://lod.openlinksw.com/c/IJ3UOS4 -- Default results page for
> pattern "Protein"
> [3] http://lod.openlinksw.com/c/GYIJAVW -- Entity Types associated with
> pattern "Protein"
> [4] http://lod.openlinksw.com/c/GYZPJFS -- Entity Relationship Types
> (Relations) in which an Entity associated with the pattern "Protein"
> plays the role of Subject
> [5] http://lod.openlinksw.com/c/F734UKK -- Entity Relationship Types
> (Relations) in which an Entity associated with the pattern "Protein"
> plays the role of Object.
>
> --
>
> Regards,
>
> Kingsley Idehen
> Founder & CEO
> OpenLink Software
> Company Web: http://www.openlinksw.com
> Personal Weblog: http://www.openlinksw.com/blog/~kidehen
> Twitter Profile: https://twitter.com/kidehen
> Google+ Profile: https://plus.google.com/+KingsleyIdehen/about
> LinkedIn Profile: http://www.linkedin.com/in/kidehen

--
*Muntazir Mehdi*
Research Intern | Healthcare and Life Sciences Unit
INSIGHT @ NUIG (Formerly DERI), Ireland
Student | Department of Computer Science
Technical University Kaiserslautern, Germany
https://sites.google.com/site/muntazir75/
Received on Thursday, 19 June 2014 09:31:17 UTC