Re: Big data applications for general users based on RDF - where are they? from Andrea Splendiani on 2013-06-22 (public-lod@w3.org from June 2013)

From: Andrea Splendiani <andrea.splendiani@iscb.org>
Date: Sun, 23 Jun 2013 06:02:52 +0900
To: Todd DeLuca <todddeluca@gmail.com>
Cc: doint@oldman.me.uk, "jyoung@oclc.org" <jyoung@oclc.org>, "hg@ecs.soton.ac.uk" <hg@ecs.soton.ac.uk>, "public-lod@w3.org" <public-lod@w3.org>
Message-Id: <6F9408C9-B258-4F9C-AE04-AEBFB45FD519@iscb.org>
Hi,

you are hitting a good point, one that is like the elephant in the room:
> - the vast majority of trivial examples I see use FOAF, but I'm making a bioinformatics app.
> - there are multiple orthology predicates defined in ontologies such as the Homology Ontology and the Sequence Ontology.
> - as far as I recall, the few existing implementations of orthology databases made up their own predicates, which misses the whole point of shared URIs for entities and properties, IMO.
It's much harder than it looks to simply re-use a term. It's fine for simple things we all know and agree about (addresses, geo-locations), but when it comes to complex things like "gene" and the like, there are different conceptualizations behind them, and it's a bit naive to just re-use a term, as there will hardly be enough context available to decide whether it corresponds to the intended meaning.

I guess there is not much to do about it. For complex subjects we will see different dictionaries and a lot of emails until ideas converge.
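
A minimal sketch of this cautious kind of re-use, assuming a local orthology predicate and a shared one. Every URI below is a hypothetical placeholder under example.org, not a term from the Homology Ontology, the Sequence Ontology, or any other real vocabulary, and the helper names are made up for illustration. The idea is that a local predicate is rewritten to a shared one only when the mapping has been explicitly agreed; anything else keeps its local URI until the meanings converge.

```python
# All URIs are hypothetical placeholders (assumptions for illustration only).
LOCAL_ORTHOLOG = "http://example.org/mydb/vocab#orthologOf"
SHARED_ORTHOLOG = "http://example.org/shared/homology#orthologousTo"
LOCAL_GENE_LABEL = "http://example.org/mydb/vocab#geneLabel"

# Only predicates whose meaning was checked against the shared definition go here.
AGREED_MAPPINGS = {LOCAL_ORTHOLOG: SHARED_ORTHOLOG}


def to_ntriple(subject, predicate, obj):
    """Serialize one triple of URIs as an N-Triples line."""
    return f"<{subject}> <{predicate}> <{obj}> ."


def align(triples, mappings):
    """Rewrite mapped predicates; unmapped predicates pass through unchanged."""
    for subject, predicate, obj in triples:
        yield (subject, mappings.get(predicate, predicate), obj)


triples = [
    ("http://example.org/mydb/gene/A", LOCAL_ORTHOLOG, "http://example.org/mydb/gene/B"),
    ("http://example.org/mydb/gene/A", LOCAL_GENE_LABEL, "http://example.org/mydb/label/1"),
]

aligned = list(align(triples, AGREED_MAPPINGS))
for triple in aligned:
    print(to_ntriple(*triple))
```

The first triple comes out with the shared predicate, while the second keeps its local one because no mapping for it has been agreed.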

> On the technical side, another enormous challenge has been the lack of a great open-source database.  So far I've used StarDog, Sesame, Virtuoso (open source), and OWLIM-Lite.  They have a wide range of quirks and differences between them in terms of loading speed, inferencing capabilities, scalability, ease-of-installation and configuration, query performance, APIs, edge cases of standards (like how to treat an empty named graph), etc.  
> 
> Were I to do this project as a relational database, I would `apt-get install postgres`, create some tables, make up my column names, and be done with it, and get on with writing queries and a friendly web UI for my non-technical users, who do not know how to use grep or sed. :-)  I hardly exaggerate when I say it is the difference between one week to implement an RDBMS solution versus 2 months for the RDF/Semantic Web/LoD solution.
> 
> OBVIOUSLY the semantic web is the future of data integration, but currently the cultural and technical costs of implementing a project must be too high for most people.
I think all of the above simply reflects an issue with the critical mass of the industry. It's just that Sem-Web/Linked Data is very visible (given its potential), so the attention it gets is disproportionately greater than the size of the industry that sustains it at present (both in terms of providers and use cases). The focus is also more at the infrastructure level.

(Not that different from bioinformatics... 10 years ago).

best,
Andrea

> 
> Cheers,
> Todd
> 
> On Sat, Jun 22, 2013 at 3:21 PM, Dominic Oldman <doint@oldman.me.uk> wrote:
> I think it well worth copying Jeff's initial response. I would be interested in responses to it.
> 
> "It’s pretty easy to write an XSL stylesheet to convert “records” into RDF/XML, and then write a little M/R job to run the XSL against a big bulk of records to boil it down.
> 
> The intellectual challenge is the semantic mapping of idiomatic data into RDF vocabulary terms.
> 
> Jeff"
> 
> Dominic
> 
> Sent from Yahoo! Mail on Android
> 
> 
> From: Hugh Glaser <hg@ecs.soton.ac.uk>; 
> To: Young,Jeff (OR) <jyoung@oclc.org>; 
> Cc: doint@oldman.me.uk <doint@oldman.me.uk>; public-lod@w3 org <public-lod@w3.org>; 
> Subject: Re: Big data applications for general users based on RDF - where are they? 
> Sent: Sat, Jun 22, 2013 6:04:57 PM 
> 
> Ah, now yer rocking!
> But you didn't mention sed (and vi) :-)
> 
> On 22 Jun 2013, at 18:57, "Young,Jeff (OR)" <jyoung@oclc.org>
> wrote:
> 
> > Hugh,
> > 
> > Sorry, you're right. I overlooked the "non-technical uses" phrase in Dominic's message.
> > 
> > Let me spin it a little differently, then. If you're a techie, you can use these tools to create N-Triple data-dumps that non-techies can download and use with Unix-style commands like grep and sort and wc.
> > 
> > Jeff
> > 
> >> -----Original Message-----
> >> From: Hugh Glaser [mailto:hg@ecs.soton.ac.uk]
> >> Sent: Saturday, June 22, 2013 1:53 PM
> >> To: Young,Jeff (OR)
> >> Cc: doint@oldman.me.uk; public-lod@w3 org
> >> Subject: Re: Big data applications for general users based on RDF -
> >> where are they?
> >> 
> >> Hi Jeff,
> >> I assume you aren't suggesting that such tools are suitable for "non-
> >> technical users", as Dominic asked.
> >> So you must be saying something else?
> >> That it is pretty easy, but people don't do it?
> >> Hugh
> >> 
> >> On 22 Jun 2013, at 17:27, "Young,Jeff (OR)" <jyoung@oclc.org>
> >> wrote:
> >> 
> >>> It's pretty easy to write an XSL stylesheet to convert "records" into
> >> RDF/XML, and then write a little M/R job to run the XSL against a big
> >> bulk of records to boil it down.
> >>> 
> >>> The intellectual challenge is the semantic mapping of idiomatic data
> >> into RDF vocabulary terms.
> >>> 
> >>> Jeff
> >>> 
> >>> From: Dominic Oldman [mailto:doint@oldman.me.uk]
> >>> Sent: Saturday, June 22, 2013 12:16 PM
> >>> To: public-lod@w3 org
> >>> Subject: Big data applications for general users based on RDF - where
> >> are they?
> >>> 
> >>> 
> >>> Why are there so few useful linked data applications for general non
> >> technical users that provide functions that people need to support and
> >> enhance their work and which operate over large amounts of data owned
> >> by different organisations with a high degree of semantic
> >> interoperability and robustness?
> >>> 
> >>> Dominic
> >>> 
> >>> Sent from Yahoo! Mail on Android
> >>> 
> >>> 
> >> 
> > 
> > 
> 
> -- 
> Todd DeLuca
> Scientific Programmer
> Wall Lab, CBMI, Harvard Medical School
> http://todddeluca.com
> http://wall.hms.harvard.edu/
>
Received on Saturday, 22 June 2013 21:03:23 UTC