SPARQL Federation Over Multiple Data Models (RDB, RDF, CSV, etc.)

Dear All,


I have been working in the healthcare and life sciences (HCLS) domain for
the last 15 years, almost 12 of them at the intersection of Semantic
Web/Linked Data and HCLS.



During these 12 years, one question I repeatedly got from collaborating
partners (hospitals, clinics, pharma) was: Why should we convert all our
existing/legacy data models/sources (RDB, XML) to one single data model
(RDF)?



I never had a clear answer to this, but I managed to explain to them the
various benefits of the famous Semantic Web layer cake. I continued
building tools, ontologies, and applications for various HCLS scenarios.
For example, 5 years ago, we built a SPARQL query federation engine [1]
that federated queries over three clinical locations in the EU. Over the
three years of the project [1], a significant effort was invested in
building RDF stores from raw datasets. Eventually, it became difficult to
maintain and update the raw databases together with the RDF stores.



We recently built a SPARQL engine that federates over multiple data
models, described in "One Size Does Not Fit All: Querying Web Polystores"
[2]. One immediate benefit is that we no longer face the same question:
“Why everything in a single data model?”



Please note, I am not discouraging RDF, but exploring ways in which
native data stores (RDB, RDF, NoSQL, CSV, etc.) can be exploited directly
without replicating them in different formats.



I thought I would share some of my past experiences and let you know
about our recent work [2].



PS: I recently moved from Insight (previously DERI, Galway) to AstraZeneca
in Cambridge (UK).



[1] http://rdcu.be/oXpB

[2] https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8615997



Best,

Ratnesh

www.ratneshsahay.org

Received on Tuesday, 22 January 2019 15:25:19 UTC