Re: Dealing with distributed nature of Linked Data and SPARQL from Gray, Alasdair J G on 2016-06-08 (public-lod@w3.org from June 2016)

From: Gray, Alasdair J G <A.J.G.Gray@hw.ac.uk>
Date: Wed, 8 Jun 2016 12:29:11 +0000
To: Martynas Jusevičius <martynas@graphity.org>
CC: public-lod <public-lod@w3.org>, "public-declarative-apps@w3.org" <public-declarative-apps@w3.org>, James Anderson <james@dydra.com>, "Arto Bendiken" <arto@dydra.com>
Message-ID: <71549C33-CFD5-478F-8A11-32730C701CA3@hw.ac.uk>

Option 3 seems sensible, particularly if you keep the two descriptions in separate graphs.

However, shouldn’t you also consider the provenance of the sources and prioritise them by how recently they were updated?
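
A rough sketch of what prioritising by recency could look like once the data is held as quads. Everything here is an assumption for illustration: the `urn:example:` names, the timestamps, and the `prefer_most_recent` helper are made up, and where the "last updated" value would come from (an HTTP header, a metadata triple, etc.) is left open.

```python
from datetime import datetime, timezone

# Hypothetical per-graph "last updated" metadata; the dates are invented.
last_updated = {
    "urn:example:sparql": datetime(2016, 5, 1, tzinfo=timezone.utc),
    "urn:example:linked-data": datetime(2016, 6, 1, tzinfo=timezone.utc),
}

# Illustrative quads: (subject, predicate, object, graph).
quads = [
    ("urn:example:andy", "urn:example:affiliation", "Org A", "urn:example:sparql"),
    ("urn:example:andy", "urn:example:affiliation", "Org B", "urn:example:linked-data"),
    ("urn:example:andy", "urn:example:name", "Andy", "urn:example:sparql"),
]


def prefer_most_recent(quads, last_updated):
    """For each (subject, predicate), keep only the values asserted by the
    most recently updated graph that says anything about that pair."""
    freshest = {}
    for s, p, o, g in quads:
        best = freshest.get((s, p))
        if best is None or last_updated[g] > last_updated[best]:
            freshest[(s, p)] = g
    return [q for q in quads if q[3] == freshest[(q[0], q[1])]]


resolved = prefer_most_recent(quads, last_updated)
for quad in resolved:
    print(quad)
```

Values that only one source provides are kept as they are; the recency rule only decides between sources that both describe the same subject and predicate.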

Alasdair

On 8 Jun 2016, at 13:06, Martynas Jusevičius <martynas@graphity.org> wrote:

Hey all,

we are developing software that consumes data both from Linked Data
and SPARQL endpoints.

Most of the time, these technologies complement each other. We've come
across an issue though, which occurs when an RDF description of the
same resource is available through both of them.

Let's take the resource http://data.semanticweb.org/person/andy-seaborne

as an example. Its RDF description is available in at least 2
locations:
- on a SPARQL endpoint:
http://xmllondon.com/sparql?query=DESCRIBE%20%3Chttp%3A%2F%2Fdata.semanticweb.org%2Fperson%2Fandy-seaborne%3E

- as Linked Data: http://data.semanticweb.org/person/andy-seaborne/rdf

These descriptions could be identical (I haven't checked), but it is
more likely than not that they're out of sync, complementary, or
possibly even contradicting each other, if reasoning is considered.

If a software agent has access to both the SPARQL endpoint and Linked
Data resource, what should it consider as the resource description?
There are at least 3 options:
1. prioritize SPARQL description over Linked Data
2. prioritize Linked Data description over SPARQL
3. merge both descriptions

I am leaning towards #3 as the sensible solution. But then I think the
end-user should be informed which part of the description came from
which source. This would be problematic if the descriptions are
triples only, but should be doable with quads. That leads to another
problem however, that both LD and SPARQL responses are under-specified
in terms of quads.
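
A minimal sketch of option 3 with quads, assuming both descriptions have already been fetched and parsed into (subject, predicate, object) tuples. The choice of graph names (here simply the URL each description was retrieved from) and the sample triples are assumptions for illustration, not the real content of either source.

```python
from collections import defaultdict

RESOURCE = "http://data.semanticweb.org/person/andy-seaborne"
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"
FOAF_BASED_NEAR = "http://xmlns.com/foaf/0.1/based_near"

# Assumed naming scheme: one graph per source, so provenance survives the merge.
SPARQL_GRAPH = "http://xmllondon.com/sparql"
LD_GRAPH = "http://data.semanticweb.org/person/andy-seaborne/rdf"


def to_quads(triples, graph_name):
    """Tag every triple with the name of the graph (source) it came from."""
    return {(s, p, o, graph_name) for (s, p, o) in triples}


def sources_by_triple(quads):
    """Map each triple to the set of graphs that assert it."""
    index = defaultdict(set)
    for s, p, o, g in quads:
        index[(s, p, o)].add(g)
    return dict(index)


# Illustrative data only.
sparql_triples = {
    (RESOURCE, FOAF_NAME, "Andy Seaborne"),
}
ld_triples = {
    (RESOURCE, FOAF_NAME, "Andy Seaborne"),
    (RESOURCE, FOAF_BASED_NEAR, "http://example.org/place/bristol"),
}

# Set union keeps one quad per (triple, source) pair.
merged = to_quads(sparql_triples, SPARQL_GRAPH) | to_quads(ld_triples, LD_GRAPH)
provenance = sources_by_triple(merged)

for triple, graphs in sorted(provenance.items()):
    print(triple[1], triple[2], "<-", sorted(graphs))
```

With this shape, a triple present in both sources shows up under two graph names, so a UI can tell the end-user whether a statement came from the SPARQL endpoint, the Linked Data document, or both.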

What do you think? Maybe this is a well-known issue, in which case
please enlighten me with some articles :)

Martynas
atomgraph.com
@atomgraphhq

Alasdair J G Gray
Fellow of the Higher Education Academy
Assistant Professor in Computer Science,
School of Mathematical and Computer Sciences
(Athena SWAN Bronze Award)
Heriot-Watt University, Edinburgh UK.

Email: A.J.G.Gray@hw.ac.uk
Web: http://www.macs.hw.ac.uk/~ajg33

ORCID: http://orcid.org/0000-0002-5711-4872

Office: Earl Mountbatten Building 1.39
Twitter: @gray_alasdair

Received on Wednesday, 8 June 2016 12:29:48 UTC