RE: Compiling information from several different triplestores

Hi Nicolas,

This sounds like a typical federation-plus-rules problem. A federating query
processor decomposes the query into smaller queries against the specified
graphs, so it effectively queries the union of the graphs without paying
for an actual union. Rules provide any necessary ontology mapping. Assuming
the rules are processed by backward chaining, they are evaluated only as
needed to answer the query, which keeps the cost down.

Here's an example of what I'm talking about using our Semantics.SDK for
.NET[1]: 

	prefix owl: <...>
	prefix ex: <...>
	prefix foaf: <...>

	#sparql extension to support rules
	rulebase (
		construct {?s ?p ?o} from {?x owl:sameAs ?s. ?x ?p ?o}
	)

	select ?f 
	from <http://www.someplace.org/data>
	from <http://www.someotherplace.org/data>
	where { ex:Anthony foaf:knows ?f }
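
To make the backward-chaining idea concrete, here is a rough sketch in plain
Python. It is not the Semantics.SDK API. The two sets are hypothetical
in-memory stand-ins for the remote graphs, and I am assuming "Jack is Tony's
friend" is stored as Tony foaf:knows Jack. The rule is applied once, on
demand, and does not follow chains of owl:sameAs.

```python
# Hypothetical in-memory stand-ins for the two remote graphs (assumed data).
SAME_AS = "owl:sameAs"

GRAPH_1 = {("ex:Tony", "foaf:knows", "ex:Jack")}
GRAPH_2 = {("ex:Tony", "owl:sameAs", "ex:Anthony")}


def match(graphs, s=None, p=None, o=None):
    """Yield triples matching the pattern from every graph; None is a wildcard."""
    for graph in graphs:
        for triple in graph:
            if all(want is None or want == got
                   for want, got in zip((s, p, o), triple)):
                yield triple


def ask(graphs, s, p):
    """Return the objects ?o of (s, p, ?o), applying the sameAs rule on demand."""
    # Facts stated directly about s in any of the graphs.
    results = {o for _, _, o in match(graphs, s, p)}
    # Rule: {?x owl:sameAs ?s . ?x ?p ?o} => {?s ?p ?o}
    # Only the aliases of s are looked up, so nothing is materialized up front.
    for x, _, _ in match(graphs, None, SAME_AS, s):
        results |= {o for _, _, o in match(graphs, x, p)}
    return results


friends = ask([GRAPH_1, GRAPH_2], "ex:Anthony", "foaf:knows")
print(sorted(friends))  # ['ex:Jack']
```

Neither graph is copied or merged. Each pattern is sent to both graphs, and
the rule only fires for the subject the query actually asks about.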


To do this efficiently, the query processor needs statistics for the data
sources used. For remote graphs (e.g. SPARQL endpoints), this means that
either they publish statistics in a reasonable form, or the query processor
has to generate and cache its own from queries against the graph.
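
As a rough sketch of how such statistics could be used, the Python below
picks which sources to send a triple pattern to. The per-predicate counts
are made-up numbers, and `STATS` and `plan` are hypothetical names for
illustration, not part of any real product.

```python
# Hypothetical statistics: estimated triple counts per predicate and source.
# The counts are invented for illustration only.
STATS = {
    "http://www.someplace.org/data": {"foaf:knows": 120_000, "owl:sameAs": 0},
    "http://www.someotherplace.org/data": {"foaf:knows": 0, "owl:sameAs": 3_500},
}


def plan(predicate, stats):
    """Return the sources that may hold the predicate, smallest estimate first.

    A predicate missing from a source's statistics is treated as absent,
    which is only safe if the statistics are complete.
    """
    candidates = [(counts.get(predicate, 0), source)
                  for source, counts in stats.items()]
    return [source for count, source in sorted(candidates) if count > 0]


print(plan("foaf:knows", STATS))
print(plan("owl:sameAs", STATS))
```

With these numbers, the foaf:knows pattern goes only to the first graph and
the owl:sameAs pattern only to the second, so no request is wasted on a
source that cannot contribute.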

Hope that helps,

Geoff

[1] http://www.intellidimension.com/products/semantics-sdk/

-----Original Message-----
From: semantic-web-request@w3.org [mailto:semantic-web-request@w3.org] On
Behalf Of Nicolas Raoul
Sent: Tuesday, May 05, 2009 6:33 AM
To: semantic-web@w3.org
Subject: Compiling information from several different triplestores

Hello all,

How can I run a query over several different triplestores?

For instance, I want to get a list of Anthony's friends.
Triplestore1 says Jack is Tony's friend.
Triplestore2 says Tony sameAs Anthony.
What clever mechanism would understand that Jack is Anthony's friend?
Do I have to copy all information from both triplestores into my own
triplestore, or is there something smarter to do?

Copying all information from external triplestores seems awkward, and
in some cases might prove impossible (frequent updates, size, load on
servers).
Is there an easy solution that I am not aware of?
Can any triplestore implementation be configured to complement its
information with information from external triplestores?

Thank you!
Nicolas Raoul
http://nrw.free.fr

Received on Tuesday, 5 May 2009 15:09:16 UTC