distributed queries again from Phil Dawes on 2004-04-28 (www-rdf-interest@w3.org from April 2004)

From: Phil Dawes <pdawes@users.sourceforge.net>
Date: Wed, 28 Apr 2004 18:07:15 +0100
To: www-rdf-interest@w3.org
Message-ID: <16527.58563.854214.800566@gargle.gargle.HOWL>

Hi All,

We currently have a single Sesame RDF store at work through which we
do all queries. This isn't going to scale to all the data in the
business, and so in my spare time I've been considering how to address
distributed querying of RDF stores.

Having read Patrick's RDFQ spec, I hit upon his notion of adding RDF
assertions to the query to be considered in addition to the knowledge
in the RDF store. This sounds like a workable part of the solution to
me - e.g. query one store with a subsection of the query, then query
another store with the whole query, augmenting the store data with
the results of the first query.
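
To make the two-step idea concrete, here is a rough Python sketch. It is
only an illustration under assumptions of mine: stores are plain
in-memory sets of (subject, predicate, object) tuples, variables are
strings starting with "?", and the helper names (match, substitute,
query, chained_query) and the sample data are made up. None of this is
the RDFQ or Sesame API.

```python
def match(pattern, triples):
    """Yield variable bindings for one triple pattern ('?x' marks a variable)."""
    for triple in triples:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A repeated variable must bind to the same value each time.
                if binding.get(term, value) != value:
                    break
                binding[term] = value
            elif term != value:
                break
        else:
            yield binding


def substitute(pattern, binding):
    """Replace any bound variables in a pattern with their values."""
    return tuple(binding.get(term, term) for term in pattern)


def query(patterns, triples):
    """Naive join of all patterns against one set of triples."""
    results = [{}]
    for pattern in patterns:
        results = [
            {**bound, **new}
            for bound in results
            for new in match(substitute(pattern, bound), triples)
        ]
    return results


def chained_query(patterns, sub_patterns, first_store, second_store):
    # 1. Run a subsection of the query against the first store.
    partial = query(sub_patterns, first_store)
    # 2. Turn those results back into ground assertions.
    extra = {substitute(p, b) for b in partial for p in sub_patterns}
    # 3. Run the whole query against the second store plus the assertions.
    return query(patterns, second_store | extra)


# Hypothetical data: location lives in one store, email in another.
location_store = {(":phil", "z:location", ":southBankHouse")}
directory_store = {(":phil", "z:email", "phil@example.org")}

where = ("?who", "z:location", ":southBankHouse")
mail = ("?who", "z:email", "?mail")
result = chained_query([where, mail], [where], location_store, directory_store)
```

Neither store can answer the whole query on its own here; the second
one only succeeds because the first store's answers travel with the
query as extra assertions.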

So the difficult bit is choosing which stores to query, and deciding
automatically how to chop up the query to run against the stores most
effectively. Ideally I need to be able to generalize about the sorts
of data in the repository and the sorts of queries it can answer.

As most of our RDF data comes automatically from existing relational
databases and directories, there are a lot of statements about
instances of common types. This is because the database is usually
set up to support a process or application, and so the data is
concerned with collating information around a set of connected
concepts - an ontology.

So I was thinking of generalising the statements in the RDF store by
describing a set of type-based templates that the statements
match. Something like:

mystore a rdfstore
 :containsStatementsMatching [subjecttype st;  pred p;  objecttype ot],
                             [subjecttype st2; pred p2; objecttype ot2],
                             ...;
                             [subjecttype stn; pred pn; objecttype otn].
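
One way such templates might be produced automatically is to collapse
each statement to the types of its subject and object. A minimal Python
sketch, assuming a hypothetical `types` lookup (resource -> set of
types) and using None where no type is known, e.g. for literals; the
function name and sample data are invented for illustration:

```python
def derive_templates(triples, types):
    """Collapse triples into (subject type, predicate, object type) templates."""
    templates = set()
    for s, p, o in triples:
        # Resources with no known type (e.g. literals) get the type None.
        for st in types.get(s, {None}):
            for ot in types.get(o, {None}):
                templates.add((st, p, ot))
    return templates


# Hypothetical sample data.
triples = {
    (":phil", "z:location", ":southBankHouse"),
    (":jo", "z:location", ":southBankHouse"),
    (":phil", "z:email", "phil@example.org"),
}
types = {
    ":phil": {"a:person"},
    ":jo": {"a:person"},
    ":southBankHouse": {"a:building"},
}
templates = derive_templates(triples, types)
```

Three statements collapse to two templates here, which is the sort of
compression I'd hope for when the data comes from a relational schema.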


The store chooser then attempts to match the query templates against
the store templates to deduce which store(s) to use. (N.B. it needs to
know the types of the subjects and objects, but this information is
easy to come by - we have an HTTP-GET URIQA-a-like service for this
sort of thing).

e.g. to do:

select ?foo
(?foo, z:location, :southBankHouse)

it finds out that :southBankHouse a a:building and searches the
er.. meta-store with the query:

select ?store
(?store :containsStatementsMatching ?template)
(?template pred z:location)
(?template objecttype a:building)
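
The same lookup can be sketched in Python, assuming the meta-store is
just a dict of store name -> set of (subjecttype, pred, objecttype)
templates. The store names, the `choose_stores` function and the `TYPES`
lookup are hypothetical; in practice the types would come from the
URIQA-a-like service. Variables and resources of unknown type are
treated as unconstrained, so the chooser errs towards querying too many
stores rather than too few.

```python
STORE_TEMPLATES = {
    "hrStore": {("a:person", "z:location", "a:building")},
    "assetStore": {("a:server", "z:location", "a:rack")},
}

# Known types of constant terms.
TYPES = {":southBankHouse": {"a:building"}}


def choose_stores(pattern, store_templates, types):
    """Return the stores holding a template compatible with the query pattern."""
    s, p, o = pattern
    # None means "unconstrained": the term is a variable or has no known type.
    s_types, o_types = types.get(s), types.get(o)
    chosen = []
    for store, templates in store_templates.items():
        for st, tp, ot in templates:
            if ((p.startswith("?") or tp == p)
                    and (s_types is None or st in s_types)
                    and (o_types is None or ot in o_types)):
                chosen.append(store)
                break
    return sorted(chosen)


stores = choose_stores(("?foo", "z:location", ":southBankHouse"),
                       STORE_TEMPLATES, TYPES)
```

Both stores hold z:location statements, but only hrStore has a template
whose objecttype is a:building, so that is the only one chosen.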

Does this sound like a workable approach? 
Any refinements? Any better ideas?

Cheers,

Phil

Received on Wednesday, 28 April 2004 14:09:05 UTC