Re: federation use case from Alberto Reggiori on 2004-07-28 (public-rdf-dawg@w3.org from July to September 2004)

From: Alberto Reggiori <alberto@asemantics.com>
Date: Wed, 28 Jul 2004 16:15:12 +0200
To: andy.seaborne@hp.com
Cc: Eric Prud'hommeaux <eric@w3.org>, RDF Data Access Working Group <public-rdf-dawg@w3.org>
Message-Id: <8A543BCE-E0A0-11D8-AB02-0003939CA324@asemantics.com>

On Jul 28, 2004, at 3:13 PM, Seaborne, Andy wrote:

>
> Could this be addressed with a general parameterised query mechanism
> (see also [1])?  If CDDB and IMDB were on the same site it would be
> nice to flow the parameters from the first query to the second without
> a round trip back to the client.

It is interesting to see that the federation use case has come up
again recently and is still of interest to people here. It reminds me
of the discussion started some time ago:

	http://lists.w3.org/Archives/Public/public-rdf-dawg/2004JanMar/0062.html

That discussion was probably not down-to-earth enough and was less
clear, while EricP's example gives a more concrete scenario for a
possible CDDB and IMDB data federation.

In some BRQL dialect I would rewrite EricP's Algae federated query as:

CONSTRUCT

	(?t foo bar)

WHERE

       SOURCE ?cddb (?m tt:rank ?rc)
       SOURCE ?cddb (?m cddb:soundtrack ?s)
       SOURCE ?cddb (?s dc:title ?t)
       (?cddb dc:source <http://somedb.cddb.org> ) // CDDB source

       SOURCE ?imdb (?r tt:rank ?ri)
       SOURCE ?imdb (?r dc:title ?t)
       (?imdb dc:source <ftp://ftp.fu-berlin.de/pub/misc/movies/database/> ) // IMDB source

AND ?rc <= 10 && ?ri <= 10

and would apply the constraints after the federated join has happened.
dc:source might be used as a special property that binds each BRQL
SOURCE to its data source.
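
To make the "join first, constrain afterwards" idea concrete, here is a rough Python sketch. It is only an illustration under stated assumptions: the two triple lists, the helper names and the sample titles are all made up, and it says nothing about how a real BRQL engine or the actual CDDB and IMDB services would behave.

```python
# Hypothetical sample data: each source is a list of
# (subject, predicate, object) triples. None of this is real CDDB/IMDB data.
CDDB = [
    ("m1", "tt:rank", 3),
    ("m1", "cddb:soundtrack", "s1"),
    ("s1", "dc:title", "Movie A"),
    ("m2", "tt:rank", 42),
    ("m2", "cddb:soundtrack", "s2"),
    ("s2", "dc:title", "Movie B"),
]
IMDB = [
    ("r1", "tt:rank", 7),
    ("r1", "dc:title", "Movie A"),
    ("r2", "tt:rank", 2),
    ("r2", "dc:title", "Movie B"),
]


def objects(triples, subject, predicate):
    """Return every object o such that (subject, predicate, o) is in triples."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def cddb_rows(triples):
    """Match (?m tt:rank ?rc) (?m cddb:soundtrack ?s) (?s dc:title ?t)."""
    for m, p, rc in triples:
        if p != "tt:rank":
            continue
        for s in objects(triples, m, "cddb:soundtrack"):
            for t in objects(triples, s, "dc:title"):
                yield t, rc


def imdb_rows(triples):
    """Match (?r tt:rank ?ri) (?r dc:title ?t)."""
    for r, p, ri in triples:
        if p != "tt:rank":
            continue
        for t in objects(triples, r, "dc:title"):
            yield t, ri


def federated_top_ten(cddb, imdb, limit=10):
    """Join both sources on ?t, then apply ?rc <= limit && ?ri <= limit."""
    joined = [
        (t, rc, ri)
        for t, rc in cddb_rows(cddb)
        for t2, ri in imdb_rows(imdb)
        if t == t2
    ]
    # The constraints are applied only after the federated join.
    return [t for t, rc, ri in joined if rc <= limit and ri <= limit]


print(federated_top_ten(CDDB, IMDB))
```

With this sample data both titles survive the join, and only the constraint step removes the one whose soundtrack rank is above 10.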

Any thoughts?

Yours

Alberto

>
> 	Andy
>
> [1] http://lists.w3.org/Archives/Public/public-rdf-dawg/2004JulSep/0090.html
>
> Eric Prud'hommeaux wrote:
>
>> Joe wants to see which top 10 movies also had top-ten soundtracks.
>> IMDB has information about movies and CDDB has info about music. Joe
>> writes a query that gets the titles of all the top 10 movies. These
>> are bound to the variable ?t. He uses those bindings for ?t to then
>> query IMDB to filter out the ones that did not have top 10
>> soundtracks.
>> from CDDB:
>> CONSTRUCT (?t foo bar)
>> WHERE (?m tt:rank ?rc)
>>       (?m cddb:soundtrack ?s)
>>       (?s dc:title ?t)
>> AND ?rc <= 10
>> from IMDB:
>> SELECT ?t
>> WHERE (?r tt:rank ?ri)
>>       (?r dc:title ?t)
>> AND ?ri <= 10
>> This needs the ability to use variables bound in an earlier query to
>> constrain later queries. It also requires some sort of query
>> targeting. In algae, this looks like:
>> ns tt=<...>
>> ns cddb=<...>
>> ns dc=<...>
>> attach <http://www.w3.org/...#remoteQuery> ?cddb (
>> 			server=<http://cddb.com/rq>)
>> ask ?cddb ( ?m tt:rank ?rc {?rc <= 10} .
>> 	    ?m cddb:soundtrack ?s .
>> 	    ?s dc:title ?t )
>> attach <http://www.w3.org/...#remoteQuery> ?imdb (
>> 			server=<http://imdb.com/querySrvc>)
>> ask ?imdb ( ?r tt:rank ?ri {?ri <= 10} .
>> 	    ?r dc:title ?t )
>> collect (?t)
>> This is very different from (more rigorous and expensive than) our
>> current definition of aggregate query [1]. Such a query would, if the
>> data is divided as the above queries suggest, return zero results.
>> [1] http://www.w3.org/2001/sw/DataAccess/UseCases#d4.5
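
The sequential evaluation described in the quoted message, where bindings for ?t from one query constrain the next, might look roughly like this in Python. This is a sketch only: `query_cddb`, `query_imdb` and their sample rows are hypothetical stand-ins for remote query services, not real CDDB or IMDB interfaces.

```python
# Hypothetical (title, rank) rows standing in for each remote service.
CDDB_ROWS = [("Movie A", 3), ("Movie B", 42), ("Movie C", 5)]
IMDB_ROWS = [("Movie A", 7), ("Movie B", 2), ("Movie C", 11)]


def query_cddb(max_rank):
    """First query: titles whose soundtrack rank is within max_rank."""
    return {title for title, rank in CDDB_ROWS if rank <= max_rank}


def query_imdb(max_rank, titles):
    """Second query: constrained by the ?t bindings from the first query."""
    return {
        title
        for title, rank in IMDB_ROWS
        if rank <= max_rank and title in titles
    }


def top_ten_in_both():
    """Flow the ?t bindings from the first query into the second."""
    titles = query_cddb(10)
    return sorted(query_imdb(10, titles))


print(top_ten_in_both())
```

If both services sat behind the same site, the hand-off of `titles` could happen there, which would avoid the round trip back to the client.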

Received on Wednesday, 28 July 2004 10:15:02 UTC