RE: Ambiguity and 4.5 Aggregate Query (and screw case)

-------- Original Message --------
> From: Kendall Clark <>
> Date: 29 June 2004 14:16
> 
> On Tue, Jun 29, 2004 at 07:03:24AM -0400, Eric Prud'hommeaux wrote:
> > [[
> > 4.5 Aggregate Query
> > 
> > It should be possible to specify two or more RDF graphs against which
> > a query shall be executed; that is, the result of an aggregate query
> > is the merge of the results of executing the query on each of two or
> > more graphs. ]]
> > 
> > This states that we perform the query on each of the (two or more)
> > graphs. My intuition is that we want to perform the query on the
> > aggregation of the graphs. For instance,
> 
> > Maybe this was already decided in favor of the former; I don't recall.
> 
> We discussed this originally, as I recall. Aggregate, then query is
> distinct from query separately, aggregate results. I called what
> you're proposing "union query". Again, as I recall the discussion,
> there was more support for aggregate query than union query.
> 
> Alas, I'm too lazy just now to hunt in the archives to confirm my
> memory.
> 
> Kendall Clark

There seems to me to be no need for explicit support for union query. If
the union is valuable, then make the union an identifiable web resource and
query that. In other words, the query names the union as its target and
there is no need to have anything in the QL or protocol.

What this approach to union query does not permit is an arbitrary, temporary
union. As fetching a graph over the web is not trivial, a server that lets a
client request trigger many large GETs (if implemented as merge locally, then
query) seems OK for small experiments only. An implementation could instead
ask each triple pattern in turn (with values from previously matched triples
substituted in - it's a search tree here, not a linear pass). That avoids
GETting the whole models but causes a very large number of requests from the
server handling the query to the model owners' servers.

This can't be done with aggregate result query - the system would need a way
to name the separate graphs, unless the client does the work itself by issuing
a request to each target and merging the results. That is about the same
amount of data traffic if duplicates are not removed from the results -
only extra copies of the query go out. It may even be faster for the client
to do it, as the requests can be sent in parallel (network speed affects
this).
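
A rough Python sketch of the client doing it itself follows, with the union
query run on the same data for comparison. All of it is assumed for
illustration: the graphs are in-memory sets standing in for remote targets,
evaluate() stands in for whatever each target's query service does, and the
thread pool merely imitates sending the requests in parallel.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Optional, Sequence, Set, Tuple

Triple = Tuple[str, str, str]
Pattern = Tuple[str, str, str]  # terms starting with "?" are variables
Binding = Dict[str, str]


def evaluate(triples: Set[Triple], patterns: Sequence[Pattern],
             binding: Optional[Binding] = None) -> List[Binding]:
    """Evaluate a conjunction of triple patterns against ONE graph."""
    binding = binding or {}
    if not patterns:
        return [dict(binding)]
    head = tuple(binding.get(term, term) for term in patterns[0])
    solutions: List[Binding] = []
    for triple in triples:
        extended = dict(binding)
        for term, value in zip(head, triple):
            if term.startswith("?"):
                if extended.setdefault(term, value) != value:
                    break  # same variable bound to two different values
            elif term != value:
                break  # constant term does not match
        else:
            solutions.extend(evaluate(triples, patterns[1:], extended))
    return solutions


def aggregate_result_query(graphs: Sequence[Set[Triple]],
                           patterns: Sequence[Pattern]) -> List[Binding]:
    """Send the same query to every target in parallel, then merge the results."""
    with ThreadPoolExecutor() as pool:
        per_graph = pool.map(lambda graph: evaluate(graph, patterns), graphs)
        # No duplicate removal here; the merge is plain concatenation.
        return [solution for result in per_graph for solution in result]


graph_a = {("alice", "knows", "bob"),
           ("alice", "knows", "carol"),
           ("bob", "mbox", "mailto:bob@example.org")}
graph_b = {("carol", "mbox", "mailto:carol@example.org")}
query = [("?x", "knows", "?y"), ("?y", "mbox", "?m")]

merged = aggregate_result_query([graph_a, graph_b], query)
union_results = evaluate(graph_a | graph_b, query)  # aggregate first, then query

print(len(merged), "answer(s) from aggregate result query")
print(len(union_results), "answer(s) from union query")
```

On this data the aggregate result query finds only bob, because carol's mbox
sits in the other graph and the join never crosses graphs. The union query
finds both, which is the distinction Kendall drew above.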

	Andy

Received on Wednesday, 30 June 2004 08:31:03 UTC