RE: SOURCE - simple test cases from Seaborne, Andy on 2004-09-27 (public-rdf-dawg@w3.org from July to September 2004)

From: Seaborne, Andy <andy.seaborne@hp.com>
Date: Mon, 27 Sep 2004 18:06:49 +0100
To: "Steve Harris" <S.W.Harris@ecs.soton.ac.uk>, <public-rdf-dawg@w3.org>
Message-ID: <8D5B24B83C6A2E4B9E7EE5FA82627DC9396712@sdcexcea01.emea.cpqcorp.net>

-------- Original Message --------
> From: Steve Harris <>
> Date: 27 September 2004 17:45
> 
> On Mon, Sep 27, 2004 at 04:57:30PM +0100, Andy Seaborne wrote:
> > -----------------------------
> > 
> > ---- Graph <u1>
> > :a :b :c
> > 
> > ---- Graph <u2>
> > :a :b :c
> > 
> > -----------------------------
> > 
> > ---- Query 1
> > SELECT *
> > FROM <u1>, <u2>
> > WHERE (?x ?y ?z)
> > 
> > 
> > I expect one result:
> > 
> > ?x = :a , ?y = :b , ?z = :c
> 
> That strikes me as odd; if you had
> 
> :a :b :c
> :a :d :c
> 
> and queried for
> 
> SELECT ?s ?o WHERE (?s ?p ?o)
> 
> you would get two results (with no DISTINCT, I don't think we've
> discussed that), adding ?p to the SELECT would still get you two, so
> I would expect the same behaviour from
> 
> SELECT ?s ?p ?o WHERE SOURCE ?src (?s ?p ?o)
> vs
> SELECT ?src ?s ?p ?o WHERE SOURCE ?src (?s ?p ?o)
> 
> i.e. I wouldn't expect changing the variables SELECTed to change the
> number of results.

I agree that changing just the SELECTed variables should not change the
number of results.  

The first example didn't have a "SOURCE" in it - it's a pure query of
the RDF statements.  It came out as one result, not because of a hidden
DISTINCT, but because the query sees the two sources as an RDF merge
and it's RDF core that says there is one statement there.  When changing
the data to two different statements, I'd expect the results to change.

If the original data were somehow:

:a :b :c
:a :b :c

I'd expect one result.  
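
As a rough illustration of why (a Python sketch, not any DAWG
implementation): if a graph is modelled as a set of triples, a repeated
statement is simply the same set member, and merging <u1> and <u2>
leaves one triple to match.  The names rdf_merge and match_all are made
up for this sketch, and it only covers ground triples (blank nodes
would need renaming apart in a real RDF merge).

```python
# Hypothetical model: an RDF graph is a Python set of (s, p, o) triples.
u1 = {(":a", ":b", ":c")}
u2 = {(":a", ":b", ":c")}

# Writing the same statement twice still gives a one-triple graph.
written_twice = {(":a", ":b", ":c"), (":a", ":b", ":c")}


def rdf_merge(*graphs):
    """Merge ground graphs by set union; identical triples collapse."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged


def match_all(graph):
    """Evaluate the pattern (?x ?y ?z): one solution per triple."""
    return [{"?x": s, "?y": p, "?z": o} for (s, p, o) in sorted(graph)]


# Query 1: FROM <u1>, <u2> with no SOURCE, so the query sees the merge.
query1 = match_all(rdf_merge(u1, u2))
print(len(query1), query1)
print(len(match_all(written_twice)))
```

No DISTINCT step appears anywhere in the sketch; the single result
falls out of the graph being a set.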

The first query, with no reference to SOURCE, accesses the combined
statements as if they were a single RDF graph.  This is the difference
between quads and triples so I claim to have done my action item :-)

The first example is there to show that introducing a new part of the
query (here SOURCE ?src), which wasn't in the first one, changes the
number of results.   The fact that it increases might be seen as a bit
odd, but the fact that changing the WHERE clause changes the number of
results seems reasonable to me.

Alternative viewpoint: this is union query vs aggregate query, i.e.
unioning/concatenating the results vs merging the graphs and querying
the merge.
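
The two viewpoints can be put side by side in a small Python sketch.
It assumes each source is a set of ground triples keyed by its URI;
union_query and aggregate_query are names invented here, not terms
from any draft.

```python
# Hypothetical data: the two sources from the test case, keyed by URI.
sources = {
    "<u1>": {(":a", ":b", ":c")},
    "<u2>": {(":a", ":b", ":c")},
}


def match_all(graph):
    """Evaluate (?x ?y ?z) against one graph: one row per triple."""
    return [(s, p, o) for (s, p, o) in sorted(graph)]


def union_query(sources):
    """Query each source on its own, then concatenate the row lists."""
    rows = []
    for name in sorted(sources):
        rows.extend(match_all(sources[name]))
    return rows


def aggregate_query(sources):
    """Merge all sources into one graph first, then query the merge."""
    merged = set().union(*sources.values())
    return match_all(merged)


print(len(union_query(sources)))      # rows from concatenated results
print(len(aggregate_query(sources)))  # rows from querying the merge
```

On this data the concatenation keeps one row per source, while the
merge has only one triple left to match.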

In my example query 2 (sorry - I meant 2a and 2b but missed it in an
edit) you do get the same number of answers when the number of
variables SELECTed changes.  But the data access part is exactly the
same in each case: both have a SOURCE.

=================================================
---- Query 2a

SELECT *
FROM <u1>, <u2>
WHERE SOURCE ?src (?x ?y ?z)

Here, I'd expect that there are two results:

?src=<u1> , ?x = :a , ?y = :b , ?z = :c
?src=<u2> , ?x = :a , ?y = :b , ?z = :c

. . . 

---- Query 2b

SELECT ?x ?y ?z
FROM <u1>, <u2>
WHERE SOURCE ?src (?x ?y ?z)

. . . 

?x = :a , ?y = :b , ?z = :c
?x = :a , ?y = :b , ?z = :c

=================================================
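
A sketch of the two SOURCE queries in Python, assuming the data is held
as (source, s, p, o) quads and that SELECT is a plain projection with
no DISTINCT.  The helper names source_match and select are
illustrative only.

```python
# Hypothetical quad store: each statement remembers which source it came from.
quads = [
    ("<u1>", ":a", ":b", ":c"),
    ("<u2>", ":a", ":b", ":c"),
]


def source_match(quads):
    """Evaluate SOURCE ?src (?x ?y ?z): one solution per quad."""
    return [{"?src": g, "?x": s, "?y": p, "?z": o} for (g, s, p, o) in quads]


def select(solutions, variables):
    """Project each solution onto the chosen variables, keeping every row."""
    return [{v: row[v] for v in variables} for row in solutions]


solutions = source_match(quads)
with_src = select(solutions, ["?src", "?x", "?y", "?z"])  # SELECT *
without_src = select(solutions, ["?x", "?y", "?z"])       # SELECT ?x ?y ?z

for row in with_src:
    print(row)
for row in without_src:
    print(row)
```

Both projections run over the same two solutions, so dropping ?src
leaves two rows that merely look identical.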

> 
> - Steve

	Andy

Received on Monday, 27 September 2004 17:07:21 UTC