RE: DISTINCT (was: Re: Queries over multiple graphs)

-------- Original Message --------
> From: Steve Harris <>
> Date: 28 September 2004 20:56
> 
> On Tue, Sep 28, 2004 at 06:19:29 +0100, Andy Seaborne wrote:
> > I prefer to have explicit DISTINCT.  I don't see having SELECT
> > returning duplicate rows contradicting RDF's set of statements if the
> > app writer only wants some of the variables.
> > 
> > If there is no DISTINCT, then there is one result for every
> > way the query can be matched.  Because SELECT can remove variables,
> > it is possible the application can't tell two solutions (table rows,
> > results) apart - but it can if there is "SELECT *" or SELECT with all
> > the variables. "SELECT DISTINCT" means no two results are the same even
> > when there are fewer variables.  Hence "SELECT DISTINCT *" is a no-op.
> 
> Doesn't that assume that every statement in the system is unique at the
> triple level? That is not necessarily the case.

In the sense that an RDF graph is a set of statements, every statement
is unique. When querying "SELECT *" there will be one unique solution
for each way the query can match.  Hence each row is different in some
way.
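
To make that concrete, here is a rough Python sketch (not any real
engine's code). The toy graph, match() and select() are made up for
illustration, and the only assumption is that a graph is a set of triples.

```python
# Hypothetical toy data: an RDF graph modelled as a set of triples.
graph = {
    ("a", "knows", "b"),
    ("a", "knows", "c"),
    ("b", "knows", "c"),
}


def match_triple(pattern, triple):
    """Return the variable bindings if the triple matches the pattern, else None."""
    binding = {}
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable used twice must bind to the same value both times.
            if binding.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return binding


def match(graph, pattern):
    """One solution for every way the single triple pattern can match."""
    solutions = []
    for triple in sorted(graph):
        binding = match_triple(pattern, triple)
        if binding is not None:
            solutions.append(binding)
    return solutions


def select(solutions, variables, distinct=False):
    """Project the solutions onto the named variables, optionally removing duplicates."""
    rows = [tuple(s.get(v) for v in variables) for s in solutions]
    if distinct:
        rows = list(dict.fromkeys(rows))  # drops duplicates, keeps first-seen order
    return rows


solutions = match(graph, ("?x", "knows", "?y"))
all_rows = select(solutions, ["?x", "?y"])                   # like SELECT *
all_rows_distinct = select(solutions, ["?x", "?y"], True)    # like SELECT DISTINCT *
projected = select(solutions, ["?x"])                        # like SELECT ?x
projected_distinct = select(solutions, ["?x"], True)         # like SELECT DISTINCT ?x
```

With all the variables kept, every row differs, so DISTINCT changes
nothing.  Projecting to ?x alone gives two indistinguishable rows for "a",
and only DISTINCT removes one of them.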

If I understand 3Store correctly, it is as much a collection of graphs
to query - and does not present a concept of the RDF model of the whole
collection.  It's more like having an implicit "SOURCE *" around each
query pattern.

> 
> > Following on, for optionals, this approach suggests a style of one
> > query result row for each way a query can match.
> > 
> > E.g.: Separate optional blocks, separate variables:
> > 
> > OPTIONAL ... ?x ....    // does not use ?y
> > OPTIONAL ... ?y ....    // does not use ?x
> > 
> > gives
> > 
> > ?x = ...    // and no ?y
> > ?y = ...    // and no ?x
> 
> What about ?x=NULL, ?y=NULL and ?x=... , ?y=..., would those also be
> valid solutions? I think I'm not following this part. Possibly a more
> concrete example would help.

Yes - an example would help - and I think I got the example wrong.
OPTIONAL is "greedy" in that if it can bind it does.  No unbound result
is generated if an OPTIONAL can match (A [B] is A+B if B matches else A).
I'll try to do an example in a separate mail thread.
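
In the meantime, a minimal Python sketch of the "A [B] is A+B if B matches
else A" rule.  The data, optional() and match_mbox() are hypothetical names
for illustration only, not a proposal for how an implementation should work.

```python
def optional(solutions, extend):
    """Greedy OPTIONAL: keep the extended solutions when the optional part
    matches; otherwise keep the original solution unchanged."""
    result = []
    for solution in solutions:
        extended = extend(solution)
        result.extend(extended if extended else [solution])
    return result


# Hypothetical toy data: every person has a name, only p1 has a mailbox.
names = {"p1": "Alice", "p2": "Bob"}
mbox = {"p1": ["alice@example.org"]}

# The required part A: one solution per person.
required = [{"?p": p, "?name": n} for p, n in sorted(names.items())]


def match_mbox(solution):
    """The optional part B: extend a solution with each mailbox of ?p, if any."""
    return [dict(solution, **{"?mbox": m}) for m in mbox.get(solution["?p"], [])]


rows = optional(required, match_mbox)
```

Here p1 appears only with ?mbox bound; no extra row with ?mbox unbound is
produced for it.  p2 appears once, with ?mbox absent.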

-----
Just to be clear, to me, NULLs are just a way the results can be
presented. They indicate "not bound" in the same way as being absent
would.  I'd expect an SQL-ish interface style to have nulls.
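
As a small sketch of that presentation point (to_table() is a made-up
helper, and Python's None stands in for an SQL-style NULL):

```python
def to_table(solutions, variables):
    """Present solutions as fixed-width rows; an absent variable shows up as None."""
    return [tuple(s.get(v) for v in variables) for s in solutions]


# Hypothetical solutions: the second one simply has no binding for ?y.
solutions = [{"?x": "a", "?y": "b"}, {"?x": "c"}]
table = to_table(solutions, ["?x", "?y"])
```

The second solution carries no ?y at all; the None only appears once the
results are laid out as a table.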

Sorry for the confusion,

> 
> - Steve

Received on Wednesday, 29 September 2004 12:06:10 UTC