
Re: Another attempt...

From: Andrew Newman <andrewfnewman@gmail.com>
Date: Tue, 18 Mar 2008 19:26:38 +1000
Message-ID: <2db5a5c40803180226k49a1d7efo33a9193f64047c11@mail.gmail.com>
To: "Richard Newman" <rnewman@twinql.com>
Cc: "Lee Feigenbaum" <lee@thefigtrees.net>, andy.seaborne@hp.com, "Arjohn Kampman" <arjohn.kampman@aduna-software.com>, "public-rdf-dawg-comments@w3.org" <public-rdf-dawg-comments@w3.org>

On 18/03/2008, Richard Newman <rnewman@twinql.com> wrote:
> Each of those nested {}s is a group graph pattern, not a graph
>  pattern. Each of them is unique, but in any case what you wrote parses
>  to this:
>
>  (Union
>    (Union
>      (Union
>        (GroupGraphPattern)
>        (GroupGraphPattern))
>      (GroupGraphPattern))
>    (GroupGraphPattern))
>
>  Note that those GroupGraphPatterns are empty (so whether they are sets
>  or multisets doesn't matter), and there are four of them.
>

How are they unique?

Union (multiset or not) is commutative and associative, so it makes no
difference what order I evaluate them in.  After I parse it I create
something equivalent - UNION { {} {} {} {} } (a set) - and evaluate the
members in whatever order.  It makes no difference whether I evaluate
UNION { {} } or UNION { {} times a million }: I'll still get the same
result if I consider that one {} in one place is equal to {} in
another.
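
To make the two readings concrete, here is a minimal Python sketch. It is
not taken from the spec or from any implementation: it assumes a solution
is modeled as a frozenset of (variable, value) pairs, that {} evaluates to
a multiset holding one empty solution, and the helper names are made up
for illustration. It compares multiset union with set union over the four
empty groups from the parse tree quoted above.

```python
from collections import Counter

# Assumption: a solution mapping is a frozenset of (variable, value) pairs,
# so the empty solution is the empty frozenset.
EMPTY_SOLUTION = frozenset()

def eval_empty_group():
    """Evaluate {} as a multiset containing one empty solution (assumed)."""
    return Counter([EMPTY_SOLUTION])

def multiset_union(left, right):
    """Multiset union: multiplicities add, duplicates are kept."""
    return left + right

def set_union(left, right):
    """Set union: equal members collapse into one."""
    return set(left) | set(right)

branches = [eval_empty_group() for _ in range(4)]

# Left-nested, following the quoted parse tree.
left_nested = multiset_union(
    multiset_union(multiset_union(branches[0], branches[1]), branches[2]),
    branches[3],
)
# Right-nested, to check that grouping does not change the result.
right_nested = multiset_union(
    branches[0],
    multiset_union(branches[1], multiset_union(branches[2], branches[3])),
)

# Set reading: every {} is equal to every other {}.
set_result = set()
for branch in branches:
    set_result = set_union(set_result, branch)

multiset_count = sum(left_nested.values())  # total solutions, with duplicates
set_count = len(set_result)                 # total solutions, duplicates merged
```

Under these assumptions both unions are commutative and associative, but
the multiset version ends up with four empty solutions and the set version
with one, which is exactly the point of disagreement.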

>  >> I don't think the test suite is explicit in the proper result set for
>  >> projecting a variable that is not in the query. By my reading of the
>  >> spec, projection
>  >> (http://www.w3.org/TR/rdf-sparql-query/#modProjection and
>  >> http://www.w3.org/TR/rdf-sparql-query/#defn_algProjection) should
>  >> not introduce new variables into the output solution set.
>  >
>  > It certainly is puzzling - I would think it's an error at parse time -
>  > trying to project a variable that isn't in the WHERE clause.
>
> This is an optional error when run against my implementation, because
>  in practice it's something you'll never want to do.
>
>  However, you could consider that variables not mentioned in the WHERE
>  clause are always unbound; the output preserves the cardinality of the
>  unprojected rows, but introduces unbound columns.
>

Isn't this against the definition of PROJECT?  Aren't you supposed to be
selecting variables that appear in the WHERE clause?  Here they're not
there.  What is the meaning of a million unbound columns?
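
For comparison, the "unbound column" reading described in the quoted text
can be sketched as below. This is only an illustration in Python, assuming
solutions are plain dicts and using a made-up `project` helper and made-up
variable names; it is not a statement of what the spec requires.

```python
def project(solutions, variables):
    """Restrict each solution to the projected variables.

    A projected variable with no binding is simply absent from the row,
    so it acts as an unbound column. The number of rows never changes.
    """
    return [
        {var: row[var] for var in variables if var in row}
        for row in solutions
    ]

# Hypothetical results of evaluating a WHERE clause that binds ?s and ?o.
solutions = [
    {"s": "ex:a", "o": "1"},
    {"s": "ex:b", "o": "2"},
]

# ?notInWhere never appears in the pattern, so it is never bound.
projected = project(solutions, ["s", "notInWhere"])
```

In this sketch projection never introduces a binding; it only drops them,
and the row count of the unprojected solutions is preserved.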

>  >> I'm not versed enough to label things universal or empty relations,
>  >> but the evaluation of a SPARQL UNION is defined as the multiset-union
>  >> of the evaluation of the two branches of the UNION. So:
>  >>
>  >> { A } UNION { { } }
>  >>
>  >> is multiset-union(eval(A), {{}}) -- that is, add the one empty
>  >> solution to the solutions from evaluating A.
>  >>
>  >
>  > It's just very odd behavior - and a bit inexplicable - especially the
>  > multiple union of {{}}.
>
>
> Multiset-union: duplicates are not discarded.
>

But think about what {} means.  The SPARQL document says it represents
the identity for JOIN.  So every time you ask it for something it
returns that - so it represents everything (the universal set).  Now you
union it with whatever, and what do you have?  You have itself -
everything.  You can't have more than that because by definition
that's what it is.  It doesn't matter whether it's a multiset or not,
because the meaning of {} is everything.  It doesn't make sense to
have everything plus something.
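
The "identity for JOIN" property itself can be checked with a small Python
sketch. It assumes the reading used elsewhere in this thread, where {}
evaluates to a multiset holding one empty solution, and it models
solutions as frozensets of (variable, value) pairs. The `compatible` and
`join` helpers and the sample data are illustrative, not from any
implementation.

```python
from collections import Counter

def compatible(a, b):
    """Two solutions are compatible if they agree on every shared variable."""
    bindings_a, bindings_b = dict(a), dict(b)
    return all(
        bindings_b[var] == value
        for var, value in bindings_a.items()
        if var in bindings_b
    )

def join(left, right):
    """Multiset join: merge each compatible pair, multiplying multiplicities."""
    out = Counter()
    for a, count_a in left.items():
        for b, count_b in right.items():
            if compatible(a, b):
                out[a | b] += count_a * count_b
    return out

# Assumed evaluation of {}: one empty solution.
identity = Counter([frozenset()])

# Hypothetical solutions from evaluating some pattern A.
some_solutions = Counter([
    frozenset({("x", "ex:a")}),
    frozenset({("x", "ex:b")}),
])

joined = join(some_solutions, identity)   # join with {} on the right
unioned = some_solutions + identity       # multiset union with {}
```

Under that assumption, joining with {} returns the other operand
unchanged, while the multiset union adds one extra empty solution to it.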

So that's why I'm saying the result SPARQL implementations currently
give (which I figure are just following ARQ) doesn't make sense.

You are all obviously reading from the same page - but I'd like to know
where that comes from.

>  Two things you might not expect: the output is a multiset, not a set
>  (unless you apply DISTINCT), and UNION does not apply to its input, it
>  applies to the output of its inputs when evaluated as query forms.
>
>  Does that help?
>

I get that the output is a multiset by default and a set if DISTINCT
is used.  I don't get what you mean by the "output of its inputs"
(which of course may make my entire argument above look silly).
Received on Tuesday, 18 March 2008 09:27:33 GMT
