- From: Andrew Newman <andrewfnewman@gmail.com>
- Date: Tue, 18 Mar 2008 19:26:38 +1000
- To: "Richard Newman" <rnewman@twinql.com>
- Cc: "Lee Feigenbaum" <lee@thefigtrees.net>, andy.seaborne@hp.com, "Arjohn Kampman" <arjohn.kampman@aduna-software.com>, "public-rdf-dawg-comments@w3.org" <public-rdf-dawg-comments@w3.org>
On 18/03/2008, Richard Newman <rnewman@twinql.com> wrote:
> Each of those nested {}s is a group graph pattern, not a graph
> pattern. Each of them is unique, but in any case what you wrote parses
> to this:
>
> (Union
>   (Union
>     (Union
>       (GroupGraphPattern)
>       (GroupGraphPattern))
>     (GroupGraphPattern))
>   (GroupGraphPattern))
>
> Note that those GroupGraphPatterns are empty (so whether they are sets
> or multisets doesn't matter), and there are four of them.
>
How are they unique?
Union (multiset or not) is commutative and associative, so it makes no
difference what order I evaluate them in. After I parse it I create
something equivalent - UNION { {} {} {} {} } (a set) - and just
evaluate the members in whatever order. It makes no difference whether
I evaluate it as UNION { {} } or UNION { {} times a million }: I'll
still get the same result if I consider that one {} in one place is
equal to {} in another.
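
To make the two readings concrete, here is a minimal Python sketch. It assumes a solution is modelled as a frozenset of (variable, value) pairs and a solution sequence as a collections.Counter used as a multiset. The names EMPTY_SOLUTION, eval_empty_group, multiset_union and set_union are hypothetical and not taken from any SPARQL implementation.

```python
from collections import Counter
from functools import reduce

# Hypothetical model: a solution is a frozenset of (variable, value) pairs.
EMPTY_SOLUTION = frozenset()

def eval_empty_group():
    """Evaluate {} as a multiset holding exactly one empty solution."""
    return Counter({EMPTY_SOLUTION: 1})

def multiset_union(left, right):
    """Multiset union: multiplicities add, so duplicates are kept."""
    return left + right

def set_union(left, right):
    """Set-style union: every distinct solution ends up with multiplicity one."""
    return Counter(dict.fromkeys(set(left) | set(right), 1))

# Four empty group patterns, as in the parse tree quoted above.
branches = [eval_empty_group() for _ in range(4)]
as_multiset = reduce(multiset_union, branches)
as_set = reduce(set_union, branches)

print(as_multiset[EMPTY_SOLUTION])  # 4
print(as_set[EMPTY_SOLUTION])       # 1
```

Both unions are commutative and associative in this sketch, so evaluation order does not matter in either case. The only difference is whether repeated {} branches collapse into one solution.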
> >> I don't think the test suite is explicit in the proper result set for
> >> projecting a variable that is not in the query. By my reading of the
> >> spec, projection (http://www.w3.org/TR/rdf-sparql-query/#modProjection
> >> and http://www.w3.org/TR/rdf-sparql-query/#defn_algProjection) should
> >> not introduce new variables into the output solution set.
> >
> > It certainly is puzzling - I would think it's an error at parse time -
> > trying to project a variable that isn't in the WHERE clause.
>
> This is an optional error when run against my implementation, because
> in practice it's something you'll never want to do.
>
> However, you could consider that variables not mentioned in the WHERE
> clause are always unbound; the output preserves the cardinality of the
> unprojected rows, but introduces unbound columns.
>
Isn't this against the definition of PROJECT? Aren't you selecting
variables from the WHERE clause? But they're not there. What is the
meaning of a million unbound columns?
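
The two behaviours under discussion can be sketched in Python as follows. This is only an illustration, assuming solutions are plain dicts and an unbound variable is represented as None. The function names project_lenient and project_strict are hypothetical.

```python
def project_lenient(solutions, variables):
    """Keep the named variables; a variable with no binding becomes None (unbound)."""
    return [{v: row.get(v) for v in variables} for row in solutions]

def project_strict(solutions, variables, pattern_variables):
    """Reject any projected variable that the WHERE clause never mentions."""
    unknown = [v for v in variables if v not in pattern_variables]
    if unknown:
        raise ValueError(f"variables not in WHERE clause: {unknown}")
    return project_lenient(solutions, variables)

# Two solutions from a WHERE clause that mentions only ?s and ?o.
rows = [{"s": "ex:a", "o": "ex:b"}, {"s": "ex:c", "o": "ex:d"}]

# Lenient reading: cardinality is preserved and ?x is an unbound column.
lenient = project_lenient(rows, ["s", "x"])
print(lenient)  # [{'s': 'ex:a', 'x': None}, {'s': 'ex:c', 'x': None}]
```

Calling project_strict(rows, ["s", "x"], {"s", "o"}) raises ValueError instead, which corresponds to treating the query as an error.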
> >> I'm not versed enough to label things universal or empty relations, but
> >> the evaluation of a SPARQL UNION is defined as the multiset-union of the
> >> evaluation of the two branches of the UNION. So:
> >>
> >> { A } UNION { { } }
> >>
> >> is multiset-union(eval(A), {{}}) -- that is, add the one empty solution
> >> to the solutions from evaluating A.
> >>
> >
> > It's just very odd behavior - and a bit inexplicable - especially the
> > multiple union of {{}}.
>
>
> Multiset-union: duplicates are not discarded.
>
But think about what {} means. The SPARQL document says it represents
the identity for JOIN. So every time you ask it for something it
returns that - so it represents everything (universal set). Now you
union it with whatever, what do you have? You have itself -
everything. You can't have more than that because by definition
that's what it is. It doesn't matter that it's multiset or not.
Because the meaning of {} is everything. It doesn't make sense to
have everything plus something.
So that's why I'm saying the result currently from SPARQL
implementations (which I figure are just following ARQ) doesn't make
sense.
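
For comparison, here is a minimal Python sketch of the reading given in the quoted replies, where {} evaluates to a multiset containing one empty solution. It assumes solutions are frozensets of (variable, value) pairs held in a collections.Counter. The names compatible, join, union, EMPTY_GROUP and omega are hypothetical.

```python
from collections import Counter

def compatible(a, b):
    """Two solutions are compatible if they agree on every shared variable."""
    return all(b[var] == val for var, val in a.items() if var in b)

def join(left, right):
    """Join two multisets of solutions by merging each compatible pair."""
    out = Counter()
    for a, n in left.items():
        for b, m in right.items():
            da, db = dict(a), dict(b)
            if compatible(da, db):
                out[frozenset({**da, **db}.items())] += n * m
    return out

def union(left, right):
    """Multiset union: multiplicities add."""
    return left + right

# {} modelled as one empty solution, not as "everything".
EMPTY_GROUP = Counter({frozenset(): 1})

# Example data: ?x = ex:a once and ?x = ex:b twice.
omega = Counter({frozenset({("x", "ex:a")}): 1, frozenset({("x", "ex:b")}): 2})

# Joining with {} leaves omega unchanged, so {} acts as the join identity.
assert join(omega, EMPTY_GROUP) == omega
assert join(EMPTY_GROUP, omega) == omega

# Union with {} adds one more (empty) solution.
print(sum(union(omega, EMPTY_GROUP).values()))  # 4
```

In this model the empty solution is an identity for join because it is compatible with every solution and adds no bindings, not because it stands for a universal set.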
You are all obviously reading from the same page - but I'd like to know
where that comes from.
> Two things you might not expect: the output is a multiset, not a set
> (unless you apply DISTINCT), and UNION does not apply to its input, it
> applies to the output of its inputs when evaluated as query forms.
>
> Does that help?
>
I get that the output is a multiset by default and a set if DISTINCT
is used. I don't get what you mean by the "output of its inputs"
(which of course may make my entire argument above look silly).
Received on Tuesday, 18 March 2008 09:27:33 UTC