- From: Andrew Newman <andrewfnewman@gmail.com>
- Date: Tue, 18 Mar 2008 19:26:38 +1000
- To: "Richard Newman" <rnewman@twinql.com>
- Cc: "Lee Feigenbaum" <lee@thefigtrees.net>, andy.seaborne@hp.com, "Arjohn Kampman" <arjohn.kampman@aduna-software.com>, "public-rdf-dawg-comments@w3.org" <public-rdf-dawg-comments@w3.org>
On 18/03/2008, Richard Newman <rnewman@twinql.com> wrote:
> Each of those nested {}s is a group graph pattern, not a graph
> pattern. Each of them is unique, but in any case what you wrote parses
> to this:
>
> (Union
>   (Union
>     (Union
>       (GroupGraphPattern)
>       (GroupGraphPattern))
>     (GroupGraphPattern))
>   (GroupGraphPattern))
>
> Note that those GroupGraphPatterns are empty (so whether they are sets
> or multisets doesn't matter), and there are four of them.
>

How are they unique? Union (multiset or no) is commutative and
associative. So it makes no difference what order I evaluate them in.
After I parse it I create something equivalent - UNION { {} {} {} {} }
(a set) - where I just evaluate them in whatever order - it makes no
difference if I evaluate it as UNION { {} } or UNION { {} times a
million }. I'll still get the same result if I consider that one {} in
one place is equal to {} in another.

> >> I don't think the test suite is explicit in the proper result set for
> >> projecting a variable that is not in the query. By my reading of the
> >> spec, projection (http://www.w3.org/TR/rdf-sparql-query/#modProjection
> >> and http://www.w3.org/TR/rdf-sparql-query/#defn_algProjection) should
> >> not introduce new variables into the output solution set.
> >
> > It certainly is puzzling - I would think it's an error at parse time -
> > trying to project a variable that isn't in the WHERE clause.
>
> This is an optional error when run against my implementation, because
> in practice it's something you'll never want to do.
>
> However, you could consider that variables not mentioned in the WHERE
> clause are always unbound; the output preserves the cardinality of the
> unprojected rows, but introduces unbound columns.
>

Isn't this against the definition of PROJECT? Aren't you selecting
variables in the WHERE clause - but they're not there. What is the
meaning of a million unbound columns?
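For concreteness, here is a minimal Python sketch of the reading Richard describes, where a projected variable that never appears in the WHERE clause is treated as always unbound. It is not taken from the spec or from any implementation. The names `project` and `as_rows` are hypothetical, solutions are modelled as frozensets of (variable, value) pairs, and the solution multiset is a `collections.Counter`.

```python
from collections import Counter

def project(solutions, variables):
    """Restrict each solution to the projected variables, keeping cardinality."""
    out = Counter()
    for sol, count in solutions.items():
        kept = frozenset((var, val) for var, val in sol if var in variables)
        out[kept] += count
    return out

def as_rows(solutions, variables):
    """Render solutions as result rows; None stands in for an unbound column."""
    rows = []
    for sol, count in solutions.items():
        bindings = dict(sol)
        rows.extend([tuple(bindings.get(var) for var in variables)] * count)
    return rows

# Two solutions binding ?s, then a projection on ?z, which the pattern never mentions.
matched = Counter({
    frozenset({("s", "a")}): 1,
    frozenset({("s", "b")}): 1,
})
projected = project(matched, ["z"])
print(as_rows(projected, ["z"]))
```

Under these assumptions the projection yields as many rows as there were matches, each with an unbound ?z column. No new binding is introduced. Only the column header is new.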
> >> I'm not versed enough to label things universal or empty relations, but
> >> the evaluation of a SPARQL UNION is defined as the multiset-union of the
> >> evaluation of the two branches of the UNION. So:
> >>
> >> { A } UNION { { } }
> >>
> >> is multiset-union(eval(A), {{}}) -- that is, add the one empty solution
> >> to the solutions from evaluating A.
> >>
> >
> > It's just very odd behavior - and a bit inexplicable - especially the
> > multiple union of {{}}.
> >
> Multiset-union: duplicates are not discarded.
>

But think about what {} means. The SPARQL document says it represents
the identity for JOIN. So every time you ask it for something it returns
that - so it represents everything (universal set). Now you union it
with whatever, what do you have? You have itself - everything. You
can't have more than that because by definition that's what it is. It
doesn't matter that it's multiset or not. Because the meaning of {} is
everything. It doesn't make sense to have everything plus something.

So that's why I'm saying the result currently from SPARQL
implementations (which I figure are just following ARQ) doesn't make
sense. You are all obviously reading from the same page - but I'd like
to know where that comes from.

> Two things you might not expect: the output is a multiset, not a set
> (unless you apply DISTINCT), and UNION does not apply to its input, it
> applies to the output of its inputs when evaluated as query forms.
>
> Does that help?
>

I get that the output is a multiset by default and a set if DISTINCT is
used. I don't get what you mean by the "output of its inputs" (which of
course may make my entire argument above look silly).
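The two readings in this exchange can be compared with a small Python sketch. It follows the quoted description (UNION as multiset-union of the evaluated branches, {} evaluating to a multiset holding one empty solution) and is an illustration under those assumptions, not code from ARQ or any other implementation. All function names are hypothetical, solutions are frozensets of (variable, value) pairs, and multisets are `collections.Counter` objects.

```python
from collections import Counter

EMPTY = frozenset()  # the empty solution: no variables bound

def eval_empty_group():
    """Evaluate {}: assumed to yield exactly one empty solution."""
    return Counter({EMPTY: 1})

def union(left, right):
    """Multiset union: cardinalities add, so duplicates are kept."""
    return left + right

def compatible(a, b):
    """Two solutions are compatible if they agree on every shared variable."""
    da, db = dict(a), dict(b)
    return all(db[var] == da[var] for var in da if var in db)

def join(left, right):
    """Merge every compatible pair of solutions, multiplying cardinalities."""
    out = Counter()
    for a, count_a in left.items():
        for b, count_b in right.items():
            if compatible(a, b):
                out[a | b] += count_a * count_b
    return out

def distinct(solutions):
    """DISTINCT: collapse every cardinality to 1."""
    return Counter({sol: 1 for sol in solutions})

# { {} UNION {} UNION {} UNION {} }, evaluated left to right as in the parse tree.
four = union(union(union(eval_empty_group(), eval_empty_group()),
                   eval_empty_group()), eval_empty_group())

# A stand-in for eval(A): the same solution matched twice.
A = Counter({frozenset({("x", "a")}): 2})

print(four[EMPTY], distinct(four)[EMPTY])
print(join(A, eval_empty_group()) == A)
print(sum(union(A, eval_empty_group()).values()))
```

In this model {} is the identity for join because it contributes a single solution that is compatible with everything. It is not a set that already contains every solution, so a union with it adds one more row, and only DISTINCT collapses the repeats.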
Received on Tuesday, 18 March 2008 09:27:33 UTC