Re: Another attempt... from Lee Feigenbaum on 2008-03-18 (public-rdf-dawg-comments@w3.org from March 2008)

From: Lee Feigenbaum <lee@thefigtrees.net>
Date: Mon, 17 Mar 2008 21:28:21 -0400
To: Andrew Newman <andrewfnewman@gmail.com>
CC: andy.seaborne@hp.com, Arjohn Kampman <arjohn.kampman@aduna-software.com>, "public-rdf-dawg-comments@w3.org" <public-rdf-dawg-comments@w3.org>
Message-ID: <47DF1AB5.5030106@thefigtrees.net>
Hi Andrew,

I think that you still might be confusing curly-brace SPARQL syntax with 
curly-brace set notation. They're two very different things that 
(unfortunately?) use the same character. I've tried to clarify inline below.

Andrew Newman wrote:
> Here it is, one more time from the top...I hope someone follows my
> reasoning enough to at least point out the mistake I'm making.
> 
> The definition in 5.2.1 "one solution that does not bind any
> variables" seems to be contradictory of the definition of the
> universal set/universal relation and the definition in 12.3 and it's
> not what happens when you use it in ARQ.
> 
> So is the empty group pattern a good name, is it really empty - it has
> a cardinality of 1 - isn't the *really* empty set (or multiset) have a
> cardinality of 0 and is trivially false/empty set?

Group patterns are components of the SPARQL abstract syntax. In the 
concrete SPARQL syntax, they are demarcated by curly braces. The empty 
group pattern is written in SPARQL concrete syntax as {}. The spec. 
doesn't have a definition of a group graph pattern as such, other than 
the syntactic definition at:

http://www.w3.org/TR/rdf-sparql-query/#rGroupGraphPattern

But if I were defining it, I'd say that a group pattern is a pair 
consisting of a set of zero or more graph patterns and a set of zero or 
more filter expressions. The empty group graph pattern then, is ({}, {}) 
- an empty set of graph patterns and an empty set of filters. For 
simplicity, I'll omit the filters from the rest of this discussion, so 
the empty graph pattern is indeed empty: {}.

Note that there's nothing here about solution sets, solutions, bindings, 
or relations. It's just part of the SPARQL syntax at this point. A group 
pattern can be evaluated in the context of an RDF dataset to give zero 
of more solutions (a solution set). Each solution is itself a set of 
pairs (bindings), in which the first element of each pair is a variable 
and the second element of each pair is an RDF term.

> Furthermore, does it really not bind to any variables?  When you use
> it in a query:
> SELECT ?x
> WHERE {{}}
> 
> You get back: {?x -> {}}

Right, the evaluation of the empty group graph pattern ({} in SPARQL 
syntax) is a non-empty solution set. Specifically, as per 5.2.1, the 
evaluation of the empty group graph pattern against any RDF dataset is 
the solution set {...} containing a single solution {{...}} containing 
no bindings: {{}}

The key here is this:

   eval({}) -> {{}}

The curlies on the LHS of that line are SPARQL syntax for an empty group 
graph pattern and the curlies on the right are standard mathematical set 
notation (a set containing only the empty set).

I don't think the test suite is explicit in the proper result set for 
projecting a variable that is not in the query. By my reading of the 
spec, projection (http://www.w3.org/TR/rdf-sparql-query/#modProjection 
and http://www.w3.org/TR/rdf-sparql-query/#defn_algProjection) should 
not introduce new variables into the output solution set.

Why then, do you see a column for 'x' when you run a query like

SELECT ?x
FROM <http://thefigtrees.net/lee/sw/data/dg.n3>
WHERE {{}}

at http://sparql.org/sparql.html?

Andy would have to speak authoritatively to it, but my guess is that 
it's just because this is a visualization (via XSLT) of the SPARQL XML 
Result Format. The XML result format specification says 
(http://www.w3.org/TR/rdf-sparql-XMLres/#head)

"""
For a variable binding query result, head must contain a sequence of 
elements describing the set of Query Variable names in the Solution 
Sequence (here called query results).

The order of the variable names in the sequence is the order of the 
variable names given to the argument of the SELECT statement in the 
SPARQL query. If SELECT * is used, the order of the names is undefined.
"""

In this case, this seems a bit contradictory: ?x is _not_ in the 
solution sequence (first paragraph), but it is in the SELECT statement 
(second paragraph).

This may be an ambiguity in the SPARQL XML Result Format.

> You add ?a, ?b, ?c and so on and you get back those binding to {} as
> well - so that would suggest to me that it binds to every variable -
> rather than not binding to any variable.  If it is binding to
> everything then it is consistent (with my idea) of the universal set.
 >
 > A related question is why bother?  Why not return trivially true back
 > instead of binding it to any variables at all - isn't that equivalent?

Again, adding the projected variables doesn't change the solution set 
which remains (set notation): {{}}

> If it is the universal set then UNIONing it with anything - it should
> also be trivially true - if it contains all possible bindings then
> UNIONing it with anything returns itself.  Much like TRUE OR X in
> boolean logic - it doesn't matter what X is - because with OR you
> don't even have to look at X you just return TRUE.

I'm not versed enough to label things universal or empty relations, but 
the evaluation of a SPARQL UNION is defined as the multiset-union of the 
evaluation of the two branches of the UNION. So:

{ A } UNION { { } }

is multiset-union(eval(A), {{}}) -- that is, add the one empty solution 
to the solutions from evaluating A.

> If it's not, how can it be the identity of JOIN?  What is happening
> when JOINing with the empty group pattern?  Isn't it finding a μ1 from
> Ω0 that is compatible with μ2 in Ω2?  Again, the identity of JOIN is
> something that is binding to everything - rather than something that
> doesn't.

Right. If the empty group graph pattern is the right-hand side of the 
join, then for every solution in Ω1 you're joining that solution with 
the only solution there is in Ω2: {}. (Remember, Ω2 is the evaluation of 
the empty group graph pattern, which is {{}}. The only solution in the 
solution set {{}} is the empty solution, {}. This empty solution, {}, is 
compatible with all other solutions and is the identity for merging two 
solutions. merge(S, {}) = S.

> Like I said, this is only to make it consistent with boolean, set, bag
> or relational algebra.  I'm not sure anyone knows exactly what an
> algebra is so this behavior might be arbitrary.

It may be inconsistent. I can't say for sure one way or the other.

> The added confusion is that I don't understand the current SPARQL
> result of UNIONing {} in SPARQL as you end up with something that is
> neither a usual result nor an identity but a combination of the two
> (which is where the conversation started).  It seems to be
> correct/valid to keep collecting these empty sets (unless you
> eliminate them with a distinct), what does that mean?

I don't understand the question - maybe a test case would clarify?

Lee

> I haven't seen a mixture of this kind so it'd be useful if there were
> a few pointers or explanation around it because I think it is
> different to most other algebras.
Received on Tuesday, 18 March 2008 01:29:10 UTC