
Re: Another attempt...

From: Richard Newman <rnewman@twinql.com>
Date: Mon, 17 Mar 2008 22:51:43 -0700
Cc: "Lee Feigenbaum" <lee@thefigtrees.net>, andy.seaborne@hp.com, "Arjohn Kampman" <arjohn.kampman@aduna-software.com>, "public-rdf-dawg-comments@w3.org" <public-rdf-dawg-comments@w3.org>
Message-Id: <FE77F43B-34DD-47FB-ADDF-A45126B76D67@twinql.com>
To: "Andrew Newman" <andrewfnewman@gmail.com>

>> But if I were defining it, I'd say that a group pattern is a pair
>> consisting of a set of zero or more graph patterns and a set of zero or
>> more filter expressions. The empty group graph pattern then, is ({}, {})
>> - an empty set of graph patterns and an empty set of filters. For
>> simplicity, I'll omit the filters from the rest of this discussion, so
>> the empty graph pattern is indeed empty: {}.
>>
>
> I'm not sure how it's a set - because it seems as if in SPARQL you
> are collecting them all.  For example, I can write in SPARQL syntax:
>
> SELECT ?x
> WHERE { {} UNION {} UNION {} UNION {} }
>
> If it were a set when SPARQL was being evaluated there would be 1
> result but it returns 3.

Each of those nested {}s is a group graph pattern, not a graph  
pattern. Each of them is unique, but in any case what you wrote parses  
to this:

(Union
   (Union
     (Union
       (GroupGraphPattern)
       (GroupGraphPattern))
     (GroupGraphPattern))
   (GroupGraphPattern))

Note that those GroupGraphPatterns are empty (so whether they are sets  
or multisets doesn't matter), and there are four of them.
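
As a rough illustration of the parse above, here is a minimal Python
sketch. It is not twinql's parser; the class and function names
(GroupGraphPattern, Union, parse_union_chain, evaluate) are hypothetical,
and a plain list stands in for a multiset of solutions. It assumes UNION
nests to the left, as in the tree above, and that an empty group graph
pattern evaluates to one solution with no bindings.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class GroupGraphPattern:
    """An empty group graph pattern, written "{ }" in SPARQL syntax."""


@dataclass(frozen=True)
class Union:
    left: object
    right: object


def parse_union_chain(n):
    """Build the left-nested tree for n empty groups joined by UNION."""
    groups = [GroupGraphPattern() for _ in range(n)]
    return reduce(Union, groups)


def evaluate(node):
    """Return a list (standing in for a multiset) of solutions."""
    if isinstance(node, GroupGraphPattern):
        return [{}]  # one solution with no bindings
    # Multiset-union: concatenate, never discard duplicates.
    return evaluate(node.left) + evaluate(node.right)


tree = parse_union_chain(4)   # { {} UNION {} UNION {} UNION {} }
solutions = evaluate(tree)
print(len(solutions))
```

Under those assumptions the four leaves each contribute one empty
solution, so the list has four entries.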

>> I don't think the test suite is explicit in the proper result set for
>> projecting a variable that is not in the query. By my reading of the
>> spec, projection (http://www.w3.org/TR/rdf-sparql-query/#modProjection
>> and http://www.w3.org/TR/rdf-sparql-query/#defn_algProjection) should
>> not introduce new variables into the output solution set.
>
> It certainly is puzzling - I would think it's an error at parse time -
> trying to project a variable that isn't in the WHERE clause.

This is an optional error when run against my implementation, because  
in practice it's something you'll never want to do.

However, you could consider that variables not mentioned in the WHERE  
clause are always unbound; the output preserves the cardinality of the  
unprojected rows, but introduces unbound columns.
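
A minimal Python sketch of that second reading follows. It only
illustrates the "always unbound" interpretation described here, not what
the spec mandates or what any particular engine does; the function name
project and the use of None for an unbound variable are assumptions made
for the example.

```python
def project(solutions, variables):
    """Keep one output row per input row; None marks an unbound variable."""
    return [{v: row.get(v) for v in variables} for row in solutions]


# Three solutions from some WHERE clause that only mentions ?x.
rows = [{"x": "a"}, {"x": "b"}, {"x": "a"}]

# Project ?x and ?y, where ?y never appears in the WHERE clause.
projected = project(rows, ["x", "y"])
for row in projected:
    print(row)
```

The row count is unchanged (duplicates included), and the y column is
present but unbound in every row.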
>
>
>> I'm not versed enough to label things universal or empty relations, but
>> the evaluation of a SPARQL UNION is defined as the multiset-union of the
>> evaluation of the two branches of the UNION. So:
>>
>> { A } UNION { { } }
>>
>> is multiset-union(eval(A), {{}}) -- that is, add the one empty solution
>> to the solutions from evaluating A.
>>
>
> It's just very odd behavior - and a bit inexplicable - especially the
> multiple union of {{}}.

Multiset-union: duplicates are not discarded.

Perhaps a better way to explain it is by reduction:

{ A } UNION { B }, where B is an empty group graph pattern (expressed  
in SPARQL syntax as "{ }")
=
multiset-union(eval(A), eval(B))
   ...
   eval(B) => one empty solution
   eval(A) => whatever result it gives
   ...
=
multiset-union([sol. a, sol. b, ...], [empty solution])

I tried to avoid using {}, so [] stand as multiset delimiters.
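
The reduction above can be sketched in Python with collections.Counter
standing in for a multiset. The helper names (freeze, multiset_union) and
the sample solutions for eval(A) are made up for illustration; solutions
are frozen so that they can be counted.

```python
from collections import Counter


def freeze(solution):
    """Make a solution (a dict of bindings) hashable so it can be counted."""
    return frozenset(solution.items())


def multiset_union(left, right):
    """Counter addition sums multiplicities, so duplicates are kept."""
    return left + right


# Hypothetical eval(A): two solutions binding ?x.
eval_a = Counter([freeze({"x": "a"}), freeze({"x": "b"})])
# eval(B) for the empty group graph pattern: one empty solution.
eval_b = Counter([freeze({})])

result = multiset_union(eval_a, eval_b)
twice = multiset_union(result, eval_b)  # union with { } a second time

print(sum(result.values()))   # total number of solutions
print(twice[freeze({})])      # multiplicity of the empty solution
```

Each further union with an empty group adds one more copy of the empty
solution rather than being absorbed.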

>>> The added confusion is that I don't understand the current SPARQL
>>> result of UNIONing {} in SPARQL as you end up with something that is
>>> neither a usual result nor an identity but a combination of the two
>>> (which is where the conversation started).  It seems to be
>>> correct/valid to keep collecting these empty sets (unless you
>>> eliminate them with a distinct), what does that mean?
>>
>> I don't understand the question - maybe a test case would clarify?
>>
>
> Hopefully the query at the top of this response is good enough.  It's
> a bit hard to work out now, though, where the problem lies.  Is it the
> result serialization, the graph pattern syntax and/or the definition
> of the empty graph pattern?

I think your confusion lies in where things are evaluated, and in which  
things are sets.

{} UNION {} does not mean "the union of two empty sets". It means "the  
multiset-union of the results of evaluating two empty group graph  
patterns". Each empty group graph pattern yields one solution with an  
empty set of bindings, so you end up with two such solutions in the  
output -- remember, the output is a multiset which is converted into an  
ordered list.

This is exactly what twinql gives:

sparql(6): (run-sparql "SELECT * { {} UNION {} }")
<?xml version="1.0"?>
<sparql xmlns="http://www.w3.org/2005/sparql-results#">
   <head>
   </head>
   <results>
     <result>
     </result>
     <result>
     </result>
   </results>
</sparql>

Two things you might not expect: the output is a multiset, not a set  
(unless you apply DISTINCT), and UNION does not apply to its inputs  
directly; it applies to the results of evaluating those inputs as graph  
patterns.
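
To make the DISTINCT point concrete, here is a small Python sketch. It is
an illustration only, with hypothetical helper names (union, distinct) and
a list of dicts standing in for the ordered solution sequence.

```python
def union(left, right):
    """UNION over already-evaluated operands: concatenate their solutions."""
    return left + right


def distinct(solutions):
    """Drop duplicate solutions, keeping the first occurrence of each."""
    seen = set()
    out = []
    for solution in solutions:
        key = frozenset(solution.items())
        if key not in seen:
            seen.add(key)
            out.append(solution)
    return out


empty_group = [{}]  # evaluating "{ }" gives one solution with no bindings

combined = union(empty_group, empty_group)   # SELECT * { {} UNION {} }
deduplicated = distinct(combined)            # SELECT DISTINCT * { ... }

print(len(combined), len(deduplicated))
```

Without DISTINCT both empty solutions survive; with it they collapse to
one.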

Does that help?
Received on Tuesday, 18 March 2008 05:58:50 GMT
