Re: Another attempt...

Andrew Newman wrote:
> On 18/03/2008, Richard Newman <rnewman@twinql.com> wrote:
>> Each of those nested {}s is a group graph pattern, not a graph
>>  pattern. Each of them is unique, but in any case what you wrote parses
>>  to this:
>>
>>  (Union
>>    (Union
>>      (Union
>>        (GroupGraphPattern)
>>        (GroupGraphPattern))
>>      (GroupGraphPattern))
>>    (GroupGraphPattern))
>>
>>  Note that those GroupGraphPatterns are empty (so whether they are sets
>>  or multisets doesn't matter), and there are four of them.
>>
> 
> How are they unique?
> 
> Union (multiset or no) is commutative and associative.  So it makes no
> difference what order I evaluate them in.  After I parse it I create
> something equivalent - UNION { {} {} {} {} } (a set) - where I just
> evaluate them in whatever order - it makes no difference if I evaluate
> it as UNION { {} }  or UNION { {} times a million }.  I'll still get
> the same result if I consider that one {} in one place is equal to {}
> in another.

SPARQL union (the operator in the algebra section) is a word with a definition 
in the spec.

There is also SPARQL UNION (the syntactic form).  The translation from syntax 
to algebra expression maps syntactic-UNION to algebra-union while at the same 
time mapping the arguments.

Neither of these are the results.  There is another step which is the 
evaluation of the algebra expression.

There seems to be skipping between these different stages going on. This seems 
to be hidden when talking about "meaning" and "represents".

>>  >> I'm not versed enough to label things universal or empty relations,
>>  >> but
>>  >> the evaluation of a SPARQL UNION is defined as the multiset-union
>>  >> of the
>>  >> evaluation of the two branches of the UNION. So:
>>  >>
>>  >> { A } UNION { { } }
>>  >>
>>  >> is multiset-union(eval(A), {{}}) -- that is, add the one empty
>>  >> solution
>>  >> to the solutions from evaluating A.
>>  >>
>>  >
>>  > It's just very odd behavior - and a bit inexplicable - especially the
>>  > multiple union of {{}}.
>>
>>
>> Multiset-union: duplicates are not discarded.
>>
> 
> But think about what {} means.  The SPARQL document says it represents
> the identity for JOIN.  So everytime you ask it for something it
> returns that - so it represents everything (universal set).  Now you
> union it with whatever, what do you have?   You have itself -
> everything.  You can't have more than that because by definition
> that's what it is.  It doesn't matter that it's multiset or not.
> Because the meaning of {} is everything.

Which "{}" are you referring to here?

1/ The empty multiset
2/ A syntax form in SPARQL that is the set of no triples patterns.


I don't understand the use of "means" and "represents".

Sec 5.2.1:
"""
The group pattern:

{ }

matches any graph (including the empty graph) with one solution that does not 
bind any variables.
"""
the keyword is "matches".

In sec 12, evaluation is defined and evaluation of the set of no triples 
patterns is not the empty set, it's the set of the empty set.

Evaluation is the process of going from algebra expression to muliset of results.

The join-identity is {{}} (set notation) - this follows from the definition of 
Join().

 > It doesn't make sense to
> have everything plus something.
> 
> So that's why I'm saying the result currently from SPARQL
> implementations (which I figure are just following ARQ) doesn't make
> sense.
> You are all obviously reading for the same page - but I'd like to know
> where that comes from.

The evaluation of a SPARQL query string is defined in the spec in section 12. 
  If you have an example where the spec says one thing and the implementations 
do another, then the implementations are wrong.

Do you have an example?  So far, you have stated what you want the answers to 
be but I don't see how that maps to the definitions.  Let's work through a 
single example from syntax to algebra expression to evaluation.  The reuse of 
{} can be confusing - but there are only a few delimiters on the keyboard.


Query string: { {?s ?p ?o} UNION {} }
Data: :x :p 1

Sec "12.2.1 Converting Graph Patterns"
Query string => algebra expression:

     (union
       (BGP (triple ?s ?p ?o))
       (BGP)
      )

Sec "12.5 Evaluation Semantics"

eval (union (BGP (triple ?s ?p ?o)), (BGP))
= union (eval(BGP (triple ?s ?p ?o)), eval((BGP)))


Sec "12.3 Basic Graph Patterns"

(BGP (triple ?s ?p ?o)) => { (?s, :x) (?p, :p) (?o, 1) }
(BGP) => { {} }

so:
eval (union (BGP (triple ?s ?p ?o)), (BGP))
=
   union ({ (?s, :x) (?p, :p) (?o, 1) }, { {} })

and from sec "12.4 SPARQL Algebra"

eval (union (BGP (triple ?s ?p ?o)), (BGP))
=
{
    { (?s, :x) (?p, :p) (?o, 1) },
    {}
}

Is there a step here that you think is wrong?
Which text in the spec?



It seems to me that there may be something else going on which is a desire for 
a different algebra (a relation algebra close to that of SQL perhaps) hidden 
behind using "represents" and "means".  That is a different argument, and not 
one about how the SPARQL algebra works, but instead about the design choices 
that have been made.  The implementations implement the spec via the defintion 
of the three step process of parsing the query string, conversion to the 
algebra expression, and evaluation of that expression.

	Andy

>>  Two things you might not expect: the output is a multiset, not a set
>>  (unless you apply DISTINCT), and UNION does not apply to its input, it
>>  applies to the output of its inputs when evaluated as query forms.
>>
>>  Does that help?
>>
> 
> I get that the output is a multiset by default and a set if DISTINCT
> is used.  I don't get what you mean by the "output of its inputs"
> (which of course may make my entire argument about look silly).

-- 
Hewlett-Packard Limited
Registered Office: Cain Road, Bracknell, Berks RG12 1HN
Registered No: 690597 England

Received on Tuesday, 18 March 2008 12:11:52 UTC