Re: Another attempt... from Andrew Newman on 2008-03-25 (public-rdf-dawg-comments@w3.org from March 2008)

From: Andrew Newman <andrewfnewman@gmail.com>
Date: Wed, 26 Mar 2008 04:00:03 +1000
To: "Seaborne, Andy" <andy.seaborne@hp.com>
Cc: "public-rdf-dawg-comments@w3.org" <public-rdf-dawg-comments@w3.org>
Message-ID: <2db5a5c40803251100p10cf0fd1w3337454e9dea67d0@mail.gmail.com>
On 25/03/2008, Seaborne, Andy <andy.seaborne@hp.com> wrote:
> In [2008Mar/0013] there is a complete worked example with "UNION {}" and it shows how the results are obtained by referencing the SPARQL specification at each step.  The purpose of the example in [2008Mar/0013] is so we can find the first point of difference or first point where pre-requisite knowledge [2008Mar/0029] is used.
>

Here is a list of my suggestions.

Rename the current Ω0 to Ω1, so that Ω0 becomes:
"Write Ω0 for the multiset consisting of no mappings, with cardinality
0.  It's not expressible in SPARQL syntax.  It is the UNION identity."

The definition of Ω1 (currently Ω0):
"Write Ω1 for the multiset consisting of exactly the empty mapping μ0,
with cardinality 1. This is the join identity; it is the empty graph
pattern and, as a solution mapping, it is represented as { {} }".

If you don't like having Ω0 it doesn't matter - please consider the
new definition of Ω1 (the old Ω0) separately.

Maybe make it clear that in SPARQL:
* X JOIN Ω1 = X and X UNION Ω0 = X, but that
* X UNION Ω1 != X and, in general, X JOIN Ω0 != X (it is Ω0).
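
A minimal Python sketch of these identities, under the assumption that a
solution mapping is modelled as a frozenset of (variable, value) pairs and
a multiset of mappings as a collections.Counter. The names MU0, OMEGA0,
OMEGA1, compatible, join and union are illustrative only and are not taken
from the specification.

```python
from collections import Counter

# Assumed model: mapping = frozenset of (variable, value) pairs,
# multiset of mappings = Counter keyed by mapping.
MU0 = frozenset()            # the empty mapping
OMEGA0 = Counter()           # no mappings, cardinality 0
OMEGA1 = Counter({MU0: 1})   # exactly the empty mapping, cardinality 1

def compatible(m1, m2):
    """True if both mappings agree on every variable they share."""
    d1, d2 = dict(m1), dict(m2)
    return all(d1[v] == d2[v] for v in d1.keys() & d2.keys())

def join(a, b):
    """Multiset join: merge every compatible pair, multiplying cardinalities."""
    out = Counter()
    for m1, c1 in a.items():
        for m2, c2 in b.items():
            if compatible(m1, m2):
                out[m1 | m2] += c1 * c2
    return out

def union(a, b):
    """Multiset union: cardinalities add up."""
    return a + b

# An arbitrary example multiset X with three solutions in total.
X = Counter({frozenset({("s", ":a")}): 1, frozenset({("s", ":x")}): 2})

print(join(X, OMEGA1) == X)        # OMEGA1 is the JOIN identity
print(union(X, OMEGA0) == X)       # OMEGA0 is the UNION identity
print(union(X, OMEGA1) == X)       # False: one extra empty mapping is added
print(join(X, OMEGA0) == OMEGA0)   # joining with nothing yields nothing
```

In this model the two identities are not interchangeable: UNION with Ω1
adds one solution, and JOIN with Ω0 removes all of them.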

Another confusion with UNION in the SPARQL specification was the
difference between it and the Perez paper - set union not multiset/bag
union.  The specification switches between multiset and set union when
one definition would've done.  But I think that's probably outside the
scope of the changes.

Other changes not related to UNION.

I'd also change in 12.3:
"Two solution mappings μ1 and μ2 are compatible if, for every variable
v in dom(μ1) and in dom(μ2), μ1(v) = μ2(v)."

Again, maybe add the material from the Perez paper stating that mappings
with disjoint domains are compatible and that μ0 is compatible with
everything, so there is much less to derive. I was testing set
membership instead (as that is what occurs in a relational JOIN), which
seems very dumb now, but it wasn't clear at the time. I haven't been
able to misinterpret the Perez paper in the same way.  "For every" is
standard wording, but maybe "for all" makes it a bit clearer.
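
The compatibility definition quoted above can be sketched in Python as
follows, assuming a solution mapping is modelled as a plain dict from
variable name to value. The function name and the sample mappings are
hypothetical and only serve to show the disjoint-domain and μ0 cases.

```python
def compatible(mu1, mu2):
    """True if mu1(v) == mu2(v) for every variable v in both domains."""
    shared = mu1.keys() & mu2.keys()
    # all() over an empty collection is True, which is exactly why
    # disjoint domains and the empty mapping are always compatible.
    return all(mu1[v] == mu2[v] for v in shared)

mu0 = {}                          # the empty mapping
a = {"s": ":a", "p": ":b"}
b = {"s": ":a", "o": ":c"}        # shares ?s with a, same value
c = {"s": ":x"}                   # shares ?s with a, different value
d = {"o": ":z"}                   # domain disjoint from a

print(compatible(a, b))    # shared variable agrees
print(compatible(a, c))    # shared variable disagrees
print(compatible(a, d))    # disjoint domains
print(compatible(mu0, a))  # empty mapping against anything
```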

>  In [2008Mar/0015] you said of that example:
>
> """
>  It simple to evaluate you don't need any
>  steps if you are UNIONing the identity for JOIN.
>  """
>
>
> I can't find mention of this - which text in the definition of the evaluation of UNION does this refer to?  (For quick reference: the specification gives the definition for union of two solution multi-sets at [1]; the join identity is the multiset {{}}, cardinality 1. (sec 12.3) - I can work through that point if that would help.)
>

This refers to my completely made-up definition, which rested on
incorrect definitions of Ω0, identities and compatibility, based on my
previous notions about relations and on misunderstanding the
specification.

>
>  If you are raising a test case against the text of the specification, could you work through the example in [2008Mar/0013] so we can identify the step or steps where you differ from the description in that message.
>

I think we now agree on the results of your example.

Here are 5 extra tests that I suggest adding.  Hopefully they clarify
the specification and make this reading normative through the test
suite.  Hopefully my results are understandable - I haven't used a
standard syntax.

Test 1
=====
This test is to show that the empty graph pattern (5.2.1) is the
JOIN identity (12.3) as mentioned at the end of 12.3.1.

Data:
 :a :b :c .
 :x :y :z .

Query:
SELECT * WHERE { { ?s ?p ?o } . {} }

Results:
( [ ?s = :a, ?p = :b, ?o = :c ],
  [ ?s = :x, ?p = :y, ?o = :z]
)

Test 2
=====
This is to show that the UNION of a graph pattern and the empty graph
pattern returns all the solutions of the first pattern plus one extra
result (μ0).

Data:
 :a :b :c .
 :x :y :z .

Query:
SELECT * WHERE { { ?s ?p ?o } UNION {} }

Results:
( [ ?s = :a, ?p = :b, ?o = :c ],
  [ ?s = :x, ?p = :y, ?o = :z],
  []
)

Test 3
=====
This is to show that UNION is a multiset UNION and that two empty graph
patterns give two extra results (two μ0).

Data:
 :a :b :c .
 :x :y :z .

Query:
SELECT * WHERE { { ?s ?p ?o } UNION {} UNION {} }

Results:
( [ ?s = :a, ?p = :b, ?o = :c ],
  [ ?s = :x, ?p = :y, ?o = :z],
  [],
  []
)
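
A small Python sketch of why Test 3 expects four results rather than
three, assuming multisets of solutions are modelled with
collections.Counter and mappings with frozensets of (variable, value)
pairs. The variable names are illustrative only.

```python
from collections import Counter

MU0 = frozenset()  # the empty mapping

# Solutions of { ?s ?p ?o } over the two triples in the test data.
pattern = Counter({
    frozenset({("s", ":a"), ("p", ":b"), ("o", ":c")}): 1,
    frozenset({("s", ":x"), ("p", ":y"), ("o", ":z")}): 1,
})
# Solutions of the empty graph pattern {}: one empty mapping.
empty_pattern = Counter({MU0: 1})

# Multiset (bag) union: cardinalities add, so both empty mappings survive.
bag = pattern + empty_pattern + empty_pattern

# Plain set union, for contrast: the duplicate empty mapping collapses.
as_set = set(pattern) | set(empty_pattern) | set(empty_pattern)

print(sum(bag.values()))  # number of results under multiset union
print(bag[MU0])           # how many of them are the empty mapping
print(len(as_set))        # number of results under set union
```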

Test 4
=====
This is to show that compatibility in SPARQL is universal
quantification (12.3 "Compatible Mappings") and that two disjoint
domains result in a cross product.

Data:
 :a :b :c .
 :x :y :z .

Query:
SELECT * WHERE { { ?s ?p ?o } . { ?a ?b ?c } }

Results:
( [ ?s = :a, ?p = :b, ?o = :c, ?a = :a, ?b = :b, ?c = :c ],
  [ ?s = :a, ?p = :b, ?o = :c, ?a = :x, ?b = :y, ?c = :z ],
  [ ?s = :x, ?p = :y, ?o = :z, ?a = :a, ?b = :b, ?c = :c],
  [ ?s = :x, ?p = :y, ?o = :z, ?a = :x, ?b = :y, ?c = :z]
)
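
The expected results of Test 4 (and of Test 1) can be reproduced with a
short Python sketch, assuming mappings are dicts and a sequence of
solutions is a list. The compatible and join helpers are hypothetical
stand-ins for the definitions in 12.3, not code from any SPARQL engine.

```python
def compatible(mu1, mu2):
    """True if the mappings agree on every variable they share."""
    return all(mu1[v] == mu2[v] for v in mu1.keys() & mu2.keys())

def join(omega1, omega2):
    """Merge every compatible pair of mappings, keeping duplicates."""
    return [{**m1, **m2}
            for m1 in omega1
            for m2 in omega2
            if compatible(m1, m2)]

# Solutions of { ?s ?p ?o } and { ?a ?b ?c } over the test data.
left = [{"s": ":a", "p": ":b", "o": ":c"},
        {"s": ":x", "p": ":y", "o": ":z"}]
right = [{"a": ":a", "b": ":b", "c": ":c"},
         {"a": ":x", "b": ":y", "c": ":z"}]

# Disjoint domains: every pair is compatible, so this is a cross product.
cross = join(left, right)
print(len(cross))
for mapping in cross:
    print(mapping)

# Test 1: joining with the empty graph pattern leaves the solutions as-is.
print(join(left, [{}]) == left)
```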

Test 5
=====
Negative syntax test.

SELECT ?x WHERE { ?s ?p ?o }
Received on Tuesday, 25 March 2008 18:00:36 UTC