- From: Seaborne, Andy <andy.seaborne@hp.com>
- Date: Mon, 6 Jul 2009 22:18:00 +0000
- To: Eric Prud'hommeaux <eric@w3.org>, "public-rdf-dawg@w3.org" <public-rdf-dawg@w3.org>
> -----Original Message-----
> From: public-rdf-dawg-request@w3.org [mailto:public-rdf-dawg-request@w3.org]
> On Behalf Of Eric Prud'hommeaux
> Sent: 06 July 2009 21:05
> To: public-rdf-dawg@w3.org
> Subject: MINUS vs. UNSAID
>
> I've been noodling on this for a while, trying to see practical
> differences between MINUS (evaluated purely bottom-up) and UNSAID
> (which appears to seed the outermost BGP with variable bindings from
> outside).
>
> I implemented UNSAID probably 6 years ago in Algae2, and I complained
> bitterly that the bottom-up semantics of SPARQL were confusing for
> users. Despite that, I have to say that MINUS is much more clear to me
> as UNSAID carries variable bindings into nested evaluations in ways I
> don't fully understand.
UNSAID (NOT EXISTS below) is a filter.
PREFIX : <http://example/>
SELECT *
WHERE
{ ?a :p1 ?b
NOT EXISTS
{ ?a :p2 ?c .
?c :p3 ?b
}
}
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
(base <file:///c:/home/afs/Projects/ARQ/>
(prefix ((: <http://example/>))
(filter (! (exists
(bgp
(triple ?a :p2 ?c)
(triple ?c :p3 ?b)
)))
(bgp (triple ?a :p1 ?b)))))
So it is bottom up evaluation as usual but it's not based on a join/leftjoin.
It's
{ ?a :p1 ?b FILTER (give ?a and ?b does <expr> evaluate to true?....) }
But without the mobility of FILTER.
In this case the same query is:
{ ?a :p1 ?b FILTER(NOT EXISTS{ ?a :p2 ?c . ?c :p3 ?b}) }
Because it is a the end of the first BGP anyway.
>
> Following are some thought experiments (doing subtraction by hand):
>
> Data D1: { :a1 :p1 :b1 ;
> :p2 :c1 .
> :c1 :p3 :b2 }
>
The significance point Eric is making in this query is the use of ?b in the first triple pattern and also as object for :p3.
> Query M1: { ?a :p1 ?b MINUS { ?a :p2 ?c . ?c :p3 ?b } }
> ?a ?b - ?a ?b ?c = ?a ?b
> a1 b1 a1 b2 c1 a1 b1
> Result M1: ?a ?b
> a1 b1
Try: ?c as a bNode (the effect is to project it out of the way)
?a ?b - ?a ?b = <nothing>
a1 b1 a1 b2
which seems odd to me. Why does having _:c and not ?c make a logical difference?
>
> Query M2: { ?a :p1 ?b MINUS { ?a :p2 ?c OPTIONAL { ?c :p3 ?b } } }
> ?a ?b - ?a ?b ?c = ?a ?b
> a1 b1 a1 b2 c1 a1 b1
> Result M2: ?a ?b
> a1 b1
>
>
> AndyS performed these on ARQ's UNSAID implementation:
>
> Query U1: { ?a :p1 ?b UNSAID { ?a :p2 ?c . ?c :p3 ?b } }
> Result U1: ?a ?b
> a1 b1
>
> Query U2: { ?a :p1 ?b UNSAID { ?a :p2 ?c OPTIONAL { ?c :p3 ?b } } }
> Result U2: ?a ?b # no bindings
>
> AndyS, could you tell me what you get when you try Query U2 on:?
> Data D2: { :a1 :p1 :b1 ;
> :p2 :c1 } # no :p3 arc
> I expect the same as U2.
Yes.
>
> Looking for real-world questions which would elicit these differences:
>
> Data D3
> _:eric :givenname "eric"; :holdsAccout <act1>. <act1> :accountName "eric".
> _:bobS :givenname "bob" ; :holdsAccout <act2>. <act2> :accountName "bobs".
> _:eve :givenname "eve" ; :holdsAccout <act3>.
> (
> or in tabular form for brevity:
> givenname holdsAccount accountName
> _:eric "eric" <act1> "eric"
> _:bobS "bob" <act2> "bobs"
> _:eve "eve" <act3>
> )
>
> Find the folks for whom I need to create accounts based on their given name:
>
> Query M3:
> ?who foaf:givenname ?name
> MINUS {
> ?who foaf:holdsAccount ?act
> OPTIONAL {
> ?act foaf:accountName ?name .
> }
> }
> Result M3:
> ?who ?name ?who ?act ?name ?who ?name
> _:eric "eric" _:eric <act1> "eric" _:bobS "bob"
> _:bobS "bob" - _:bobS <act2> "bobS" = _:eve "eve"
> _:eve "eve" _:eve <act3>
>
>
> Query U3:
> ?who foaf:givenname ?name
> MINUS {
> ?who foaf:holdsAccount ?act
> OPTIONAL {
> ?act foaf:accountName ?name .
> }
> }
>
> Result U3 (extrapolated from UNSAID results above):
> ?who ?nam
> _:eve "eve"
>
>
>
> Note, these are untested, please verify.
> --
> -eric
>
> office: +1.617.258.5741 32-G528, MIT, Cambridge, MA 02144 USA
> mobile: +1.617.599.3509
>
> (eric@w3.org)
> Feel free to forward this message to any list for any purpose other than
> email address distribution.
>
> There are subtle nuances encoded in font variation and clever layout
> which can only be seen by printing this message on high-clay paper.
This email is a reply to the on-screen version.
Andy
Received on Monday, 6 July 2009 22:18:26 UTC