
Re: summary on options for JP-4 Comment about the semantics of property paths

From: Axel Polleres <axel.polleres@deri.org>
Date: Sat, 11 Feb 2012 19:14:02 +0100
Cc: "Lee Feigenbaum" <lee@thefigtrees.net>, "SPARQL Working Group" <public-rdf-dawg@w3.org>
Message-Id: <3206DD6E-A31E-410E-B841-591592FF5B2E@deri.org>
To: "Gregory Williams" <greg@evilfunhouse.com>

On 11 Feb 2012, at 18:31, Gregory Williams wrote:

> On Feb 11, 2012, at 8:26 AM, Axel Polleres wrote:
> 
> > I see some merits for option 2, particularly, because I am unclear about how optimizations can be defined in general
> > for DISTINCT at arbitrary places, whereas it seems to be clear for DISTINCT around path expressions.
> > So, if we know that now, why not fix it now?
> 
> Have we settled that wrapping a property path pattern in a distinct subquery satisfies Jorge's desired existential semantics?

That is the point... I don't know how this can be done, thus my doubts about Option 1:
I asked for help on that in one of my earlier mails... Quoting:

 >   it seems (looking at their results in Section 7.1) that this is orthogonally possible with our current semantics
 >   by just wrapping any TriplesSameSubjectPath containing a property path into a DISTINCT subquery.
 >   Their result in Section 7.1 seems to indicate something along these lines, but I need some help there:
 >   admittedly I don't have a formal proof for this equivalence yet (to be cautious, I am not yet 100% clear how/whether there is
 >   any interference possible with bnodes within TriplesSameSubjectPath and duplicates coming from those bnodes)

To make this more tangible, what I was thinking is the following: 

Take 

 Q:  SELECT ?s WHERE { ?s :p/:q [] }

and the following graph

  :s1 :p [ :q :o1 ].
  :s1 :p [ :q :o1 ].
  :s1 :p [ :q :o2 ].

Now, query Q obviously returns duplicate results... 

-------
| s   |
=======
| :s1 |
| :s1 |
| :s1 |
-------

... the first two due to path counting for :o1 and the third one for :o2.

If I just wrap each BGP with a path into a DISTINCT subquery mechanically, something like:

  SELECT ?s WHERE { SELECT DISTINCT ?s { ?s :p/:q [] } }

I get

-------
| s   |
=======
| :s1 |
-------

right?

But if I take the JP-4 semantics (the path :p/:q matched existentially, i.e. once per distinct end node, here :o1 and :o2), I get

-------
| s   |
=======
| :s1 |
| :s1 |
-------

don't I?
So, the problem is that without DISTINCT() around the path, I have no "handle" to
provide an equivalent rewriting to their semantics, it seems, at least not a straightforward one...
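
The three result tables for this example can be reproduced with a small sketch.
It is only an illustration under stated assumptions, not an implementation of
either semantics: the graph is hard-coded, b1/b2/b3 stand for the three blank
nodes created by the three [ ... ] constructs, all function names are
hypothetical, and "existential" encodes the reading that each (start, end)
pair of the path is matched at most once before ?s is projected.

```python
from collections import Counter

# Example graph from the mail; b1, b2, b3 are the three distinct blank nodes.
TRIPLES = [
    ("s1", "p", "b1"), ("b1", "q", "o1"),
    ("s1", "p", "b2"), ("b2", "q", "o1"),
    ("s1", "p", "b3"), ("b3", "q", "o2"),
]


def sequence_path(triples, first, second):
    """Return one (start, end) pair per route through first/second."""
    return [
        (s, o)
        for (s, p1, mid) in triples if p1 == first
        for (mid2, p2, o) in triples if p2 == second and mid2 == mid
    ]


def counting(triples):
    # Path counting: every route contributes a solution, then project ?s.
    return Counter(s for s, _ in sequence_path(triples, "p", "q"))


def existential(triples):
    # Assumed reading of the existential semantics: each (start, end) pair
    # is kept once, then ?s is projected (duplicates across end nodes remain).
    return Counter(s for s, _ in set(sequence_path(triples, "p", "q")))


def distinct_subquery(triples):
    # SELECT DISTINCT ?s around the path: ?s is projected first, then deduplicated.
    return Counter({s for s, _ in sequence_path(triples, "p", "q")})


if __name__ == "__main__":
    print("counting:         ", counting(TRIPLES)["s1"])
    print("existential:      ", existential(TRIPLES)["s1"])
    print("distinct subquery:", distinct_subquery(TRIPLES)["s1"])
```

The gap between the last two numbers is the issue: DISTINCT on ?s also removes
the duplicate that stems from the two different end nodes.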


> We've provided reasons for why we think the counting semantics are important to have, and it seems as if they can be turned into the existential semantics with subqueries. This seems very much like the limit-per-resource use case to me; I'm not thrilled by the syntax required to do it, but it's possible now, and systems for which performance is critical can optimize for this case.
> Further, systems can experiment with deciding on how best to implement the existential semantics directly into the syntax, choosing between the many options that Andy lays out in his email, and that would provide important data for some future WG to consider.

... alright, if someone has an idea how to emulate their semantics in our existing syntax, I'd feel better about Option 1.
At the moment, I am simply unsure how to tackle Option 1.

cheers,
Axel

> .greg
> 
> 
Received on Saturday, 11 February 2012 18:14:33 GMT
