Re: summary on options for JP-4 Comment about the semantics of property paths from Andy Seaborne on 2012-02-11 (public-rdf-dawg@w3.org from January to March 2012)

From: Andy Seaborne <andy.seaborne@epimorphics.com>
Date: Sat, 11 Feb 2012 18:44:17 +0000
To: public-rdf-dawg@w3.org
Message-ID: <4F36B701.5060905@epimorphics.com>
On 11/02/12 18:14, Axel Polleres wrote:
>
> On 11 Feb 2012, at 18:31, Gregory Williams wrote:
>
>> On Feb 11, 2012, at 8:26 AM, Axel Polleres wrote:
>>
>>> I see some merits for option 2, particularly, because I am unclear about how optimizations can be defined in general
>>> for DISTINCT at arbitrary places, whereas it seems to be clear for DISTINCT around path expressions.
>>> So, if we know that now, why not fix it now?
>>
>> Have we settled that wrapping a property path pattern in a distinct subquery satisfies Jorge's desired existential semantics?
>
> That is the point... I don't know how this can be done, thus my doubts about Option 1:
> I asked for help on that in one of my earlier mails... Quoting:
>
>   >    it seems (looking at their results in Section 7.1) that this is orthogonally possible with our current semantics
>   >    by just wrapping any TriplesSameSubjectPath containing a property path into a DISTINCT subquery.
>   >    It seems their result in section 7.1 indicates something along these lines, but I need some help there:
>   >    I admittedly don't have a formal proof for this equivalence yet (to be cautious, I am not yet 100% clear how/whether there is
>   >    any interference possible with bnodes within TriplesSameSubjectPath and duplicates coming from those bnodes)
>
> To make this more tangible, what I was thinking is the following:
>
> Take
>
>   Q:  SELECT ?s WHERE { ?s :p/:q [] }
>
> and the following graph
>
>    :s1 :p [ :q :o1 ].
>    :s1 :p [ :q :o1 ].
>    :s1 :p [ :q :o2 ].
>
> Now, query Q obviously returns duplicate results...
>
> -------
> | s   |
> =======
> | :s1 |
> | :s1 |
> | :s1 |
> -------
>
> ... the first two due to path counting for :o1 and the third one for :o2.
>
> If I just wrap  each BGP with a path into a DISTINCT subquery mechanically, something like:
>
>    SELECT ?s WHERE { SELECT DISTINCT ?s { ?s :p/:q [] } }
>
> I get
>
> -------
> | s   |
> =======
> | :s1 |
> -------
>
> right?
>
> But, if I take the JP-4 semantics, I get
>
> -------
> | s   |
> =======
> | :s1 |
> | :s1 |
> -------
>
> don't I?

I don't think talking about bnodes helps, because the [] syntax
introduces an implicit projection as well.  We could rewrite them as
named variables, do the projections/DISTINCT as needed, then project the
renamed bnodes out of the results as a query transformation.  That way,
you can do whatever you want without a forced projection.

SELECT DISTINCT ?S WHERE { ?S :p/:q ?O }
-------
| S   |
=======
| :s1 |
-------

SELECT DISTINCT ?S ?O WHERE { ?S :p/:q ?O }
--------------
| S   | O    |
==============
| :s1 | :o2  |
| :s1 | :o1  |
--------------

SELECT ?S WHERE { { SELECT DISTINCT ?S ?O { ?S :p/:q ?O } } }
-------
| S   |
=======
| :s1 |
| :s1 |
-------
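
The three cardinalities above (three rows with no DISTINCT, one row with
DISTINCT on ?S, two rows with DISTINCT on ?S ?O inside a subquery) can be
reproduced with a small simulation. This is only a sketch in Python, not
how any SPARQL engine evaluates paths: the helper names seq_path,
distinct and project_s are made up for illustration, and the data is
Axel's example graph with each [] written out as a labelled blank node.

```python
# Axel's example graph; each [] is given an explicit blank node label.
TRIPLES = [
    ("s1", "p", "_:b1"), ("_:b1", "q", "o1"),
    ("s1", "p", "_:b2"), ("_:b2", "q", "o1"),
    ("s1", "p", "_:b3"), ("_:b3", "q", "o2"),
]


def seq_path(triples, first, second):
    """Evaluate ?S first/second ?O, emitting one (S, O) row per matching path."""
    rows = []
    for s, p1, mid in triples:
        if p1 != first:
            continue
        for mid2, p2, o in triples:
            if p2 == second and mid2 == mid:
                rows.append((s, o))  # the intermediate node is projected away
    return rows


def distinct(rows):
    """Remove duplicate rows, keeping first-seen order."""
    return list(dict.fromkeys(rows))


def project_s(rows):
    """Keep only the ?S column; duplicates are preserved."""
    return [s for s, _ in rows]


paths = seq_path(TRIPLES, "p", "q")

# SELECT ?S { ?S :p/:q ?O }
counting = project_s(paths)
# SELECT DISTINCT ?S { ?S :p/:q ?O }
outer_distinct = distinct(project_s(paths))
# SELECT ?S { { SELECT DISTINCT ?S ?O { ?S :p/:q ?O } } }
inner_distinct = project_s(distinct(paths))

print(len(counting), len(outer_distinct), len(inner_distinct))
```

The only difference between the last two results is whether the
duplicate removal happens before or after ?O is projected away, which is
the choice that the explicit subquery makes visible.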

> So, the problem is that without DISTINCT() around the path, I have no "handle" to
> provide an equivalent rewriting to their semantics, it seems, at least, not straightforward...
>
>
>> We've provided reasons for why we think the counting semantics are important to have, and it seems as if they can be turned into the existential semantics with subqueries. This seems very much like the limit-per-resource use case to me; I'm not thrilled by the syntax required to do it, but it's possible now, and systems for which performance is critical can optimize for this case.
>> Further, systems can experiment with deciding on how best to implement the existential semantics directly into the syntax, choosing between the many options that Andy lays out in his email, and that would provide important data for some future WG to consider.
>
> ... alright, if someone has an idea how to emulate their semantics in our existing syntax, I'd feel better about Option1
> At the moment, I am simply unsure how to tackle option1.

They make a proposal, and it is one possible semantics, not the only
possibility.

XPath takes its own, different, view where nodes and atomic values have
different cardinality rules.

http://www.w3.org/TR/xpath20/#id-path-expressions
   Bullet 1 => nodes are unique
   Bullet 2 => atomic values are not made unique
   Bullet 3 => you can't mix.
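
As a rough illustration of those three bullets, here is a sketch in
Python of how the results of a path step might be combined. It is an
assumption-laden toy, not XPath itself: the Node class and the
combine_step_results function are invented for this example, a Python
TypeError stands in for the XPath type error, and the document-order
sorting that XPath also applies to node results is left out.

```python
class Node:
    """Stand-in for an XML node; equality is object identity."""

    def __init__(self, name):
        self.name = name


def combine_step_results(sequences):
    """Combine the per-context results of a step under the three rules."""
    items = [item for seq in sequences for item in seq]
    has_nodes = any(isinstance(i, Node) for i in items)
    has_atomics = any(not isinstance(i, Node) for i in items)

    # Bullet 3: nodes and atomic values cannot be mixed.
    if has_nodes and has_atomics:
        raise TypeError("step returned both nodes and atomic values")

    # Bullet 1: nodes are made unique by identity.
    if has_nodes:
        seen = set()
        unique = []
        for node in items:
            if id(node) not in seen:
                seen.add(id(node))
                unique.append(node)
        return unique

    # Bullet 2: atomic values are concatenated, duplicates kept.
    return items


a, b = Node("a"), Node("b")
nodes = combine_step_results([[a, b], [a]])   # a appears once
atoms = combine_step_results([[1, 2], [1]])   # 1 appears twice
print(len(nodes), len(atoms))
```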

Option 1 controls cardinality - it is not promising to be exactly the 
same as option 2.1 or the proposal in the paper.  It gives choices.

	Andy

>
> cheers,
> Axel
>
>> .greg
>>
>>
>
>
Received on Saturday, 11 February 2012 18:44:43 UTC