Re: Comments about the semantics of property paths from jorge perez on 2011-02-10 (public-rdf-dawg-comments@w3.org from February 2011)

From: jorge perez <jorge.perez.rojas@gmail.com>
Date: Wed, 9 Feb 2011 22:19:35 -0300
To: Andy Seaborne <andy.seaborne@epimorphics.com>
Cc: public-rdf-dawg-comments@w3.org
Message-ID: <AANLkTimNePrvVBO6-Mr7D82K6q=sraK6V2iTEw5g=iXX@mail.gmail.com>

Thank you Andy, your email answers my question. I am not very
comfortable with the need to use a subquery + DISTINCT to obtain the
data that I wanted, but since considering multiple paths is
a design decision in SPARQL 1.1 I am OK with that. I would advise the
group to include that example (subquery + DISTINCT) in the
specification, and also to include the following example:

SELECT SUM(?A)
WHERE
{ :me (:friend)+ ?F .
  ?F :age ?A }

which does not give the same answer, yet is a likely way for a
first-time user of property paths to try to obtain the sum of
friend ages (instead of using a subquery). Together, the two examples
would be a good way of making users aware of how paths are evaluated
in SPARQL.
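
The three sums mentioned in this thread (60, 20 and 40) can be reproduced
with a small Python model of per-path counting over the friend data quoted
below. This is only a sketch, not a SPARQL engine: the dictionaries and the
helper name endpoints_per_path are made up for illustration, and the rule
that a path never revisits a node is an assumption added so the walk
terminates; the draft's exact rule may differ.

```python
# Toy model of per-path counting: one solution per distinct path,
# not one per distinct endpoint. All names here are hypothetical.

friend = {"me": ["f1", "f2"], "f1": ["f2"]}   # :friend edges from the thread
age = {"f1": 20, "f2": 20}                    # :age values from the thread


def endpoints_per_path(start, edges):
    """Return the endpoint of every path of length >= 1 from start.

    Assumption: a path never revisits a node, so cycles terminate.
    """
    results = []

    def walk(node, visited):
        for nxt in edges.get(node, []):
            if nxt in visited:
                continue
            results.append(nxt)          # one entry per path, duplicates kept
            walk(nxt, visited | {nxt})

    walk(start, {start})
    return results


reached = endpoints_per_path("me", friend)    # f2 appears twice
ages = [age[f] for f in reached if f in age]

sum_per_path = sum(ages)                      # query without a subquery
sum_distinct_ages = sum(set(ages))            # DISTINCT applied to the ages
sum_distinct_friends = sum(age[f] for f in set(reached) if f in age)

print(sum_per_path, sum_distinct_ages, sum_distinct_friends)
```

Only the last variant, which removes duplicates on the friends before
looking up ages, corresponds to the subquery + DISTINCT form.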

Best regards,
- jorge

On Sun, Feb 6, 2011 at 5:55 PM, Andy Seaborne
<andy.seaborne@epimorphics.com> wrote:
> Jorge,
>
> You asked:
>
>> Aggregation is actually another reason of why multiple paths to the
>> same endpoint *do not* have to be considered.
>>
>> Consider a network of friends, and assume that you want to obtain the
>> SUM of the age of all your network (friends of your friends). Then a
>> very natural way to do this is with the query (simplified syntax)
>>
>> SUM (?A)
>> :me (:friend)+/:age ?A
>>
>> The query is navigating to all the friends of my friends, then to the
>> age value of every one, and then taking the SUM. Isn't this natural?
>> But, consider the following data
>>
>> :me :friend :f1
>> :me :friend :f2
>> :f1 :friend :f2
>> :f1 :age 20
>> :f2 :age 20
>>
>> I would expect 40 as the result of the above query, but the expression
>>
>> :me (:friend)+/:age ?A
>>
>> returns
>>
>> ?A
>> 20 (for the path :me->:f1)
>> 20 (for the path :me->:f2)
>> 20 (for the path :me->:f1->:f2)
>>
>> and thus, the answer of the SUM would be 60. How do you explain the
>> result of this query to a user? Notice that using DISTINCT does not
>> solve the problem, since with DISTINCT you would obtain 20 as the SUM
>> which is also wrong.
>>
>> Is there a way to correctly answer the above query with the current
>> design of property paths?
>
> One way to get the sum of ages is:
>
> SELECT SUM(?A)
> WHERE
> { ?F :age ?A
>  { SELECT DISTINCT ?F
>   WHERE
>    { :me (:friend)+ ?F } }}
>
> This puts the uniqueness on the friends, then combines it with the ages and
> calculates the SUM. There is a split in the property path because your query
> requires distinctness on one part but not another.
>
>
> Consider the following simplified purchase order, which includes two units
> of :item1, by different paths (part of :compound1 and directly as an entry on
> the purchase order).
>
> Data:
>
> @prefix : <http://example/> .
>
> :order :contains :thing1 .
> :order :contains :compound1 .
>
> :thing1 :unitOf :item1 .
> :thing2 :unitOf :item2 .
> :thing3 :unitOf :item1 .
>
> :item2 :price 2 .
> :item1 :price 2 .
>
> :compound1 :contains :thing2 .
> :compound1 :contains :thing3 .
>
> Query:
>
> PREFIX : <http://example/>
>
> SELECT (SUM(?itemPrice) AS ?price)
> {
>  :order :contains+/:unitOf/:price ?itemPrice .
> }
>
> This returns 6 for ?price. Making the path match with DISTINCT would result
> in 2. Here, all the prices are the same but we wish to retain duplicates as
> they relate to different parts of the :order.
>
> We would be grateful if you would acknowledge that your comment has been
> answered by sending a reply to this mailing list.
>
> Andy
> On behalf of the SPARQL working group.
>
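
The purchase order figures quoted above (6 with duplicates retained, 2 with
DISTINCT on the prices) can be checked with a small Python model of per-path
counting. It is a sketch under assumptions, not a SPARQL engine: the
dictionaries mirror the quoted data, the helper name endpoints_per_path is
hypothetical, and paths are assumed never to revisit a node.

```python
# Toy model of :order :contains+/:unitOf/:price ?itemPrice with one
# solution per path. All names here are hypothetical.

contains = {
    "order": ["thing1", "compound1"],
    "compound1": ["thing2", "thing3"],
}
unit_of = {"thing1": "item1", "thing2": "item2", "thing3": "item1"}
price = {"item1": 2, "item2": 2}


def endpoints_per_path(start, edges):
    """Return the endpoint of every path of length >= 1 from start.

    Assumption: a path never revisits a node, so cycles terminate.
    """
    results = []
    stack = [(start, frozenset([start]))]
    while stack:
        node, visited = stack.pop()
        for nxt in edges.get(node, []):
            if nxt in visited:
                continue
            results.append(nxt)
            stack.append((nxt, visited | {nxt}))
    return results


# :contains+ reaches thing1, compound1, thing2 and thing3; compound1 has
# no :unitOf triple, so it contributes no price.
things = endpoints_per_path("order", contains)
item_prices = [price[unit_of[t]] for t in things if t in unit_of]

total_with_duplicates = sum(item_prices)
total_distinct_prices = sum(set(item_prices))

print(total_with_duplicates, total_distinct_prices)
```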

Received on Thursday, 10 February 2011 01:20:08 UTC