Re: Feedback on SPARQL 1.1 support for aggregates (was Re: W3C Seeks Feedback on Early Draft of SPARQL 1.1)

From: Axel Polleres <axel.polleres@deri.org>
Date: Wed, 24 Mar 2010 16:49:11 +0000
Cc: <public-rdf-dawg-comments@w3.org>, <dbarbieri@elet.polimi.it>, "Stefano Ceri" <ceri@elet.polimi.it>, "Michael Grossniklaus" <grossniklaus@elet.polimi.it>, "Daniele Maria Braga" <braga@elet.polimi.it>, "Frank van Harmelen" <Frank.van.Harmelen@cs.vu.nl>
Message-Id: <2C021619-F41C-47D8-9BDB-9A92A6248D0A@deri.org>
To: Emanuele Della Valle <emanuele.dellavalle@polimi.it>
More questions on your example queries:

SELECT ?name ?surname ?book ?numberOfBooks
       (AVG(?numberOfBooks) AS ?averageNumberOfBooks)
WHERE {
  ?auth :hasSurname ?surname .
  ?auth :hasName ?name .
  {
    SELECT ?auth (COUNT(?book) AS ?numberOfBooks)
    WHERE {
      ?auth :wrote ?book .
    }
    GROUP BY ?auth
    HAVING (?numberOfBooks > 5)
  }
}

Do you mean to return the same average number of books for all authors with more than 5 books, i.e., to repeat the average for 
every name/surname? Also, it is not clear to me what the ?book variable in the projection is meant to return (it is not in scope here).

If you don't mind, could you provide a small dataset and the expected result? That would also help us with our test cases :-)
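
To make the question concrete, here is a minimal Python sketch of one possible reading, in which a single average over the qualifying authors is repeated on every row. The dataset, the names, and the helper `authors_with_average` are all made up for illustration; this is not the intended semantics of the query, only one interpretation of it.

```python
from collections import Counter
from statistics import mean

# Hypothetical toy data: (author, book) pairs standing in for "?auth :wrote ?book".
wrote = (
    [("a1", f"b{i}") for i in range(6)]    # a1 wrote 6 books
    + [("a2", f"c{i}") for i in range(8)]  # a2 wrote 8 books
    + [("a3", "d0")]                       # a3 wrote 1 book
)
# Hypothetical (name, surname) pairs per author.
names = {"a1": ("Ann", "Alpha"), "a2": ("Bob", "Beta"), "a3": ("Cy", "Gamma")}


def authors_with_average(wrote, names, threshold=5):
    # Inner subquery: GROUP BY ?auth, COUNT(?book), HAVING count > threshold.
    counts = Counter(auth for auth, _ in wrote)
    kept = {a: n for a, n in counts.items() if n > threshold}
    # Assumed reading of the outer AVG: one value over all kept authors...
    average = mean(kept.values()) if kept else None
    # ...repeated on every name/surname row.
    return [(names[a][0], names[a][1], n, average) for a, n in sorted(kept.items())]


rows = authors_with_average(wrote, names)
for row in rows:
    print(row)
```

Under this reading, a1 and a2 each get a row carrying the same average, and a3 is dropped by the HAVING condition. Note that ?book has no counterpart in the output rows at all.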

Next,

SELECT ?name ?surname ?book ?numberOfBooks
WHERE {
  ?auth :hasSurname ?surname .
  ?auth :hasName ?name .
  {
    SELECT ?affiliation (SUM(?numberOfBooks) AS ?affiliationBooks)
    WHERE {
      ?auth :affiliated ?organization .
      {
        SELECT ?auth (COUNT(?book) AS ?numberOfBooks)
        WHERE {
          ?auth :wrote ?book .
        }
        GROUP BY ?auth
        HAVING (?numberOfBooks > 5)
      }
    }
    GROUP BY ?organization
    HAVING (?affiliationBooks > 50)
  }
}

This query mentions the variable ?affiliation, which is unbound; I assume you mean ?organization here.
As above, it is not clear to me what the solution set should look like.
What is the intended result, especially for a dataset where one author is affiliated 
with several organisations?
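
As an illustration of why the multi-affiliation case matters, the following Python sketch mimics the join of per-author counts with affiliations, followed by grouping on the organization. The data and the helper `books_per_organization` are hypothetical, and the "more than 50" threshold on the organization total is omitted to keep the toy dataset small.

```python
from collections import Counter, defaultdict

# Hypothetical toy data; a1 is affiliated with two organizations.
wrote = [("a1", f"b{i}") for i in range(6)] + [("a2", f"c{i}") for i in range(7)]
affiliated = [("a1", "org1"), ("a1", "org2"), ("a2", "org1")]


def books_per_organization(wrote, affiliated, min_books=5):
    # Inner subquery: per-author counts, keeping authors with more than min_books.
    counts = Counter(auth for auth, _ in wrote)
    kept = {a: n for a, n in counts.items() if n > min_books}
    # Join with affiliations, then GROUP BY organization and SUM the counts.
    totals = defaultdict(int)
    for auth, org in affiliated:
        if auth in kept:
            totals[org] += kept[auth]
    return dict(totals)


totals = books_per_organization(wrote, affiliated)
print(totals)
```

With plain join semantics, the six books of a1 are counted once for each organization, so the per-organization sums add up to 19 although only 13 books exist. Whether that double counting is wanted is exactly what needs clarifying.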

thanks,
Axel

On 18 Feb 2010, at 12:08, Emanuele Della Valle wrote:

> Dear Axel,
> 
> first of all, thanks a lot for the careful look you gave to our comments
> to SPARQL 1.1 support for aggregates.
> 
> Please find hereafter our clarifications in line.
> 
> Axel Polleres wrote:
>> further comments...
>> 
>> 
>>> Yes you're right. The correct formulation of the constraint is:
>>> 
>>> In the C-SPARQL language all the variables used in the aggregation
>>> function or in the grouping set of AGGREGATE clauses must appear also in
>>> the SELECT clause, since aggregation happens after standard SPARQL query
>>> evaluation.
>>> 
>> 
>> If I am not mistaken, that doesn't work either... here is a counterexample from your quoted example:
>> 
>> 
>>> AGGREGATE { (?affiliationBooks, SUM(?numberOfBooks), {?organization} )
>>> 
>> 
>> 
>> ?organization is not in the SELECT clause.
>> 
> 
> Of course. Yesterday, we fixed it in
> http://wiki.larkc.eu/c-sparql/sparql11-feedback, but I forgot to report
> it to you.
> 
>> [...]
>> 
>>> Well, we based the SPARQL 1.1 version on our understanding of SPARQL 1.1.
>>> 
>>> In our understanding, the triple pattern < ?auth :hasName ?name . > is
>>> needed to join the results of the two sub-queries. If we do not include
>>> it the result would be a Cartesian product of the two sub-queries (see
>>> APPENDIX for more details).
>>> 
>> 
>> your query should answer...
>> "For instance, one can ask for the research topics for which
>> the Italian authors are more than the Swiss ones."
>> 
>> but you project the *author* which is neither aggregated nor grouped...
>> that doesn't make sense to me... shouldn't it be simply
>> 
>> SELECT ?topic ?numberOfSwissAuthors ?numberOfItalianAuthors
>> WHERE {
>>     {
>>         SELECT ?topic (COUNT(?book) AS ?numberOfSwissAuthors)
>>         WHERE {
>>             ?auth :wrote ?book .
>>             ?book :topic ?topic .
>>             ?auth :hasNationality ?nat .
>>             FILTER(?nat = 'CH') .
>>         }
>>         GROUP BY ?topic
>>     }
>>     {
>>         SELECT ?topic (COUNT(?book) AS ?numberOfItalianAuthors)
>>         WHERE {
>>             ?auth :wrote ?book .
>>             ?book :topic ?topic .
>>             ?auth :hasNationality ?nat .
>>             FILTER(?nat = 'IT') .
>>         }
>>         GROUP BY ?topic
>>     }
>>     FILTER(?numberOfItalianAuthors > ?numberOfSwissAuthors)
>> }
>> 
>> (didn't try that out just now, just a quick shot, but looks more sensible to me...)
>> 
> 
> It looks very meaningful to me as well. I've updated
> http://wiki.larkc.eu/c-sparql/sparql11-feedback accordingly. This
> actually resolves the question of whether it is possible to express
> some of the more expressive C-SPARQL queries in SPARQL 1.1 ;-)
> 
> Best,
> 
> Emanuele
> 
Received on Wednesday, 24 March 2010 17:34:08 GMT
