- From: Jeen Broekstra <jeen.broekstra@gmail.com>
- Date: Fri, 22 Apr 2011 17:18:34 +1200
- To: public-sparql-dev@w3.org
- CC: Ruslan Velkov <ruslan@sirma.bg>
Hi all, (Cc Ruslan),
Ruslan and I are currently working on conformance testing for Sesame's
implementation of SPARQL 1.1 query, and there is a case where I am not
100% sure what the expected behavior is. This case involves a query that
uses a GROUP_CONCAT aggregate, a GROUP BY clause, and an ORDER BY solution
modifier. Before I start pestering the SPARQL working group I'd like to
hear some other developers' thoughts.
All SPARQL aggregate operators are defined to work on multisets. This
means that by default, the input of an aggregate does not have any
prescribed order. For most aggregates this is irrelevant anyway, but for
GROUP_CONCAT it does make a difference. Consider the following example data:
:org1 :affiliates :p1, :p2 .
:org2 :affiliates :p3, :p4 .
:p1 :name "John" .
:p2 :name "Paul" .
:p3 :name "Ringo" .
:p4 :name "George" .
I want to produce a query that gives me concatenated names per
organisation. Each concatenated string should have the names in
alphabetical order.
My initial thought was that this query would do the trick:
SELECT ?org (GROUP_CONCAT(?name) as ?names)
WHERE {?org :affiliates ?p. ?p :name ?name }
GROUP BY ?org
ORDER BY ASC(str(?name))
Expected result:
?org ?names
--------------
:org1 "John Paul"
:org2 "George Ringo"
However, looking at the SPARQL 1.1 query spec, I think this is not a
guaranteed result: as far as I can tell the solution modifier ORDER BY
should be applied to the solution sequence _after_ grouping and
aggregation, so it cannot influence the order of the input for the
GROUP_CONCAT. This would mean that for the above query, the result could
equally well be:
?org ?names
--------------
:org1 "Paul John"
:org2 "George Ringo"
or indeed any other permutation of name concatenations.
I have thought about using some subquery to solve the problem, but since
SPARQL defines the input of an aggregate operator explicitly as a _multiset_,
I am not even sure that would work: as far as I can tell a SPARQL engine
has no obligation to preserve input order when evaluating aggregate
operators.
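To make the subquery idea concrete, this is roughly what I have in mind.
It is only a sketch against the example data above (prefix declaration
omitted, as in the first query), and for the reason just given I do not
think the spec guarantees that the inner ordering reaches the aggregate:

```sparql
# Sketch: sort the solutions in a subquery, then group and concatenate
# in the outer query. An engine may happen to keep the subquery's order
# when it builds each group, but as far as I can tell nothing obliges it to.
SELECT ?org (GROUP_CONCAT(?name) AS ?names)
WHERE {
  { SELECT ?org ?name
    WHERE { ?org :affiliates ?p . ?p :name ?name }
    ORDER BY ?org ?name
  }
}
GROUP BY ?org
```

If an engine does preserve the order, this would give "John Paul" and
"George Ringo"; if it does not, any permutation is still a legal answer.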
Two questions:
1. Is the above correct?
2. Is there any other way in SPARQL 1.1 to enforce ordering on a
GROUP_CONCAT?
In relation to question 2, I note that MySQL's group_concat function (on
which, I assume, the SPARQL operator has been based) accepts an ordering
clause as an argument to the function itself. See
http://dev.mysql.com/doc/refman/5.0/en/group-by-functions.html#function_group-concat.
Chocolate egg for your thoughts,
Jeen
Received on Friday, 22 April 2011 05:19:07 UTC