- From: Axel Polleres <axel.polleres@deri.org>
- Date: Tue, 16 Feb 2010 14:51:31 +0000
- To: SPARQL Working Group <public-rdf-dawg@w3.org>
I have been browsing through Emanuele's proposal... Please forgive that I just quickly wrote this up without a lot of
in-depth thinking yet... it is just meant to kick off discussion...
Firstly, I have some open questions on the proposal that we might want to ask them...
1)
If I get it right, AGGREGATE { ( var, FUNCTION, vars ) }
i) first projects and groups wrt. the variables appearing in vars and then
ii) evaluates the aggregate on those groups ...
That may make sense for count, but how does that work for
min/max, i.e. where is the projection ?
hmmm... actually it seems the grammar given for FUNCTION is wrong...
it should be
Function | Function '(' var ')'
or
Function | Function '(' vars ')'
we may want to ask back for clarification here...
... ok, but let's assume that this means that they have just
SELECT ... var
where
AGGREGATE ( var function vars ) FILTER filter
=more-or-less=
SELECT function as var
where
GROUP BY vars
HAVING filter
with a slightly different implicit grouping than we have at the moment?
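To make the right-hand side concrete, here is a minimal sketch in SPARQL 1.1 syntax of how I would read their clause AGGREGATE { (?numberOfBooks, COUNT, {?auth} ) FILTER (?numberOfBooks > 5) } under this assumption. The example.org prefix and the choice to count ?book are my own placeholders, not something their proposal specifies:

```sparql
# Sketch only: the prefix is a placeholder namespace, and counting ?book
# is an assumed reading of their argument-less COUNT.
PREFIX : <http://example.org/>

SELECT ?auth (COUNT(?book) AS ?numberOfBooks)
WHERE {
  ?auth :wrote ?book .
}
GROUP BY ?auth                 # corresponds to the {?auth} "vars" part
HAVING (COUNT(?book) > 5)      # corresponds to their FILTER on the aggregate
```

Note that this returns one row per ?auth, whereas their examples also select per-row variables such as ?book next to the aggregate, which is where the grouping would have to differ from ours.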
Their claimed advantage seems to be that they allow doing different aggregations *at once*,
which seems to have some merit, since we can probably only do this with some cumbersome subqueries at the moment.
2) I don't entirely get their examples though... e.g.
SELECT ?name ?surname ?book ?numberOfBooks ?averageNumberOfBooks
WHERE {
?auth :name ?name .
?auth :surname ?surname .
?auth :wrote ?book .
?auth :affiliated ?organization .
}
AGGREGATE { (?numberOfBooks, COUNT, {?auth} ) FILTER (?numberOfBooks > 5) }
AGGREGATE { (?affiliationBooks, SUM(?numberOfBooks), {?organization} )
FILTER (?affiliationBooks > 50)}
here, ?affiliationBooks violates the constraint they pose before:
"In the C-SPARQL language all the variables used in AGGREGATE clauses
must appear also in the SELECT clause, since aggregation happens after
standard SPARQL query evaluation. In SPARQL 1.1 the constraint is not
specified."
I assume they probably meant the constraint to apply only to the variables
mentioned in the function and group parts? We may want to ask back for clarification on that as well.
Also, I'd like to see the result table for this.
3) It seems that their attempt at a SPARQL 1.1 formulation of this one
has some errors...
SELECT ?topic ?numberOfSwissAuthors ?numberOfItalianAuthors
WHERE {
?auth :name ?name .
?auth :wrote ?book .
?book :topic ?topic .
?auth :hasNationality ?nat .
}
AGGREGATE { FILTER(?nat = 'IT') (?numberOfItalianAuthors, COUNT, {?topic} ) }
AGGREGATE { FILTER(?nat = 'CH') (?numberOfSwissAuthors, COUNT, {?topic} )
FILTER(?numberOfItalianAuthors>?numberOfSwissAuthors)}
However, in total, I think their aggregation proposal could have some merits,
seemingly allowing aggregation with fewer subqueries; at
least this seems to be their main line of argument. I am not yet
convinced by their claim that they can express all of our
aggregation/grouping, without an actual proof.
Axel
Received on Tuesday, 16 February 2010 14:52:06 UTC