# Re: Review of Query document (Groups, aggregates and subquery)

```
>> > > 10 Aggregates
This section should mention somewhere that aggregates must be aliased in
order to project them. A brief mention with a link to 15.1.2 SELECT
expressions would be sufficient.

>> > >    10.1 Aggregate Example
The introductory text seems a bit thin for readers who may not already
be familiar with aggregates. Similarly, the example in 10.1 would benefit
from some explanatory text detailing how the final result is arrived at.

>> > >    10.2 Algebra Operators
"Aggregation, a function which calculates a scalar value as an output of
the aggregate expression in the SELECT clause." -- aggregate expressions
can also appear in the HAVING clause (not just SELECT), right?

The 'scalar' argument to Aggregation() is said to be a set, but in the
example Aggregation() is called with a value of 0.

The second argument to the call to the aggregate set function 'func' is
defined as card[range(g)] - card[M], but the use of this value isn't
discussed until later in 10.2.2, and then only vaguely. It is never used
in the definitions of the set functions.

>> > >        10.2.1 HAVING
>> > >        10.2.2 Set Functions
Just before the definition of Sum, the example says the result should be
"6.0 (decimal)", but it should be "6.0 (float)".

In the definition of GroupConcat, "unicode codepoint 32" is probably
better described as "unicode codepoint U+0020".

GroupConcat(S, scalar) is defined in terms of fn:string-join, but that
function is never defined or referenced again. The fn prefix is defined
in section 1.2.1, but since this function is never discussed in the
document, it should probably be hyperlinked to the XPath definition at
<http://www.w3.org/2005/xpath-functions/#string-join>.

>> > > Material to move to formal definitions:
>> > >        10.2.3 Mapping from Abstract Syntax to Algebra
The example says the SUM expression "becomes Aggregation((?a), (?val),
Sum, (), BGP(?a rdf:value ?val))." This form of Aggregation() takes one
more argument (the GROUP BY variable) than the definition in 10.2. I
would have expected the SUM expression to become

Aggregation((?val), Sum, (), Group((?a), BGP(?a rdf:value ?val)))

In the "Joining Aggregate Values" section, I have no idea what the
introductory sentence is meant to convey. I may have misunderstood the
definition of AggregateJoin, but it seems like it will produce a
multiset of single-mapping sets, {agg_i -> range(A_i)}. I would have
expected something like:

AggregateJoin(A) = { { (agg_i -> range(A_i)) | dom(A_i) = k }
                     | k in set-union(dom(A)) }

The algorithmic sketch of using AggregateJoin in 17.2.3 might be more
intuitive if there were more than one 'Let A_i' line (more than one
aggregate operation in the query).

>> > > 11 Subqueries
There needs to be introductory text for subqueries.

"It is an error to reuse variable names both inside and outside a
subquery when the variable is not projected from the subquery." -- I
checked with Andy, and he said of this: "It's wrong and (for composition
reasons) we decided otherwise."
```
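The AggregateJoin reading suggested in the comments on 10.2.3 can be sketched in Python. This is only an illustration under stated assumptions, not the document's definition: each aggregate result A_i is modelled as a plain dict from group key to aggregated value, `aggregate_join` is a hypothetical helper name, and `agg_1`, `agg_2`, ... stand in for the internal aggregate variables. It shows the expected shape of the result: one solution mapping per group key, carrying every aggregate's value for that group, rather than a multiset of single-mapping sets.

```python
from typing import Dict, Hashable, List

# Assumption: each aggregate result A_i maps a group key to the value
# that aggregate produced for that group.
AggregateResult = Dict[Hashable, object]


def aggregate_join(results: List[AggregateResult]) -> List[Dict[str, object]]:
    """Merge per-aggregate results into one solution mapping per group key."""
    # Collect the union of all group keys, in first-seen order.
    keys = []
    seen = set()
    for result in results:
        for key in result:
            if key not in seen:
                seen.add(key)
                keys.append(key)

    solutions = []
    for key in keys:
        # agg_1, agg_2, ... are hypothetical names for the aggregate variables.
        mapping = {
            f"agg_{i}": result[key]
            for i, result in enumerate(results, start=1)
            if key in result
        }
        solutions.append(mapping)
    return solutions


# Two aggregates (say, a SUM and a COUNT) over the same two groups.
sums = {"a1": 6.0, "a2": 1.5}
counts = {"a1": 3, "a2": 1}
joined = aggregate_join([sums, counts])
```

With two aggregates in the query, each group yields a single mapping binding both `agg_1` and `agg_2`, which is also why an algorithmic sketch with more than one 'Let A_i' line would be easier to follow.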

Received on Wednesday, 22 September 2010 14:32:32 UTC