Re: Review of Query document (Groups, aggregates and subquery)

>> > > 10 Aggregates
This section should mention somewhere that aggregates must be aliased in 
order to project them. A brief mention with a link to 15.1.2 SELECT 
expressions would be sufficient.

>> > >    10.1 Aggregate Example
The introductory text seems a bit thin for readers who may not already 
be familiar with aggregates. Similarly, the example in 10.1 would 
benefit from some explanatory text detailing how the final result is 
arrived at.

>> > >    10.2 Algebra Operators
"Aggregation, a function which calculates a scalar value as an output of 
the aggregate expression in the SELECT clause." -- aggregate expressions 
can also appear in the HAVING clause (not just SELECT), right?

The 'scalar' argument to Aggregation() is said to be a set, but in the 
example Aggregation() is called with a value of 0.

The second argument to the call to the aggregate set function 'func' is 
defined as card[range(g)] - card[M], but the use of this value isn't 
discussed until later in 10.2.2, and then only vaguely. It is never used 
in the definitions of the set functions.

>> > >        10.2.1 HAVING
>> > >        10.2.2 Set Functions
Just before the definition of Sum, the example says the result should be 
"6.0 (decimal)", but it should be "6.0 (float)".

In the definition of GroupConcat, "unicode codepoint 32" is probably 
better described as "unicode codepoint U+0020".
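
To illustrate why the two descriptions name the same character, here is a 
minimal Python sketch. It assumes codepoint 32 is used in the definition as 
the separator when none is supplied. The name group_concat and the sample 
values are mine, not the document's, and str.join is only a stand-in for the 
string-joining step in the definition.

```python
# Hypothetical stand-in for GroupConcat; not the document's definition.
DEFAULT_SEPARATOR = "\u0020"  # the same character as chr(32), a single space


def group_concat(values, separator=DEFAULT_SEPARATOR):
    """Join the string forms of the values, in order, with the separator."""
    return separator.join(str(v) for v in values)


# Made-up sample values, for illustration only.
joined = group_concat(["a", "b", "c"])               # "a b c"
piped = group_concat(["a", "b", "c"], separator="|")  # "a|b|c"
```

Writing the codepoint as U+0020 matches the usual Unicode notation, whereas 
"32" leaves the reader to guess the base.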

GroupConcat(S, scalar) is defined in terms of fn:string-join, but that 
function is never defined or referenced again. The fn prefix is defined 
in section 1.2.1, but since this function is never discussed in the 
document, it should probably be hyperlinked to the XPath definition at 

>> > > Material to move to formal definitions:
>> > >        10.2.3 Mapping from Abstract Syntax to Algebra
The example says the SUM expression "becomes Aggregation((?a), (?val), 
Sum, (), BGP(?a rdf:value ?val))." This form of Aggregation() uses one 
more argument than the definition in 10.2 (the GROUP BY variable). I 
would have expected the SUM expression to become

Aggregation((?val), Sum, (), Group((?a), BGP(?a rdf:value ?val)))
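
As a rough check of the form I expected, here is a Python sketch in which 
grouping happens first and Aggregation takes only the four arguments from 
10.2. The function names, the representation of solutions as dicts, and the 
sample data are all my own assumptions for illustration, not anything taken 
from the document.

```python
from collections import defaultdict


def group(keys, solutions):
    """Partition solution mappings by the values of the grouping variables."""
    groups = defaultdict(list)
    for mu in solutions:
        groups[tuple(mu.get(k) for k in keys)].append(mu)
    return dict(groups)


def aggregation(exprs, func, scalarvals, grouped):
    """Apply a set function per group; returns {group key: aggregate value}."""
    return {
        key: func([tuple(mu.get(e) for e in exprs) for mu in members], scalarvals)
        for key, members in grouped.items()
    }


def sum_(rows, scalarvals):
    # Each row is a one-element tuple because only ?val is aggregated.
    return sum(row[0] for row in rows)


# Made-up solutions standing in for BGP(?a rdf:value ?val).
bgp = [
    {"a": "x", "val": 1},
    {"a": "x", "val": 2},
    {"a": "y", "val": 3},
]

# Aggregation((?val), Sum, (), Group((?a), BGP(...)))
result = aggregation(("val",), sum_, (), group(("a",), bgp))
```

With this shape the GROUP BY variable never needs to be passed to 
Aggregation() itself, since the grouping is already carried by its last 
argument.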

In the "Joining Aggregate Values" section, I have no idea what the 
introductory sentence is meant to convey. I may have misunderstood the 
definition of AggregateJoin, but it seems like it will produce a 
multiset of single-mapping sets, {agg_i -> range(A_i)}. I would have 
expected something like:

AggregateJoin(A) = { { (agg_i -> range(A_i)) | dom(A_i) = k }
                     | k in set-union(dom(A)) }
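
To make the intent of that expression concrete, here is a Python sketch of 
my reading of it: one combined mapping per group key, holding every 
aggregate value computed for that key. Modelling each A_i as a dict from 
group key to aggregate value, the agg_1, agg_2 naming, and the sample 
values are assumptions of mine. The result is keyed by k only for 
readability; its values are the mappings the expression describes.

```python
def aggregate_join(aggs):
    """aggs: list of dicts A_1..A_n, each mapping group key -> aggregate value.

    Returns {k: {agg_i: A_i[k], ...}} with one combined mapping per key k
    drawn from the union of the domains of the A_i.
    """
    keys = set()
    for a in aggs:
        keys |= a.keys()
    return {
        k: {f"agg_{i}": a[k] for i, a in enumerate(aggs, start=1) if k in a}
        for k in keys
    }


# Made-up per-group results for two aggregates over the same grouping.
a1 = {("x",): 3, ("y",): 3}   # e.g. a sum per group
a2 = {("x",): 2, ("y",): 1}   # e.g. a count per group

joined = aggregate_join([a1, a2])
```

Each value in joined carries all of the aggregates for its group, rather 
than one single-entry mapping per aggregate.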

The algorithmic sketch of using AggregateJoin in 17.2.3 might be more 
intuitive if there were more than one 'Let A_i' line (more than one 
aggregate operation in the query).

>> > > 11 Subqueries
There needs to be introductory text for subqueries.

"It is an error to reuse variable names both inside and outside a 
subquery when the variable is not projected from the subquery." -- I 
checked with Andy, and he said of this: "It's wrong and (for composition 
reasons) we decided otherwise."

Received on Wednesday, 22 September 2010 14:32:32 UTC