- From: Andy Seaborne <andy.seaborne@talis.com>
- Date: Mon, 08 Mar 2010 15:57:32 +0000
- To: Steve Harris <steve.harris@garlik.com>
- CC: SPARQL Working Group <public-rdf-dawg@w3.org>
On 08/03/2010 2:20 PM, Steve Harris wrote:
> On 7 Mar 2010, at 22:57, Andy Seaborne wrote:
>
>> Overall - we seem to have the start of a possible design and so this
>> message is about details.
>>
>> On 07/03/2010 9:33 PM, Steve Harris wrote:
>>> On 7 Mar 2010, at 17:42, Andy Seaborne wrote:
>>>
>>>> ISSUE-53
>>>>
>>>> I propose the following to define ExprMultiSet:
>>>>
>>>> -------
>>>>
>>>> Let Ω be a partition.
>>>>
>>>> ExprMultiSet(Ω) =
>>>>   { eval(expr,μ) | μ in Ω such that eval(μ(expr)) is defined }
>>>>   UNION
>>>>   { e | μ in Ω such that eval(μ(expr)) is undefined }
>>>>
>>>> where "e" is some symbol that is distinct from all RDF terms.
>>>>
>>>> card[x]:
>>>>   if DISTINCT:
>>>>     card[x] = 1 if there exists μ in Ω such that x = eval(μ(expr))
>>>>     card[x] = 0 otherwise
>>>>   else
>>>>     card[x] = count of μ in Ω such that x = eval(μ(expr))
>>>
>>> I find the reuse of the term ExprMultiset as a function very confusing,
>>> but I think I understand the proposal.
>>
>> It's just trying to write the ExprMultiset based on Ω for which there
>> is no notation. I suppose it should involve μ. It only about whether
>> you like to write definitions with free terms or not.
>>
>> "ExprMultiset based on Ω, expr = ... "
>
> I believe that was handled in the definition of Aggregation()
> previously, but possibly there's some term missing.
>

ExprMultiset appears first in:

[[
Aggregation(GroupClause, ExprMultiset, func, Ω) =
  { merge(k, func( { μ'(exp) | exp in ExprMultiset, μ' in Ω' } )
    | (k, Ω') in Partition(GroupClause, Ω) }
]]

but also

[[
If this keyword is present then any duplicate values in exp · μ' are
removed, effectively making ExprMultiset a set.
]]

It seems to be used both as a set of expressions, and also as the
results after evaluation.

I'm giving a name+definition to the thing that is the outcome of
evaluating the expression over a multiset of solutions. This is then
used on a partition.

(which reminds me - there is only one expression in an aggregate in the
standard ones but it's not impossible to think of n-ary ones, e.g.
min-distance(point1, point2))

Some definitions:
  rough wording - needs refining
  only copes with one expression
  does not deal with aggregator parameters

--------

Defn: ExprValueMultiSet

An ExprValueMultiSet is the multiset formed by evaluating an
expression for each solution in a multiset of solutions.

ExprValueMultiSet of expr and Ω =
  { eval(expr,μ) | μ in Ω such that eval(μ(expr)) is defined }
  UNION
  { e | μ in Ω such that eval(μ(expr)) is undefined }

where "e" is some symbol that is distinct from all RDF terms.

card[x]:
  if DISTINCT:
    card[x] = 1 if there exists μ in Ω such that x = eval(μ(expr))
    card[x] = 0 otherwise
  else
    card[x] = count of μ in Ω such that x = eval(μ(expr))

--------

Defn: Aggregation

An aggregation is the multiset of solutions formed by aggregating the
solutions in a partition, for each partition in a group, together with
the key for the group:

Aggregation(GroupClause, ExprValueMultiSet, func, var, Ω) =
  { merge(k, (var, func(EVMS)))
    | EVMS is the ExprValueMultiSet of (expr, Ω'),
      (k, Ω') in Partition(GroupClause, Ω) }

card[x] = 1 if x in Aggregation else 0
# The key is different in each row so there will be no duplicates.

--------

I added "var" as the variable being bound to the aggregate value, but
we also have situations where no variable name is given. Maybe a
definition that just defines the value is better, leaving the binding
and merging to the place where the aggregate is used. That's also not
straightforward, as the values for each partition have to be aligned
with the keys.

	Andy
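
For illustration only, the two definitions above can be sketched in
Python. This is not from the draft; it is a rough reading under stated
assumptions: a solution μ is a dict, expr is a callable whose
evaluation is "undefined" when it raises, the group key k is a tuple of
(variable, value) pairs, and the names ERROR, partition and total are
hypothetical. How an aggregate function should treat "e" is not settled
in this thread, so total() simply skips it.

```python
from collections import Counter

# Stands in for the symbol "e": distinct from every RDF term (assumption).
ERROR = object()


def expr_value_multiset(expr, omega, distinct=False):
    """Multiset of expr values over the solutions in omega, as a Counter."""
    values = Counter()
    for mu in omega:
        try:
            x = expr(mu)          # eval(expr, mu) is defined
        except (KeyError, TypeError, ValueError):
            x = ERROR             # eval(expr, mu) is undefined
        values[x] += 1            # card[x] = count of matching solutions
    if distinct:
        # DISTINCT: card[x] = 1 for every value that occurs at all
        return Counter(dict.fromkeys(values, 1))
    return values


def partition(group_key, omega):
    """Map each group key k to its list of solutions (hypothetical helper)."""
    groups = {}
    for mu in omega:
        groups.setdefault(group_key(mu), []).append(mu)
    return groups


def aggregation(group_key, expr, func, var, omega, distinct=False):
    """One solution per partition: the key merged with (var, func(EVMS))."""
    result = []
    for k, omega_k in partition(group_key, omega).items():
        evms = expr_value_multiset(expr, omega_k, distinct)
        row = dict(k)             # the key bindings
        row[var] = func(evms)     # merge in the aggregate value
        result.append(row)
    return result


def total(evms):
    """Example aggregate: sum of values, skipping ERROR (an assumption)."""
    return sum(x * n for x, n in evms.items() if x is not ERROR)


omega = [
    {"g": "a", "v": 1},
    {"g": "a", "v": 1},
    {"g": "a"},                   # "v" unbound, so expr is undefined here
    {"g": "b", "v": 5},
]
by_g = lambda mu: (("g", mu["g"]),)
value_of_v = lambda mu: mu["v"]

print(aggregation(by_g, value_of_v, total, "total", omega))
# [{'g': 'a', 'total': 2}, {'g': 'b', 'total': 5}]
```

Because each key occurs once, the result has no duplicate rows, which
matches the card[x] remark in the Aggregation definition. Returning
only func(EVMS) per key, and leaving the binding to the caller, would
correspond to the alternative mentioned in the last paragraph.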
Received on Monday, 8 March 2010 15:58:05 UTC