- From: Steve Harris <steve.harris@garlik.com>
- Date: Mon, 8 Mar 2010 17:15:12 +0000
- To: Andy Seaborne <andy.seaborne@talis.com>
- Cc: SPARQL Working Group <public-rdf-dawg@w3.org>
On 8 Mar 2010, at 15:57, Andy Seaborne wrote:

> On 08/03/2010 2:20 PM, Steve Harris wrote:
>> On 7 Mar 2010, at 22:57, Andy Seaborne wrote:
>>
>>> Overall - we seem to have the start of a possible design and so this
>>> message is about details.
>>>
>>> On 07/03/2010 9:33 PM, Steve Harris wrote:
>>>> On 7 Mar 2010, at 17:42, Andy Seaborne wrote:
>>>>
>>>>> ISSUE-53
>>>>>
>>>>> I propose the following to define ExprMultiSet:
>>>>>
>>>>> -------
>>>>>
>>>>> Let Ω be a partition.
>>>>>
>>>>> ExprMultiSet(Ω) =
>>>>>   { eval(expr,μ) | μ in Ω such that eval(μ(expr)) is defined }
>>>>>   UNION
>>>>>   { e | μ in Ω such that eval(μ(expr)) is undefined }
>>>>>
>>>>> where "e" is some symbol that is distinct from all RDF terms.
>>>>>
>>>>> card[x]:
>>>>>   if DISTINCT:
>>>>>     card[x] = 1 if there exists μ in Ω such that x = eval(μ(expr))
>>>>>     card[x] = 0 otherwise
>>>>>   else
>>>>>     card[x] = count of μ in Ω such that x = eval(μ(expr))
>>>>
>>>> I find the reuse of the term ExprMultiset as a function very
>>>> confusing, but I think I understand the proposal.
>>>
>>> It's just trying to write the ExprMultiset based on Ω for which
>>> there is no notation. I suppose it should involve μ. It's only
>>> about whether you like to write definitions with free terms or not.
>>>
>>> "ExprMultiset based on Ω, expr = ... "
>>
>> I believe that was handled in the definition of Aggregation()
>> previously, but possibly there's some term missing.
>>
>
> ExprMultiset appears first in:
>
> [[
> Aggregation(GroupClause, ExprMultiset, func, Ω) =
>
> { merge(k, func( { μ'(exp) | exp in ExprMultiset, μ' in Ω' } ) |
> (k, Ω') in Partition(GroupClause, Ω) }
> ]]
>
> but also
>
> [[
> If this keyword is present then any
> duplicate values in exp · μ' are removed, effectively making
> ExprMultiset a set.
> ]]
>
> It seems to be used both as a set of expressions, and also as the
> results after evaluation.

I think rather the comment on the end is just wrong.

> I'm giving a name+definition to the thing that is the outcome of
> evaluating the expression over a multiset of solutions. This is
> then used on a partition.

Right, currently that doesn't have a name.

> (which reminds me - there is only one expression in an aggregate in
> the standard ones but it's not impossible to think of n-ary ones
> e.g.
> min-distance(point1, point2))

It's a Multiset currently, hence the name. e.g. MAX(?x, ?y) w.r.t.
?x=1,2 / ?y=3,4 gives MAX({1,3,2,4}). Lee had some good use cases at the
F2F for making it a multiset of expressions.

- Steve

> Some definitions:
> rough wording - needs refining
> only copes with one expression
> does not deal with aggregator parameters
>
> --------
> Defn: ExprValueMultiSet
>
> An ExprValueMultiSet is the multiset formed by evaluating an
> expression for each solution binding of a multiset of solutions.
>
> ExprValueMultiSet of expr and Ω =
>   { eval(expr,μ) | μ in Ω such that eval(μ(expr)) is defined }
>   UNION
>   { e | μ in Ω such that eval(μ(expr)) is undefined }
>
> where "e" is some symbol that is distinct from all RDF terms.
>
> card[x]:
>   if DISTINCT:
>     card[x] = 1 if there exists μ in Ω such that x = eval(μ(expr))
>     card[x] = 0 otherwise
>   else
>     card[x] = count of μ in Ω such that x = eval(μ(expr))
>
> --------
>
> Defn: Aggregation
>
> An aggregation is the multiset of solutions formed by aggregating the
> solutions in a partition, for each partition in a group, together
> with the key for the group:
>
> Aggregation(GroupClause, ExprValueMultiSet, func, var, Ω) =
>
>   { merge(k, (var, func( EVMS )) )
>     | EVMS is the ExprValueMultiSet of (expr, Ω' )
>       (k, Ω') in Partition(GroupClause, Ω) }
>
> card[x] = 1 if x in Aggregation else 0
> # The key is different in each row so there will be no duplicates.
>
> --------
>
> I added "var" as the variable being bound to the aggregate value but
> also we have situations where no variable name is given.
>
> Maybe a definition that just defines the value is better and leaves
> the binding and merging to the place the aggregate is used. It's
> also not straightforward to align the values for each partition
> with the keys.
>
> Andy

-- 
Steve Harris, Garlik Limited
2 Sheen Road, Richmond, TW9 1AE, UK
+44 20 8973 2465  http://www.garlik.com/
Registered in England and Wales 535 7233  VAT # 849 0517 11
Registered office: Thames House, Portsmouth Road, Esher, Surrey, KT10 9AD
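
For readers following the thread, the quoted ExprValueMultiSet and Aggregation definitions can be read roughly as the Python sketch below. It is an illustration under several assumptions that are not in the messages: a solution μ is a dict from variable name to value, an expression is a callable that raises KeyError or TypeError when its evaluation is undefined, ERROR stands in for the symbol "e", and the helpers partition() and agg_sum() are hypothetical. In particular, having agg_sum() skip ERROR entries is an assumption; the thread does not say how an aggregate function should treat "e".

```python
from collections import Counter

# Stand-in for the symbol "e": distinct from every ordinary value (assumption).
ERROR = object()


def expr_value_multiset(expr, omega, distinct=False):
    """Multiset (value -> cardinality) of expr evaluated over the solutions in omega."""
    values = Counter()
    for mu in omega:
        try:
            value = expr(mu)
        except (KeyError, TypeError):
            # Evaluation undefined for this solution: record the error symbol.
            value = ERROR
        values[value] += 1
    if distinct:
        # DISTINCT: card[x] = 1 for every x that occurs at least once.
        values = Counter(dict.fromkeys(values, 1))
    return values


def partition(group_vars, omega):
    """Hypothetical Partition(): group solutions by the values of group_vars."""
    groups = {}
    for mu in omega:
        key = tuple((v, mu.get(v)) for v in group_vars)
        groups.setdefault(key, []).append(mu)
    return groups


def aggregation(group_vars, expr, func, var, omega, distinct=False):
    """One solution per partition: the group key merged with (var, func(EVMS))."""
    rows = []
    for key, part in partition(group_vars, omega).items():
        evms = expr_value_multiset(expr, part, distinct)
        row = dict(key)        # the key k
        row[var] = func(evms)  # merge(k, (var, func(EVMS)))
        rows.append(row)
    return rows


def agg_sum(evms):
    """Hypothetical aggregate: sum of values, skipping ERROR (an assumption)."""
    return sum(v * n for v, n in evms.items() if v is not ERROR)


# Example data: the third solution leaves ?x unbound, so expr is undefined there.
omega = [{"g": "a", "x": 1}, {"g": "a", "x": 1}, {"g": "a"}, {"g": "b", "x": 5}]
rows = aggregation(["g"], lambda mu: mu["x"], agg_sum, "total", omega)
```

Each group key appears in exactly one resulting row, which matches the remark in the quoted definition that the aggregation contains no duplicates.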
Received on Monday, 8 March 2010 17:15:41 UTC