# Re: Proposed definition of ExprMultiSet

From: Andy Seaborne <andy.seaborne@talis.com>
Date: Mon, 08 Mar 2010 15:57:32 +0000
Message-ID: <4B951E6C.1030402@talis.com>
To: Steve Harris <steve.harris@garlik.com>
CC: SPARQL Working Group <public-rdf-dawg@w3.org>
```

On 08/03/2010 2:20 PM, Steve Harris wrote:
> On 7 Mar 2010, at 22:57, Andy Seaborne wrote:
>
>> Overall - we seem to have the start of a possible design and so this
>>
>> On 07/03/2010 9:33 PM, Steve Harris wrote:
>>> On 7 Mar 2010, at 17:42, Andy Seaborne wrote:
>>>
>>>> ISSUE-53
>>>>
>>>> I propose the following to define ExprMultiSet:
>>>>
>>>> -------
>>>>
>>>> Let Ω be a partition.
>>>>
>>>> ExprMultiSet(Ω) =
>>>> { eval(expr,μ) | μ in Ω such that eval(μ(expr)) is defined }
>>>> UNION
>>>> { e | μ in Ω such that eval(μ(expr)) is undefined }
>>>>
>>>> where "e" is some symbol that is distinct from all RDF terms.
>>>>
>>>> card[x]:
>>>> if DISTINCT:
>>>> card[x] = 1 if there exists μ in Ω such that x = eval(μ(expr))
>>>> card[x] = 0 otherwise
>>>> else
>>>> card[x] = count of μ in Ω such that x = eval(μ(expr))
>>>
>>> I find the reuse of the term ExprMultiset as a function very confusing,
>>> but I think I understand the proposal.
>>
>> It's just trying to write the ExprMultiset based on Ω for which there
>> is no notation. I suppose it should involve μ. It's only about whether
>> you like to write definitions with free terms or not.
>>
>> "ExprMultiset based on Ω, expr = ... "
>
> I believe that was handled in the definition of Aggregation()
> previously, but possibly there's some term missing.
>

ExprMultiset appears first in:

[[
Aggregation(GroupClause, ExprMultiset, func, Ω) =

{ merge(k, func( { μ'(exp) | exp in ExprMultiset, μ' in Ω' } ) | (k,
Ω') in Partition(GroupClause, Ω) }
]]

but also

[[
If this keyword is present then any
duplicate values in exp · μ' are removed, effectively making
ExprMultiset a set.
]]

It seems to be used both as a set of expressions, and also as the
results after evaluation.

I'm giving a name+definition to the thing that is the outcome of
evaluating the expression over a multiset of solutions.  This is then
used on a partition.

(which reminds me - there is only one expression in an aggregate in the
standard ones but it's not impossible to think of n-ary ones e.g.
min-distance(point1, point2))

Some definitions:
- rough wording - needs refining
- only copes with one expression
- does not deal with aggregator parameters

--------
Defn: ExprValueMultiSet

An ExprValueMultiSet is the multiset formed by evaluating an
expression for each solution in a multiset of solutions.

ExprValueMultiSet of expr and Ω =
{ eval(expr,μ) | μ in Ω such that eval(μ(expr)) is defined }
UNION
{ e | μ in Ω such that eval(μ(expr)) is undefined }

where "e" is some symbol that is distinct from all RDF terms.

card[x]:
if DISTINCT:
card[x] = 1 if there exists μ in Ω such that x = eval(μ(expr))
card[x] = 0 otherwise
else
card[x] = count of μ in Ω such that x = eval(μ(expr))

--------

Defn: Aggregation

An aggregation is the multiset of solutions formed by aggregating the
solutions in a partition, for each partition in a group, together with
the key for the group:

Aggregation(GroupClause, ExprValueMultiSet, func, var, Ω) =

{ merge(k, (var, func( EVMS ) ) )
| (k, Ω') in Partition(GroupClause, Ω) and
EVMS is the ExprValueMultiSet of (expr, Ω') }

card[x] = 1 if x in Aggregation else 0
# The key is different in each row so there will be no duplicates.

--------

I added "var" as the variable being bound to the aggregate value but
there are also situations where no variable name is given.

Maybe a definition that just defines the value is better, leaving the
binding and merging to the place where the aggregate is used.  That's also
not straightforward: the values for each partition have to be aligned
with the keys.

Andy
```
Received on Monday, 8 March 2010 15:58:05 UTC
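
The two definitions in the message (ExprValueMultiSet and Aggregation) can be read as a small executable sketch. The following Python is only an illustration, not working group text, and it rests on several assumptions that are not in the message: a solution μ is a plain `dict` from variable name to value, an expression is a Python callable, an undefined evaluation is modelled as a raised exception, the marker "e" is a sentinel object named `ERROR`, DISTINCT collapses the marker to cardinality 1 like any other value, and the hypothetical aggregate `agg_sum` simply skips the marker.

```python
from collections import Counter

# Stand-in for the symbol "e": an object distinct from every value an
# expression can produce (assumption: RDF terms are plain Python values).
ERROR = object()


def expr_value_multiset(expr, omega, distinct=False):
    """Return a Counter mapping each value x to card[x].

    expr  -- callable taking a solution (dict) and returning a value
    omega -- iterable of solutions
    """
    values = Counter()
    for mu in omega:
        try:
            x = expr(mu)
        except (KeyError, TypeError, ValueError, ZeroDivisionError):
            # Evaluation is undefined for this solution: record "e".
            x = ERROR
        values[x] += 1
    if distinct:
        # card[x] = 1 for every x that occurs at least once.
        values = Counter(dict.fromkeys(values, 1))
    return values


def partition(group_vars, omega):
    """Split omega into (key, solutions) groups; the key is a tuple of
    (variable, value) pairs for the grouping variables."""
    groups = {}
    for mu in omega:
        key = tuple((v, mu.get(v)) for v in group_vars)
        groups.setdefault(key, []).append(mu)
    return groups


def aggregation(group_vars, expr, func, var, omega, distinct=False):
    """One solution per group: merge(k, (var, func(EVMS)))."""
    results = []
    for key, omega_prime in partition(group_vars, omega).items():
        evms = expr_value_multiset(expr, omega_prime, distinct)
        row = dict(key)          # the key k, viewed as a solution
        row[var] = func(evms)    # bind var to the aggregate value
        results.append(row)
    return results


def agg_sum(evms):
    """Hypothetical aggregate: sum of values weighted by cardinality,
    ignoring the error marker (a choice made only for this sketch)."""
    return sum(x * n for x, n in evms.items() if x is not ERROR)


omega = [
    {"g": "a", "x": 1},
    {"g": "a", "x": 1},
    {"g": "a"},           # x is unbound, so the expression is undefined
    {"g": "b", "x": 5},
]
expr = lambda mu: mu["x"] * 2

print(aggregation(["g"], expr, agg_sum, "total", omega))
# [{'g': 'a', 'total': 4}, {'g': 'b', 'total': 10}]
print(aggregation(["g"], expr, agg_sum, "total", omega, distinct=True))
# [{'g': 'a', 'total': 2}, {'g': 'b', 'total': 10}]
```

Because each group key occurs exactly once in the output, every row has cardinality 1, which matches the remark in the message that there will be no duplicates. How an aggregate should treat the error marker is left open in the message; skipping it here is just one possible choice.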
