Re: [PRD/FLD] aggregates from Gary Hallmark on 2008-10-13 (public-rif-wg@w3.org from October 2008)

From: Gary Hallmark <gary.hallmark@oracle.com>
Date: Mon, 13 Oct 2008 12:03:34 -0700
To: kifer@cs.sunysb.edu
CC: public-rif-wg@w3.org
Message-ID: <48F39B86.6010408@oracle.com>
Michael Kifer wrote:
> On Sun, 12 Oct 2008 17:25:50 -0700
> Gary Hallmark <gary.hallmark@oracle.com> wrote:
>
>   
>> Hi Michael,
>>
>> Thanks for getting the ball rolling.  I think this should work for PRD, 
>> but let me check that I understand your proposal.  An example rule for 
>> computing the average salary of employees grouped by department would be 
>> something like:
>>
>> Forall ?deptno ?sal ?empId (
>>   AvgDeptSal(?deptno avg(?sal [ ?deptno ] | Emp(?empId ?deptno ?sal)))
>> )
>>     
>
> I am not sure how aggregates are supposed to be used in PRD, but:
>
>   - in a logical rule-based language they would be in the body of a rule.
>   
Ditto for PRD.  Aggregation is always in the condition/body/premise.  
BTW, what do you propose for the model theory? 
I think we want the same model theory for PRD and FLD conditions (but 
not rulesets -- PRD has no model theory for rulesets) where they agree 
on syntax.  I suspect this should work fine for aggregation and naf.
>   - the comprehension variable is not quantified
>     (the aggregate works as a kind of quantifier for it)
>   
That sounds right for PRD as well.
> So, in such a language I would write something like
>
> Forall ?deptno ?Avgsal (
>   Query(?deptno ?Avgsal) :-
> 	?Avgsal = avg(?sal [?deptno] | Exists ?empId (Emp(?empId ?deptno ?sal)))
> )
>
> I suppose this can also be written as a fact like yours:
>
> Forall ?deptno (
>    AvgDeptSal(?deptno avg(?sal [?deptno] | Exists ?empId Emp(?empId ?deptno ?sal)))
> )
>
> but I haven't thought about it.
>   
I first wrote it like yours, then I saw that I could apparently "substitute" 
and eliminate ?Avgsal, so I just wanted to check whether that was legal. 
(It would probably be a bit easier for a PRD translator if it were not 
legal, but it's not a big deal either way.)
>
>
>   
>> And if PRD doesn't support group by (I don't know of any PR engines that 
>> do), we can simulate using
>>
>> Deptno(?deptno) :- Emp(?empId ?deptno ?sal)
>> AvgDeptSal(?deptno ?avgSal) :- And( Deptno(?deptno) ?avgSal = avg(?sal | 
>> Emp(?empId ?deptno ?sal)))
>>     
>
> Something like that. But you do not need to simulate anything. You just do not
> include the groupby variables in PRD. The syntax that I proposed is for FLD and
> the dialects that will extend BLD in the future. This is not even in BLD (or
> core).
>   
I meant to say that I like the notion of groupby and I regret that my PR 
engine doesn't support it directly because it is used quite often (via 
the "simulation").
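To make the "simulation" concrete, here is a minimal Python sketch (not RIF,
and not any PR engine's API) of the two formulations over a made-up Emp
relation. The names EMP, avg_by_dept_grouped and avg_by_dept_simulated are
hypothetical. It also assumes the comprehension is read as a bag (one salary
per Emp fact), which this thread does not settle.

```python
from collections import defaultdict

# Hypothetical Emp(empId, deptno, sal) facts, invented for illustration.
EMP = [
    (1, 10, 1000.0),
    (2, 10, 3000.0),
    (3, 20, 2000.0),
]


def avg_by_dept_grouped(emp):
    """Grouped aggregate: avg of ?sal, grouped by ?deptno, over Emp facts."""
    groups = defaultdict(list)
    for _emp_id, deptno, sal in emp:
        groups[deptno].append(sal)
    return {deptno: sum(sals) / len(sals) for deptno, sals in groups.items()}


def avg_by_dept_simulated(emp):
    """Two-rule simulation for an engine without group by."""
    # Rule 1: Deptno(?deptno) :- Emp(?empId ?deptno ?sal)
    deptnos = {deptno for _emp_id, deptno, _sal in emp}
    # Rule 2: for each Deptno fact, run an ungrouped avg over the
    # Emp facts that share that ?deptno.
    result = {}
    for deptno in deptnos:
        sals = [sal for _emp_id, d, sal in emp if d == deptno]
        result[deptno] = sum(sals) / len(sals)
    return result
```

On this data both functions produce the same AvgDeptSal pairs; the simulated
version just pays for an extra pass over Emp per department.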
>
>   
>> Are the aggfuns the usual min, max, sum, avg, count?  (BTW, I don't 
>> think count needs a Var). 
>>     
>
> Yes. (For FLD we should allow whatever a future dialect might want to have.)
> Regarding count, you do not need a var but for uniformity you can use a ?.
>
>   
>> Also nice to include list as an aggfun, that just returns a list of var 
>> bindings.  (Of course, we need to add lists)
>>     
>
> Strictly speaking this is unnecessary since we can write
>
>       agg{?V ...| And(query ?V=some-list)}
>
>
> michael
>
>
>   
>> Michael Kifer wrote:
>>     
>>> I sent this message at the end of the last f2f, but it doesn't seem to have been
>>> delivered. Only today I got a reply that it was rejected by the server.
>>> Anyway, here it goes again:
>>>
>>>     Since we did not have time to discuss the aggregates, let's start it by
>>>     email.  Basically, an aggregate is a term that includes a comprehension.
>>>     In addition, there is a need to be able to GROUP BY, as in databases.  (I
>>>     do not know if this latter thing is needed for PRD.)
>>>
>>>     So, the syntax I was thinking about is:
>>>
>>> 	aggfun{Var [ GroupvarList ] | CondFormula}
>>>
>>>     The symbols {,},[,],| here are the actual symbols, not metasymbols.
>>>     Var is the comprehension variable, i.e., {Var | CondFormula}.
>>>     GroupvarList is the list of vars to group by. The entire piece
>>>     "[ GroupvarList ]" is optional.
>>>
>>>     Note that I need the above general form for FLD. For PRD we might need
>>>     something less general. We just need to make sure that the
>>>     syntaxes are compatible.
>>>
>>>
>>> 	    --michael  
>>>
>>>   
>>>       
>>
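One operational reading of the general form
aggfun{Var [ GroupvarList ] | CondFormula} can be sketched in Python as
follows. This is an illustration under assumptions, not the FLD semantics:
the function aggregate, the AGGFUNS table, and the representation of the
solutions of CondFormula as a list of variable-binding dicts are all
hypothetical, and duplicates are kept (bag reading).

```python
from collections import defaultdict

# Hypothetical table of aggregate functions: the usual five plus "list".
AGGFUNS = {
    "min": min,
    "max": max,
    "sum": sum,
    "avg": lambda values: sum(values) / len(values),
    "count": len,   # only the number of bindings matters, not their values
    "list": list,   # just returns the collected bindings of Var
}


def aggregate(aggfun, var, groupvars, solutions):
    """Evaluate aggfun{var [groupvars] | CondFormula}.

    `solutions` stands in for the bindings that satisfy CondFormula.
    Returns a dict mapping each tuple of group-by values to the aggregate.
    With an empty groupvars list there is a single group keyed by ().
    """
    groups = defaultdict(list)
    for binding in solutions:
        key = tuple(binding[g] for g in groupvars)
        groups[key].append(binding[var])
    fn = AGGFUNS[aggfun]
    return {key: fn(values) for key, values in groups.items()}


# Made-up solutions of Emp(?empId ?deptno ?sal).
solutions = [
    {"?empId": 1, "?deptno": 10, "?sal": 1000},
    {"?empId": 2, "?deptno": 10, "?sal": 3000},
    {"?empId": 3, "?deptno": 20, "?sal": 2000},
]

avg_by_dept = aggregate("avg", "?sal", ["?deptno"], solutions)
head_count = aggregate("count", "?sal", [], solutions)
```

Dropping the group-by list, as suggested for PRD, corresponds to passing an
empty groupvars list here.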
Received on Monday, 13 October 2008 19:05:34 UTC