- From: Michael Kifer <kifer@cs.sunysb.edu>
- Date: Tue, 14 Oct 2008 14:27:33 -0400
- To: "Changhai Ke" <cke@ilog.fr>
- Cc: "Gary Hallmark" <gary.hallmark@oracle.com>, <public-rif-wg@w3.org>
Right now this is a coordination effort. Aggregates are going to be added to FLD, so I want to make sure that this is compatible with PRD's future addition of these things. michael On Tue, 14 Oct 2008 11:40:56 +0200 "Changhai Ke" <cke@ilog.fr> wrote: > Hello, > > I don't know exactly the context under which the discussion about > aggregates happens. It seems to me that it's too early to introduce > aggregates to PRD, we still have simple things to define before we add > aggregates. In the future, PRD should also add the aggregates, so they > should be reusable. What do you think? > > In general, "groupBy" is an almost required feature for aggregates. It's > a kind of must-have for the Event Stream Processing (ESP, domain of > CEP). Maybe it's also interesting to add constructs like "top N" (order > the collection by a criterion and take the first N elements). > > What do you think about the monoids > (http://en.wikipedia.org/wiki/Monoid) to support the calculus of > aggregates? > > Changhai > > -----Original Message----- > From: public-rif-wg-request@w3.org [mailto:public-rif-wg-request@w3.org] > On Behalf Of Michael Kifer > Sent: lundi 13 octobre 2008 23:34 > To: Gary Hallmark > Cc: public-rif-wg@w3.org > Subject: Re: [PRD/FLD] aggregates > > > > > On Mon, 13 Oct 2008 12:03:34 -0700 > Gary Hallmark <gary.hallmark@oracle.com> wrote: > > > Michael Kifer wrote: > > > On Sun, 12 Oct 2008 17:25:50 -0700 > > > Gary Hallmark <gary.hallmark@oracle.com> wrote: > > > > > > > > >> Hi Michael, > > >> > > >> Thanks for getting the ball rolling. I think this should work for > PRD, > > >> but let me check that I understand your proposal. An example rule > for > > >> computing the average salary of employees grouped by department > would be > > >> something like: > > >> > > >> Forall ?deptno ?sal ?empId ( > > >> AvgDeptSal(?deptno avg(?sal [ ?deptno ] | Emp(?empId ?deptno > ?sal))) > > >> ) > > >> > > > > > > I am not sure how aggregates are supposed to be used in PRD, but: > > > > > > - in a logical rule-based language they would be in the body of a > rule. > > > > > ditto for PRD. Aggregation is always in the condition/body/premise. > > BTW, what do you propose for the model theory? > > Good. > > For the model theory, it can't be exactly the same for PRD and FLD > because in > PRD the aggregates are evaluated on a partial model. If there is > recursion > though aggregation then there would be a significant difference. But it > is > possible, I think, to make the definitions look similar. > > I need to think about some of the details, but the main issue is that > aggregates require a bag semantics, while everything else is defined > using > sets. It is not exactly obvious how to define bag-comprehension (as > opposed to > the usual set-comprehension). If we want to define the bag of all X s.t. > {X| phi(X)} then it is known how to do this if phi is an existentially > quantified conjunction/disjunction. But it is less clear in general what > to do > if phi has foralls. (This is a problem for FLD, not PRD, as I > understand.) > So, some restrictions on phi will probably be required. > > > I think we want the same model theory for PRD and FLD conditions (but > > not rulesets -- PRD has no model theory for rulesets) where they agree > > > on syntax. I suspect this should work fine for aggregation and naf. > > > - the comprehension variable is not quantified > > > (the aggregate works as a kind of quantifier for it) > > > > that sounds right for PRD as well. > > > So, in such a language I would write something like > > > > > > Forall ?depno ?Avgsal ( > > > Query(?depno ?Avgsal) :- > > > ?Avgsal = avg(?sal [?deptno] | Exists ?empId (Emp(?empId ?deptno > ?sal))) > > > ) > > > > > > I suppose this can also be written as a fact like yours: > > > > > > Forall ?deptno ( > > > AvgDeptSal(?deptno avg(?sal [?deptno] | Exists ?empId Emp(?empId > ?deptno ?sal))) > > > ) > > > > > > but I haven't thought about it. > > > > > I first wrote it like yours, then I saw it seemed I could "substitute" > > > and eliminate the ?Avgsal, so I just wanted to check if that was > legal. > > (It would probably be a bit easier for a PRD translator if it was not > > legal, but its not a big deal either way) > > I think it might be easier for PRD to just keep aggregates in the body. > FLD is more general so I need to think about it. > > > >> And if PRD doesn't support group by (I don't know of any PR engines > that > > >> do), we can simulate using > > >> > > >> Deptno(?deptno) :- Emp(?empId ?deptno ?sal) > > >> AvgDeptSal(?deptno ?avgSal) :- And( Deptno(?deptno) ?avgSal = > avg(?sal | > > >> Emp(?empId ?deptno ?sal))) > > >> > > > > > > Something like that. But you do not need to simulate anything. You > just do not > > > include the groupby variables in PRD. The syntax that I proposed is > for FLD and > > > the dialects that will extend BLD in the future. This is not even in > BLD (or > > > core). > > > > I meant to say that I like the notion of groupby and I regret that my > PR > > engine doesn't support it directly because it is used quite often (via > > > the "simulation"). > > > I see. This is not an issue for RIF then. Why don't you propose that IBM > adds this to its production rule engine? (Which soon might be JRules :-) > > > michael > > >
Received on Tuesday, 14 October 2008 18:28:54 UTC