RIF: Outline of a Betfair use case from Vincent, Paul D on 2006-06-08 (public-rif-wg@w3.org from June 2006)

From: Vincent, Paul D <PaulVincent@fairisaac.com>
Date: Thu, 8 Jun 2006 09:48:14 -0500
To: "Alex Kozlenkov" <alex.kozlenkov@betfair.com>, <public-rif-wg@w3.org>
Message-ID: <3FCAC95D3E99F54B98300F372AA1B2616681A7@STPMSGMB01.corp.fairisaac.com>
Alex: may I try and redefine your use case rules in more of a pseudo
rule language?

> rcvMsg(XID,Protocol,FromIP,inform,heartbeat(Role,RemoteTime)) :-
> 	time(LocalTime),
> 	update(key(FromIP,Role),heartbeats(FromIP,Role,
>                                      RemoteTime,LocalTime)).

R1: On event rcvMsg do checkTime and update heartbeats.

This is an ECA rule but there are no conditions, just event and actions.
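
To make R1 concrete, here is a minimal Python sketch of the same
event/action pair. It is only an illustration: the `heartbeats` dict and
the `on_heartbeat` function are hypothetical stand-ins for the rule
engine's update(key(...), ...) store and the rcvMsg handler, and are not
part of the original use case.

```python
import time

# Hypothetical in-memory store standing in for update(key(FromIP, Role), ...).
heartbeats = {}

def on_heartbeat(from_ip, role, remote_time, now=None):
    """R1: on a heartbeat message, record it under the key (from_ip, role)."""
    # time(LocalTime): read the local clock unless a time is injected for testing.
    local_time = time.time() if now is None else now
    # update(...): overwrite any earlier heartbeat stored for the same key.
    heartbeats[(from_ip, role)] = (remote_time, local_time)
    return heartbeats[(from_ip, role)]
```

There is no condition part: every matching message simply overwrites the
stored heartbeat for that sender and role.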

> eca(	time( every('1S') ),
> 	event( detect(controller_failure(IP,Role,'1S')) ),
> 	action( respond(controller_failure(IP,Role,'1S')) ) ).
> detect(controller_failure(IP,Role,Timeout)) :-
> 	time(LocalTimeNow),
> 	heartbeats(IP,Role,RemoteTime,LocalTime),
> 	LocalTimeNow-LocalTime > Timeout.
> respond(controller_failure(IP,Role,Timeout)) :-
> 	time(LocalTime),
> 	first(holdsAt(status(Server,unloaded),LocalTime)),
> 	update(key(Server),happens(loading(Server),LocalTime)),
> 	sendMsg(XID,loopback,self,initiate,failover(Role,IP,Server)).

R2: On event "every second" if detect controller_failure then respond to
controller failure.

R3: If current time - last heartbeat time > timeout then detect
controller_failure
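
R2 and R3 can be sketched together in Python as a periodic check. This
is an assumption-laden illustration rather than the engine's actual
behaviour: `detect_failures` and `every_second` are hypothetical names,
the heartbeat store is a plain dict keyed by (ip, role), and the
one-second timeout is passed as a number.

```python
def detect_failures(heartbeats, now, timeout=1.0):
    """R3: return the (ip, role) keys whose last heartbeat is older than timeout."""
    return [
        key
        for key, (_remote_time, local_time) in heartbeats.items()
        if now - local_time > timeout
    ]

def every_second(heartbeats, now, respond, timeout=1.0):
    """R2: on each timer tick, call respond once per detected failure."""
    for ip, role in detect_failures(heartbeats, now, timeout):
        respond(ip, role)
```

Iterating over every detected failure plays the role that backtracking
plays in the Prolog version: each failed controller gets its own
response.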

R4: If respond to controller_failure then ?? update server status ?? [I
couldn't follow this - my Prolog is possibly too rusty...]
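
One possible reading of the respond rule, as a minimal Python sketch:
find the first server whose status is "unloaded" at the current time,
record a loading event for it, and emit a failover message. The event
log format, the `holds_at` and `respond` helpers, and the assumption
that loading terminates the unloaded status are all illustrative
choices, not taken from the original rules.

```python
# Simplified event calculus: which status each event starts and ends.
INITIATES = {"loading": "loaded", "unloading": "unloaded"}
TERMINATES = {"loading": "unloaded", "unloading": "loaded"}

def holds_at(happens, server, status, t):
    """True if status was initiated for server before t and not terminated since."""
    holds = False
    for event, srv, when in sorted(happens, key=lambda h: h[2]):
        if srv != server or when >= t:
            continue  # ignore other servers and events not strictly before t
        if INITIATES.get(event) == status:
            holds = True
        elif TERMINATES.get(event) == status:
            holds = False
    return holds

def respond(happens, servers, role, ip, now):
    """R4: pick the first unloaded server, log its loading, return a failover message."""
    for server in servers:
        if holds_at(happens, server, "unloaded", now):
            happens.append(("loading", server, now))
            return ("failover", role, ip, server)
    return None  # no unloaded server is available
```

Read this way, R4 is again an action with an embedded query: the status
lookup is derived from the event log rather than stored directly.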

Observations: 
A. From my square peg perspective, this looks like a square hole (i.e. I
confess I see everything as production rules where possible). This may
indeed be a use case (the first?) for "rule system type" interchange, as
clearly my pseudo-rules are more ECA / PR. Or have I missed some
behaviour that is NOT suited for PR?

B. If my interpretation is correct (and this is a use of a Prolog-type
system to model system behaviour, and my mapping is correct) then there
are clearly some challenges ahead for "rule system type" conversions,
particularly around the runtime invocation interfaces to the rules. For
example, does the rule language include an event generator? Etc etc.
[Rule system type translation, in some cases, may turn out to be a
subset of "software language" translation, which reminds me of the OMG
Knowledge Discovery Metamodel - see http://adm.omg.org/ ] 

Paul Vincent
for Fair Isaac Blaze Advisor  -- Business Rule Management System
@ OMG and W3C standards for rules

> -----Original Message-----
> From: public-rif-wg-request@w3.org [mailto:public-rif-wg-request@w3.org]
> On Behalf Of Alex Kozlenkov
> Sent: Thursday, June 08, 2006 1:14 PM
> To: public-rif-wg@w3.org
> Subject: Outline of a Betfair use case
> 
> 
> Importantly, there is a combination of push logic and pull logic
> involved. I'll make the UC clearer and post it on the wiki in a formal
> format.
> 
> A Manager node is responsible for holding housekeeping information
> about various servers playing different roles. When a server fails to
> send a heartbeat for a specified amount of time, the Manager assumes
> that the server failed and cooperates with the Agent component running
> on an unloaded node to resurrect it. A typical rule for receiving and
> updating the latest heartbeat in event notification style would look
> like this:
> rcvMsg(XID,Protocol,FromIP,inform,heartbeat(Role,RemoteTime)) :-
> 	time(LocalTime),
> 	update(key(FromIP,Role),heartbeats(FromIP,Role,
>                                      RemoteTime,LocalTime)).
> The rule responds to a message pattern matching the one specified in
> the rcvMsg arguments. XID is the correlation-id of the incoming
> message; inform is called a performative representing the semantic
> type of the message, in this case, one-way information passed between
> parties; heartbeat(...) is the payload of the message. The body of the
> rule enquires about the current local time and updates the record
> containing the latest heartbeat from the controller. This rule follows
> a push pattern where the event is pushed towards the rule system and
> the latter reacts. A pull-based ECA rule, activated every second by
> the rule engine, detects each server that has failed to send a
> heartbeat within the last second and responds by initiating failover
> to the first available unloaded server. The accompanying derivation
> rules detect and respond are used for the specific purpose of
> detecting the failure and organising the response.
> eca(	time( every('1S') ),
> 	event( detect(controller_failure(IP,Role,'1S')) ),
> 	action( respond(controller_failure(IP,Role,'1S')) ) ).
> detect(controller_failure(IP,Role,Timeout)) :-
> 	time(LocalTimeNow),
> 	heartbeats(IP,Role,RemoteTime,LocalTime),
> 	LocalTimeNow-LocalTime > Timeout.
> respond(controller_failure(IP,Role,Timeout)) :-
> 	time(LocalTime),
> 	first(holdsAt(status(Server,unloaded),LocalTime)),
> 	update(key(Server),happens(loading(Server),LocalTime)),
> 	sendMsg(XID,loopback,self,initiate,failover(Role,IP,Server)).
> The ECA logic involves possible backtracking so that all failed
> components will be resurrected. The state of each server is managed
> via an event calculus formulation:
> initiates(loading(Server),status(Server,loaded),T).
> terminates(unloading(Server),status(Server,loaded),T).
> initiates(unloading(Server),status(Server,unloaded),T).
> terminates(loading(Server),status(Server,unloaded),T).
> The actual state of each server is derived from the loading and
> unloading events that have happened and is used in the ECA rule to
> detect the first server which is in state "unloaded". This EC-based
> formalization can be easily extended, e.g. with new states such as a
> maintenance state which terminates an unloaded state, but is not
> allowed in case a server is already loaded:
> initiates(maintaining(Server),status(Server,maintenance),T):-
> not(holdsAt(status(Server,loaded),T)).
> terminates(maintaining(Server),status(Server,unloaded),T).
> Due to space restrictions we cannot show further extensions. However,
> as can already be seen from the initial examples, further higher-level
> decision logic, such as SLA contract rules defining quality-of-service
> policies (e.g. average availability levels and penalty payments in
> case these service levels cannot be met), might easily be built upon
> this basic set of failover handling rules using further ECA, EC and
> event notification rules.

Received on Thursday, 8 June 2006 14:49:26 UTC