- From: Vincent, Paul D <PaulVincent@fairisaac.com>
- Date: Thu, 8 Jun 2006 09:48:14 -0500
- To: "Alex Kozlenkov" <alex.kozlenkov@betfair.com>, <public-rif-wg@w3.org>
Alex: may I try and redefine your use case rules in more of a pseudo rule language?

> rcvMsg(XID,Protocol,FromIP,inform,heartbeat(Role,RemoteTime)) :-
>     time(LocalTime),
>     update(key(FromIP,Role),heartbeats(FromIP,Role,
>         RemoteTime,LocalTime)).

R1: On event rcvMsg do checkTime and update heartbeats.
This is an ECA rule but there are no conditions, just event and actions.

> eca( time( every('1S') ),
>     event( detect(controller_failure(IP,Role,'1S')) ),
>     action( respond(controller_failure(IP,Role,'1S')) ) ).
> detect(controller_failure(IP,Role,Timeout)) :-
>     time(LocalTimeNow),
>     heartbeats(IP,Role,RemoteTime,LocalTime),
>     LocalTimeNow-LocalTime > Timeout.
> respond(controller_failure(IP,Role,Timeout)) :-
>     time(LocalTime),
>     first(holdsAt(status(Server,unloaded),LocalTime)),
>     update(key(Server),happens(loading(Server),LocalTime)),
>     sendMsg(XID,loopback,self,initiate,failover(Role,IP,Server)).

R2: On event "every second" if detect controller_failure then respond to controller failure.
R3: If current time - current heartbeat time > timeout then detect controller_failure
R4: If respond to controller_failure then ?? update server status ?? [I couldn't follow this - my Prolog is possibly too rusty...]

Observations:

A. From my square peg perspective, this looks like a square hole (i.e. I confess I see everything as production rules where possible). This may indeed be a use case (the first?) for "rule system type" interchange, as clearly my pseudo-rules are more ECA / PR. Or have I missed some behaviour that is NOT suited for PR?

B. If my interpretation is correct (and this is a use of a Prolog-type system to model system behaviour, and my mapping is correct) then there are clearly some challenges ahead for "rule system type" conversions, particularly around the runtime invocation interfaces to the rules. For example, does the rule language include an event generator? Etc etc.
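One way to read R1 to R3 as event-condition-action rules is sketched below in Python rather than in the rule syntax of the original. The class and method names (HeartbeatMonitor, on_heartbeat, detect_failures, on_tick) are hypothetical, and the sketch assumes that the caller supplies the periodic "every second" event, which is one possible answer to the event-generator question in observation B. R4 is left as a callback because its behaviour is the part in doubt.

```python
from dataclasses import dataclass, field


@dataclass
class HeartbeatMonitor:
    """Hypothetical stand-in for the Manager's rule base (R1 to R3)."""
    timeout: float = 1.0  # seconds, mirroring the '1S' in the original rules
    # (ip, role) -> (remote_time, local_time) of the latest heartbeat
    heartbeats: dict = field(default_factory=dict)

    def on_heartbeat(self, ip, role, remote_time, local_time):
        # R1: on event rcvMsg, record the latest heartbeat (no condition).
        self.heartbeats[(ip, role)] = (remote_time, local_time)

    def detect_failures(self, now):
        # R3: failure if now minus the last local heartbeat time exceeds timeout.
        return [key for key, (_, seen) in self.heartbeats.items()
                if now - seen > self.timeout]

    def on_tick(self, now, respond):
        # R2: on the periodic event, run the action for every detected failure.
        for ip, role in self.detect_failures(now):
            respond(ip, role)


monitor = HeartbeatMonitor(timeout=1.0)
monitor.on_heartbeat("10.0.0.1", "controller", remote_time=99.9, local_time=100.0)
monitor.on_heartbeat("10.0.0.2", "controller", remote_time=101.4, local_time=101.5)

# The caller plays the role of the event generator and fires one tick.
failed = []
monitor.on_tick(now=101.6, respond=lambda ip, role: failed.append((ip, role)))
```

Here the loop over detected failures takes the place of the backtracking that the Prolog-style engine would perform over all matching heartbeats records.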
[Rule system type translation, in some cases, may turn out to be a subset of "software language" translation, which reminds me of the OMG Knowledge Discovery Metamodel - see http://adm.omg.org/ ]

Paul Vincent
for Fair Isaac Blaze Advisor -- Business Rule Management System
@ OMG and W3C standards for rules

> -----Original Message-----
> From: public-rif-wg-request@w3.org [mailto:public-rif-wg-request@w3.org]
> On Behalf Of Alex Kozlenkov
> Sent: Thursday, June 08, 2006 1:14 PM
> To: public-rif-wg@w3.org
> Subject: Outline of a Betfair use case
>
> Importantly, there is a combination of a push logic and a pull logic
> involved. I'll make the UC clearer and post on WIKI in a formal format.
>
> A Manager node is responsible for holding housekeeping information about
> various servers playing different roles. When a server fails to send a
> heartbeat for a specified amount of time, the Manager assumes that the
> server failed and cooperates with the Agent component running on an
> unloaded node to resurrect it. A typical rule for receiving and updating
> the latest heartbeat in event notification style would look like this:
>
> rcvMsg(XID,Protocol,FromIP,inform,heartbeat(Role,RemoteTime)) :-
>     time(LocalTime),
>     update(key(FromIP,Role),heartbeats(FromIP,Role,
>         RemoteTime,LocalTime)).
>
> The rule responds to a message pattern matching the one specified in
> the rcvMsg arguments. XID is the correlation-id of the incoming message;
> inform is a performative representing the semantic type of the
> message, in this case a one-way piece of information passed between
> parties; heartbeat(...) is the payload of the message. The body of the
> rule enquires about the current local time and updates the record
> containing the latest heartbeat from the controller. This rule follows a
> push pattern where the event is pushed towards the rule system, which
> then reacts.
>
> A pull-based ECA rule, activated every second by the rule engine,
> detects each server that has failed to send a heartbeat within the last
> second and responds by initiating failover to the first available
> unloaded server. The accompanying derivation rules detect and respond
> serve the specific purposes of detecting the failure and organising the
> response.
>
> eca( time( every('1S') ),
>     event( detect(controller_failure(IP,Role,'1S')) ),
>     action( respond(controller_failure(IP,Role,'1S')) ) ).
> detect(controller_failure(IP,Role,Timeout)) :-
>     time(LocalTimeNow),
>     heartbeats(IP,Role,RemoteTime,LocalTime),
>     LocalTimeNow-LocalTime > Timeout.
> respond(controller_failure(IP,Role,Timeout)) :-
>     time(LocalTime),
>     first(holdsAt(status(Server,unloaded),LocalTime)),
>     update(key(Server),happens(loading(Server),LocalTime)),
>     sendMsg(XID,loopback,self,initiate,failover(Role,IP,Server)).
>
> The ECA logic involves possible backtracking so that all failed
> components will be resurrected. The state of each server is managed via
> an event calculus formulation:
>
> initiates(loading(Server),status(Server,loaded),T).
> terminates(unloading(Server),status(Server,loaded),T).
> initiates(unloading(Server),status(Server,unloaded),T).
> terminates(loading(Server),status(Server,unloaded),T).
>
> The actual state of each server is derived from the loading and
> unloading events that have happened, and is used in the ECA rule to
> find the first server in state "unloaded". This EC-based formalization
> can be easily extended, e.g. with new states such as a maintenance
> state which terminates an unloaded state, but is not allowed in case a
> server is already loaded:
>
> initiates(maintaining(Server),status(Server,maintenance),T) :-
>     not(holdsAt(status(Server,loaded),T)).
> terminates(maintaining(Server),status(Server,unloaded),T).
>
> Due to space restrictions we cannot show further extensions.
> However, as the initial examples already show, further higher-level
> decision logic, such as SLA contract rules defining quality-of-service
> policies (e.g. average availability levels and penalty payments in case
> these service levels cannot be met), could easily be built upon this
> basic set of failover handling rules using further ECA, EC and event
> notification rules.
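The event calculus formulation in the quoted message can be illustrated with a minimal Python sketch. It is not the rule engine's own holdsAt: the names holds_at and first_unloaded are hypothetical, events are assumed to be (time, event_type, server) tuples, an effect is assumed to hold strictly after the moment its event happens, and loading is assumed to terminate the unloaded status. The maintenance extension is omitted.

```python
# Effects of each event type on status(Server, ...), mirroring the
# initiates/terminates facts for loading and unloading.
INITIATES = {"loading": "loaded", "unloading": "unloaded"}
TERMINATES = {"loading": "unloaded", "unloading": "loaded"}


def holds_at(events, server, status, t):
    """Return True if status(server, status) holds at time t.

    Assumption: an event's effect holds strictly after the event's time.
    """
    holds = False
    for when, kind, target in sorted(events):
        if target != server or when >= t:
            continue
        if INITIATES.get(kind) == status:
            holds = True
        elif TERMINATES.get(kind) == status:
            holds = False
    return holds


def first_unloaded(events, servers, t):
    """Return the first server whose status is 'unloaded' at time t, or None."""
    for server in servers:
        if holds_at(events, server, "unloaded", t):
            return server
    return None


# Both servers start unloaded; s1 is loaded at time 5.
events = [(0, "unloading", "s1"), (0, "unloading", "s2"), (5, "loading", "s1")]
candidate = first_unloaded(events, ["s1", "s2"], t=10)
```

In this reading, the update(key(Server),happens(loading(Server),LocalTime)) step of the respond rule corresponds to appending a loading event, after which the chosen server no longer counts as unloaded.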
Received on Thursday, 8 June 2006 14:49:26 UTC