- From: Hassan Aït-Kaci <hak@ilog.com>
- Date: Tue, 17 Oct 2006 12:54:46 +0200
- To: public-rif-wg@w3.org
- Message-ID: <4534B676.5050804@ilog.com>
Hello,

This closes my action (87). Please refer to enclosed documents.

-hak
--
Hassan Aït-Kaci
ILOG, Inc. - Product Division R&D
tel/fax: +1 (604) 930-5603 - email: hak @ ilog . com
ACTION-87: http://www.w3.org/2005/rules/wg/track/actions/87
ON: Hassan Aït-Kaci
SUBJECT: write response to Christian's proposal making clarifications where necessary (see REF.)
REF.: http://lists.w3.org/Archives/Public/public-rif-wg/2006Jul/0048.html
------------------------------------------------------------------------------

In an email of Monday, 31 July 2006, CSMA proposed that a RIF/RAF classification scheme for rules should give "variables" and their scopes a "fair" treatment. Namely, in addition to providing classification discriminants related to what constitutes the two sides of a rule (i.e., the LHS and RHS), the RIF should also specify the nature and scope of the variables that appear in a rule.

I basically agree with CSMA on the essence of the idea. I would like to add some thoughts to his regarding what an overall rule classification scheme for RIF (Phase 1 or 2) should be. Being aware that this WG has yet to reach a consensus on several issues regarding rule classification, I would like to emphasize three points that I consider essential for any approach to achieving a RIF. Here are the three points.

------------------------------------------------------------------------
A. THE RIF SHOULD BE FORMULATED AT AN ADEQUATE LEVEL OF ABSTRACTION;
B. THE RIF MUST CAREFULLY IDENTIFY A CLASSIFICATION VOCABULARY THAT DOES NOT PRESUME OF ITS INTENDED USE;
C. THE RIF SHOULD ONLY HAVE ABSTRACT SYNTAX: IT SHOULD NOT BE CONCERNED WITH CONCRETE SYNTAX.
------------------------------------------------------------------------

Next, I elaborate on these points, explaining why they are needed and justifying their importance towards achieving an effective RIF. Finally, I sketch a (very rough) possible classification scheme that is based on these observations - this is meant only as an illustration, not as a formal proposal.

---------------------------------------------------------------------
A.
THE RIF SHOULD BE FORMULATED AT AN ADEQUATE LEVEL OF ABSTRACTION

The core kind of rules that the RIF will be used to represent has already been recognized (whether for Phase I or II). Indeed, the RIF Charter has made the choice to focus first on Horn Rules, then extend the scheme to Event-Condition-Action (ECA) Rules in order for it to encompass Production Rules and Reactive Rules. Although this is a sensible approach, I personally think that "Horn Rules" is too complex a concept, one that makes several undue assumptions. For starters, it forces a *specific data model* (viz., Herbrand terms) where such a data model should be *abstract* at this level. While it can be easily agreed that, indeed, Horn Rules make up a substantial bulk of the rulesets that would be interchanged through the RIF (e.g., all extant Prolog systems), many rule systems do not use this data model (e.g., production rule systems, reactive rule systems, and even some logical rule systems such as LIFE).

I contend that there is no need to set the level of abstraction so low as to systematically "build in" Herbrand terms into a rule system whether relevant or not. Instead, what is needed is a formal scheme that encompasses, among others, Horn Rules (i.e., Definite Clauses over Herbrand Terms), but also other deductive rule systems based on Definite Clauses over other data models. Such a formal scheme exists: Constraint Logic Programming (CLP). I claim that it lies at a better level of abstraction than Horn Rules in that (1) it allows dealing with a wider set of rules, (2) it does not demand a commitment to any specific data model, while (3) it covers Horn Rules as a simple instantiation of the constraint component of the scheme to equations over Herbrand Terms. In fact, the main idea behind CLP is to abstract the data model from the rule scheme using (logical) *variables* as the interface between the world of Rules and the world of Data.
Please refer to the technical summary of the CLP scheme that I am providing as an attachment to this message for a better idea.

Thus, this suggests that, rather than a 2-part entity (a RHS and a LHS) over arguments from a specific data model, a better rule representation is a 3-part entity consisting of the two sides together with a constraint. Viz., the simplest and most general rule is made of:

    L :- R | C.

    L: a relational atom over mutually distinct variables;
    R: a conjunction of relational atoms, each over its own mutually distinct set of variables;
    C: a constraint: an abstract form involving the variables of the LHS and the RHS.

Instantiating the nature of the constraints to specific forms yields specific rule systems over specific data models: if C is made of only equations over variables and constants, we get Datalog; if C is made of equations over Herbrand terms, we get Horn Rules; etc., ...

Using CLP to express a rule gives a better level of abstraction in that:

1. The semantics of a ruleset is parameterized by that of the constraint system expressing the data model, and is inherited for free with precious little that need be assumed about the data model.

2. This clean separation of concerns between rule and data allows a modular approach to classifying rule systems.

3. Going beyond Definite Clauses (which deal only with conjunctions of positive relational atoms) is also possible, as it is for classical Horn Rules, by extending the scheme with negation, disjunction, and nested quantification (i.e., as was proposed for the Core RIF Condition Language).

Indeed, abstracting the data model as constraints alleviates the need for ad-hockery even for sorted or typed objects, which may thus use the logical reading of constraints as an effective means to formalize logically non-logical RL features such as graph pattern-matching.

---------------------------------------------------------------------
B.
THE RIF MUST CAREFULLY IDENTIFY A CLASSIFICATION VOCABULARY THAT DOES NOT PRESUME OF ITS INTENDED USE

Because rules are eventually *used* in a specific intended way, the choice of words denoting their various components often misleadingly presumes this use, thus preventing alternative interpretations. For example, "head"/"body" and "antecedent"/"consequent", designating the two sides of a rule, anticipate a specific semantics which does not belong specifically to the nature of a rule - e.g., a Datalog rule may be used top-down or bottom-up (or both). Moreover, some choices of words may carry different meanings for different people (e.g., try asking several RIF-WG members what a rule's "head" and "body" are for them).

Another source of potential confusion is the notion of "Variable". In LP (and CLP as well), the only kind of variable is a "Logical Variable" (LV). These variables may be unbound or bound. They typically get bound by a process of constraint solving over a data model. Thus, in Prolog, this process is unification of Herbrand Terms; in ILOG's Rule Language (IRL), this process is pattern-matching of typed attributed objects; etc., ... Importantly, the binding of an LV is always *monotonic* in that it may only take a value that consistently refines the LV's previous values. In particular, there is no "destructive" assignment.

However, many extant rule systems involve other kinds of "variables" which are *not* logical. These are simple "programmatic variables" (PVs) (i.e., identifiers that carry a value in some domain and that can be updated non-monotonically - i.e., destructively). Both LVs and PVs have *scope*. While the scope of an LV is always restricted to a single rule (all of it, or some subparts), that of a PV may be local to a rule (or its parts) or global to a rule set.
It was the objective of CSMA's proposal to make this fact explicit by requiring that a rule should also specify its variables' scopes in addition to its (2-part or 3-part) internal structure. With the clarifications I have added, I fully share his opinion.

---------------------------------------------------------------------
C. THE RIF SHOULD ONLY HAVE ABSTRACT SYNTAX: IT SHOULD NOT BE CONCERNED WITH CONCRETE SYNTAX.

Since I started my participation in this WG (i.e., December 2005), I have noticed a systematic confusion in many discussions by several members. Namely, some of us are still grappling with the question of whether or not the RIF should be Just Another Rule Language (J.A.R.L.). This confusion has surfaced repeatedly, either explicitly or implicitly, and is unfortunately likely to linger until we all agree. I personally think that the RIF should not be a J.A.R.L., and this for the following reasons:

1. The RIF, being a Rule *Interchange* Format purporting to support interoperability among rule languages, is a formalism for expressing all the essential concepts making up rules and rulesets that need to be represented. As I tried to explain in my original quote from Peter Landin's "The Next 700 Programming Languages" (and as Frank McCabe recently reminded us), the RIF is a language space into which specific (rule) languages are to be mapped.

2. Such mappings are typically realized by parsing some RL's specific *concrete* surface syntax into an *abstract* syntax representation using elements of a RIF ontology. Such an abstract syntax is the closest one may come to "syntax" when speaking of the RIF. Indeed, by *RIF syntax* one should not mean human-readable syntax, but some representation thereof based on a consensual vocabulary.
Importantly, such an abstract syntax, contrary to the usual concrete syntax,
(a) is non-linear (i.e., it is tree- or graph-based);
(b) is not human-readable (i.e., it will be XML-based);
(c) has well-defined semantics allowing one or several operational interpretations;
(d) must be consensual (unlike concrete syntaxes, which can be mapped into a RIF-compliant AST).

Thus, any debate regarding a specific concrete human-readable syntax (such as, e.g., the RCL or HRL proposed by the REWERSE folks) is moot at best. The only issues that matter are the representation vocabulary to be used by the RIF and its specification as an XML-based ontology. The onus of parsing/translating a concrete syntax to its RIF form is on the client Rule Language, not the RIF. The RIF is and must be a well-defined *target* AST representation *formalism* (rather than language) not meant for human consumption.

3. Finally, the (on-going) debate about the nature of symbols used for the RIF constructs (the names of elements and attributes, whether they should be simple identifiers, URIs, IRIs, or whatever) is (IMHO) pointless and technically vacuous. Basically, whatever the lexical *signature* of the ontology that will eventually emerge as that of the RIF, it is irrelevant to the main issue of specifying a correct and sufficiently complete vocabulary. (It is not even syntax, it is morphology.) This being said, it is important that we, in the RIF WG, do not reinvent ontological constructs that have been proposed in other related scientific venues that have aimed at defining some XML vocabularies classifying some forms of rules (e.g., RuleML, REWERSE, PRR, etc., ...). Still, the choice of words used for the RIF is "up to isomorphism" of the signature of symbols used for it. These lexical considerations are, at this point, trivial and obfuscate the central issue of classifying rule languages.
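
As an illustration of point 2, here is a minimal sketch (Python, standard library only) of what a client-side translation into a tree-shaped, XML-based abstract syntax might look like. The rule is given directly as tuples rather than parsed from text, and every element and attribute name used (`Rule`, `lhs`, `rhs`, `Atom`, `Var`, `rel`, `name`) is a hypothetical placeholder, not a proposed RIF vocabulary; in the spirit of point 3, any isomorphic renaming would serve equally well.

```python
import xml.etree.ElementTree as ET


def atom_to_xml(relation, variables):
    """Build an <Atom> node with one <Var> child per argument (hypothetical names)."""
    node = ET.Element("Atom", {"rel": relation})
    for name in variables:
        ET.SubElement(node, "Var", {"name": name})
    return node


def rule_to_xml(lhs, rhs):
    """Map a rule (one lhs atom, a list of rhs atoms) to a tree-shaped abstract syntax."""
    rule = ET.Element("Rule")
    ET.SubElement(rule, "lhs").append(atom_to_xml(*lhs))
    rhs_node = ET.SubElement(rule, "rhs")
    for atom in rhs:
        rhs_node.append(atom_to_xml(*atom))
    return rule


# Concrete (Prolog-like) syntax:  ancestor(X, Z) :- parent(X, Y), ancestor(Y, Z).
tree = rule_to_xml(
    ("ancestor", ["X", "Z"]),
    [("parent", ["X", "Y"]), ("ancestor", ["Y", "Z"])],
)
xml_text = ET.tostring(tree, encoding="unicode")
print(xml_text)
```

The translation function lives on the side of the client rule language; the only thing both parties would need to agree on is the vocabulary of the resulting tree.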
---------------------------------------------------------------------
APPENDIX - Here is a rough sketch of (the beginning of) a RL classification scheme loosely based on the lines I discuss above...

( {Foo} means "set of Foo")

                 RuleKind
                    |
                    |
   +----------------+-------------+---- ...
   |                |             |
   |                |             |
Definite        Production     Rewrite   ...
                    |
                    |
          +---------+---------+
          |                   |
          |                   |
    BusinessRules         ECA-Rules


                Variable (name : QName, scope : Scope)
                    |
                    |
       +------------+-------------+
       |                          |
       |                          |
ProgramVariable            LogicalVariable


                           Scope
                             |
                             |
            +----------------+----------------------------+
            |                |                            |
            |                |                            |
         RuleSet             |                       Expression
    (kind : RuleKind)        |
                           Rule(ruleset : RuleSet,
                             |  name : QName,
                             |  vars : {Variable},
                             |  lhs : Expression,
                             |  rhs : Expression)
                             |
                             |
       +---------------------+-----------------------+
       |                                             |
       |                                             |
ProductionRule                                   CLPClause(ruleset: RuleSet(kind:Definite),
                                                     |     vars : {LogicalVariable(scope = self.ruleset)},
                                                     |     ... )
                                                     |
                                              DefiniteClause(ruleset : RuleSet(kind:Horn),
                                                             vars : {LogicalVariable(scope:self.ruleset)},
                                                             lhs : Atom,
                                                             rhs : {Atoms},
                                                             ... )
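
To make the CLPClause entry of the sketch more tangible, here is a small, hedged rendering in Python of the 3-part rule L :- R | C from section A. It is only an illustration under stated assumptions: the `constraint` field, the `Equation` and `Compound` types, and the `classify` function are hypothetical names introduced here (they do not appear in the sketch above), the `ruleset` and `vars` fields are left out for brevity, and constraints are restricted to equations so that the Datalog and Horn instantiations can be told apart.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class LogicalVariable:
    name: str


@dataclass(frozen=True)
class Compound:
    """A Herbrand term f(t1, ..., tn) with at least one argument."""
    functor: str
    args: Tuple["Term", ...]


# A term is a logical variable, a constant (str or int here), or a compound term.
Term = Union[LogicalVariable, str, int, Compound]


@dataclass(frozen=True)
class Atom:
    """A relational atom; its arguments must be mutually distinct variables."""
    relation: str
    args: Tuple[LogicalVariable, ...]

    def __post_init__(self):
        if len(set(self.args)) != len(self.args):
            raise ValueError("atom arguments must be mutually distinct")


@dataclass(frozen=True)
class Equation:
    """One equational constraint `left = right` (assumed constraint language)."""
    left: Term
    right: Term


@dataclass(frozen=True)
class CLPClause:
    """The 3-part rule  L :- R | C."""
    lhs: Atom
    rhs: Tuple[Atom, ...]
    constraint: Tuple[Equation, ...]


def classify(clause: CLPClause) -> str:
    """Name the rule system suggested by the shape of the constraint C."""
    sides = [t for eq in clause.constraint for t in (eq.left, eq.right)]
    if any(isinstance(t, Compound) for t in sides):
        return "Horn"     # equations over Herbrand terms
    return "Datalog"      # equations over variables and constants only


A, B, X1, Y1, X2, Y2, N, M = (
    LogicalVariable(n) for n in ("A", "B", "X1", "Y1", "X2", "Y2", "N", "M")
)

# grandparent(A, B) :- parent(X1, Y1), parent(X2, Y2) | A = X1, Y1 = X2, Y2 = B.
grandparent = CLPClause(
    lhs=Atom("grandparent", (A, B)),
    rhs=(Atom("parent", (X1, Y1)), Atom("parent", (X2, Y2))),
    constraint=(Equation(A, X1), Equation(Y1, X2), Equation(Y2, B)),
)

# nat(N) :- nat(M) | N = s(M).
successor = CLPClause(
    lhs=Atom("nat", (N,)),
    rhs=(Atom("nat", (M,)),),
    constraint=(Equation(N, Compound("s", (M,))),),
)

print(classify(grandparent))  # Datalog
print(classify(successor))    # Horn
```

Note that all sharing of variables between atoms is pushed into C, so the relational part of the clause stays the same whatever the data model; only the constraint language would change for another rule system.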
Attachments
- application/pdf attachment: clp-summary.pdf
Received on Tuesday, 17 October 2006 10:54:36 UTC