- From: Hassan Aït-Kaci <hak@ilog.com>
- Date: Tue, 17 Oct 2006 12:54:46 +0200
- To: public-rif-wg@w3.org
- Message-ID: <4534B676.5050804@ilog.com>
Hello,
This closes my action (87). Please refer to enclosed documents.
-hak
--
Hassan Aït-Kaci
ILOG, Inc. - Product Division R&D
tel/fax: +1 (604) 930-5603 - email: hak @ ilog . com
ACTION-87: http://www.w3.org/2005/rules/wg/track/actions/87
ON: Hassan Aït-Kaci
SUBJECT: write response to Christian's proposal making clarifications
where necessary (see REF.)
REF.: http://lists.w3.org/Archives/Public/public-rif-wg/2006Jul/0048.html
------------------------------------------------------------------------------
In an email of Monday, 31 July, 2006, CSMA proposed that a RIF/RAF
classification scheme for rules should give "variables" and their scopes
a "fair" treatment. Namely, in addition to providing classification
discriminants related to what constitute the two sides of a rule (i.e.,
the LHS and RHS), the RIF should also specify the nature and scope of
variables that appear in a rule.
I basically agree with CSMA on the essence of the idea. I would like to
add some thoughts to his regarding what an overall rule classification
scheme for RIF (Phase 1 or 2) should look like. Being aware that this
WG has yet to reach
a consensus on several issues regarding rule classification, I would
like to emphasize three important points that I consider essential for
any approach to achieve a RIF.
Here are the three points.
------------------------------------------------------------------------
A. THE RIF SHOULD BE FORMULATED AT AN ADEQUATE LEVEL OF ABSTRACTION;
B. THE RIF MUST CAREFULLY IDENTIFY A CLASSIFICATION VOCABULARY THAT DOES NOT
PRESUME OF ITS INTENDED USE;
C. THE RIF SHOULD ONLY HAVE ABSTRACT SYNTAX: IT SHOULD NOT BE CONCERNED
WITH CONCRETE SYNTAX.
------------------------------------------------------------------------
Next, I elaborate on these points, explicating their need and justifying
their importance towards achieving an effective RIF. Finally, I sketch a
(very rough) possible classification scheme that is based on these
observations - this is meant only as illustration, not as a formal
proposal.
---------------------------------------------------------------------
A. THE RIF SHOULD BE FORMULATED AT AN ADEQUATE LEVEL OF ABSTRACTION
The core kind of rules that the RIF will be used to represent has
already been recognized (whether for Phase I or II). Indeed, the RIF
Charter has made the choice to focus first on Horn Rules, then extend
the scheme to Event-Condition-Action (ECA) Rules in order for it
to encompass Production Rules and Reactive Rules. Although this is a
sensible approach, I personally think that "Horn Rules" is too
complex a concept, one that makes several undue assumptions.
For starters, it forces a *specific data model* (viz., Herbrand
terms) where such a data model should be *abstract* at this
level. While it can be easily agreed that, indeed, Horn Rules make up
a substantial bulk of rulesets that would be interchanged through the
RIF (e.g., all extant Prolog systems), many rule systems do not use
this data model (e.g., production rule systems, reactive rule
systems, and even some logical rule systems such as LIFE). I contend
that there is no need to set the level of abstraction so low as to
systematically "build in" Herbrand terms into a rule system whether
relevant or not. Instead, what is needed is a formal scheme that
encompasses, among others, Horn Rules (i.e., Definite Clauses over
Herbrand Terms), but also other deductive rule systems based on
Definite Clauses over other data models.
Such a formal scheme exists: Constraint Logic Programming (CLP). I
claim that it lies at a better level of abstraction than Horn Rules in
that (1) it allows dealing with a wider set of rules, (2) it does not
demand a commitment to any specific data model, while (3) it covers
Horn Rules as a simple instantiation of the constraint component of
the scheme to equations over Herbrand Terms.
In fact, the main idea behind CLP is to abstract the data model from
the rule scheme using (logical) *variables* as the interface between
the world of Rules and the world of Data. Please refer to the CLP
scheme's technical summary I am providing as an attachment to this
message for a better idea.
This suggests that, rather than a 2-part entity (an RHS and an
LHS) over arguments from a specific data model, a better rule
representation is a 3-part entity consisting of the two sides
together with a constraint. Viz., the simplest and most general rule
is made of:
L :- R | C.
L: a relational atom over mutually distinct variables;
R: a conjunction of relational atoms, each over its own mutually
distinct set of variables;
C: a constraint: an abstract form involving the variables of the LHS
and the RHS.
Instantiating the nature of the constraints to specific forms yields
specific rule systems over specific data models:
If C is made of only equations over variables and constants, we get
Datalog; if C is made of equations over Herbrand terms, we get Horn
Rules; etc., ...
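To make the 3-part form concrete, here is a minimal Python sketch. It is
an illustration only: every name in it (Var, Const, Fn, Atom, Eq, Rule,
classify, v) is hypothetical, and it assumes that the constraint C is a
plain conjunction of equations, which is just one possible constraint
system. The two sample rules are the usual grandparent and append
clauses, rewritten so that L and R range over distinct variables and all
argument structure moves into C.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Const:
    value: str


@dataclass(frozen=True)
class Fn:
    """A compound Herbrand term f(t1, ..., tn)."""
    functor: str
    args: Tuple["Term", ...]


Term = Union[Var, Const, Fn]


@dataclass(frozen=True)
class Atom:
    """A relational atom; its arguments are variables only."""
    pred: str
    args: Tuple[Var, ...]


@dataclass(frozen=True)
class Eq:
    """One equation of the constraint C (assumed to be a conjunction)."""
    left: Term
    right: Term


@dataclass(frozen=True)
class Rule:
    lhs: Atom                    # L
    rhs: Tuple[Atom, ...]        # R
    constraint: Tuple[Eq, ...]   # C


def classify(rule: Rule) -> str:
    """Name the rule system obtained from the form of the constraint."""
    sides = [t for eq in rule.constraint for t in (eq.left, eq.right)]
    if all(isinstance(t, (Var, Const)) for t in sides):
        return "Datalog"   # equations over variables and constants only
    return "Horn"          # equations over Herbrand terms


def v(*names):
    """Helper: build a tuple of variables from their names."""
    return tuple(Var(n) for n in names)


# grandparent(X, Z) :- parent(X, Y), parent(Y, Z).
grandparent = Rule(
    lhs=Atom("grandparent", v("A", "B")),
    rhs=(Atom("parent", v("C", "D")), Atom("parent", v("E", "F"))),
    constraint=(Eq(Var("A"), Var("C")), Eq(Var("D"), Var("E")),
                Eq(Var("F"), Var("B"))),
)

# append([H|T], Ys, [H|Zs]) :- append(T, Ys, Zs).
append = Rule(
    lhs=Atom("append", v("A", "B", "C")),
    rhs=(Atom("append", v("T", "Ys", "Zs")),),
    constraint=(Eq(Var("A"), Fn("cons", (Var("H"), Var("T")))),
                Eq(Var("B"), Var("Ys")),
                Eq(Var("C"), Fn("cons", (Var("H"), Var("Zs"))))),
)

print(classify(grandparent))   # Datalog
print(classify(append))        # Horn
```

The point of the sketch is that Rule and Atom never mention the data
model: only the Eq/Term part would change for another constraint system.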
Using CLP to express a rule gives a better level of
abstraction in that:
1. The semantics of a ruleset is parameterized by that of the
constraint system expressing the data model and is inherited
for free, with precious little that need be assumed about the
data model.
2. This clean separation of concerns between rule and data
allows a modular approach to classifying rule systems.
3. Going beyond Definite Clauses (which deal only with
conjunctions of positive relational atoms) is also possible, as
it is for classical Horn Rules, by extending the scheme with
negation, disjunction, and nested quantification (i.e., as was
proposed for the Core RIF Condition Language).
Indeed, abstracting the data model as constraints alleviates the
need for ad-hockery, even for sorted or typed objects, which may thus
use the logical reading of constraints as an effective means to
give a logical formalization of non-logical RL features such as graph
pattern-matching.
---------------------------------------------------------------------
B. THE RIF MUST CAREFULLY IDENTIFY A CLASSIFICATION VOCABULARY THAT DOES
NOT PRESUME OF ITS INTENDED USE
Because rules are eventually *used* in a specific intended way, the
choice of words denoting their various components often misleadingly
presumes this use, thus preventing alternative interpretations.
For example, "head"/"body" and "antecedent"/"consequent", designating
the two sides of a rule, anticipate a specific semantics which does not
belong specifically to the nature of a rule - e.g., a Datalog rule
may be used top-down or bottom-up (or both). Moreover, some choices of
words may carry different meanings for different people (e.g., try
asking several RIF-WG members what a rule's "head" and "body" are for
them).
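The remark that one and the same Datalog rule may be used top-down or
bottom-up can be illustrated with a small Python sketch. The relation
PARENT, the individual names and both function names are invented for
the example; the top-down version assumes an acyclic parent relation,
since it would otherwise not terminate.

```python
# Facts: parent(ann, bob). parent(bob, cid). parent(cid, dee).
PARENT = {("ann", "bob"), ("bob", "cid"), ("cid", "dee")}

# Rules, written once and read in two ways below:
#   ancestor(X, Y) :- parent(X, Y).
#   ancestor(X, Y) :- parent(X, Z), ancestor(Z, Y).


def ancestors_bottom_up(parent):
    """Forward reading: derive new facts until a fixpoint is reached."""
    derived = set(parent)                  # first rule
    while True:
        new = {(x, y)
               for (x, z) in parent
               for (z2, y) in derived
               if z == z2} - derived       # second rule
        if not new:
            return derived
        derived |= new


def is_ancestor_top_down(parent, x, y):
    """Backward reading: reduce the goal ancestor(x, y) to subgoals."""
    if (x, y) in parent:                   # first rule
        return True
    return any(is_ancestor_top_down(parent, z, y)
               for (x2, z) in parent if x2 == x)   # second rule


all_pairs = ancestors_bottom_up(PARENT)
print(("ann", "dee") in all_pairs)                   # True
print(is_ancestor_top_down(PARENT, "ann", "dee"))    # True
```

Neither reading is privileged by the rule itself, which is why a word
like "head" or "consequent" says more about the intended use than about
the rule.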
Another source of potential confusion is the notion of "Variable". In
LP (and CLP as well), the only kind of variable is a "Logical
Variable" (LV). These variables may be unbound or bound. They
typically get bound by a process of constraint solving over a data
model. Thus, in Prolog, this process is unification of Herbrand Terms;
in ILOG's Rule Language (IRL), this process is pattern-matching of
typed attributed objects; etc., ... Importantly, the binding of an LV is
always *monotonic* in that it may only take a value that consistently
refines the LV's previous values. In particular, there is no
"destructive" assignment.
However, many extant rule systems involve other kinds of "variables"
which are *not* logical. These are simple "programmatic variables"
(PVs) (i.e., identifiers that carry a value in some domain and that
can be updated non-monotonically - i.e., destructively).
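The difference between the two kinds of variables can be sketched in
Python as follows. This is only an illustration under a strong
assumption: the data model is taken to be a finite set of candidate
values, so that "refining" an LV means narrowing that set. The class and
method names (LogicalVariable, ProgramVariable, refine, assign,
Inconsistent) are hypothetical.

```python
class Inconsistent(Exception):
    """Raised when a refinement would leave a logical variable no value."""


class LogicalVariable:
    """Its set of possible values may only shrink (monotonic binding)."""

    def __init__(self, name, domain):
        self.name = name
        self.domain = frozenset(domain)

    def refine(self, allowed):
        narrowed = self.domain & frozenset(allowed)
        if not narrowed:
            raise Inconsistent(f"{self.name}: no value is left")
        self.domain = narrowed

    @property
    def bound(self):
        return len(self.domain) == 1


class ProgramVariable:
    """Carries a value that may be overwritten (destructive update)."""

    def __init__(self, name, value=None):
        self.name = name
        self.value = value

    def assign(self, value):
        self.value = value


x = LogicalVariable("X", range(1, 6))   # X in {1, 2, 3, 4, 5}
x.refine({2, 4})                        # X in {2, 4}
x.refine({4, 5})                        # X = 4, consistent with the above
print(x.bound, sorted(x.domain))        # True [4]
try:
    x.refine({1})                       # contradicts the earlier steps
except Inconsistent:
    print("refused")                    # refused; X stays bound to 4

counter = ProgramVariable("counter", 0)
counter.assign(10)
counter.assign(3)                       # the earlier value is simply lost
print(counter.value)                    # 3
```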
Both LVs and PVs have *scope*. While the scope of an LV is always
restricted to a single rule (all of it, or some subparts), that of a
PV may be local to a rule (or its parts) or global to a rule set.
It was the objective of CSMA's proposal to make this fact explicit by
requiring that a rule should also specify its variables' scopes in
addition to its (2-part or 3-part) internal structure. With the
clarifications I have added, I fully share his opinion.
---------------------------------------------------------------------
C. THE RIF SHOULD ONLY HAVE ABSTRACT SYNTAX: IT SHOULD NOT BE CONCERNED
WITH CONCRETE SYNTAX.
Since I started my participation in this WG (i.e., December 2005), I
have noticed a systematic confusion in many discussions by several
members. Namely, some of us are still grappling with the question of
whether the RIF should - or should not - be Just Another Rule Language
(J.A.R.L.). This confusion has surfaced repeatedly, either explicitly
or implicitly, and is unfortunately likely to linger until we all
agree.
I personally think that the RIF should not be a J.A.R.L., and this
for the following reasons:
1. The RIF, being a Rule *Interchange* Format purporting to support
interoperability among rule languages, is a formalism for
expressing all the essential concepts making up rules and rulesets
that need to be represented. As I tried to explain in my original
quote from Peter Landin's "The Next 700 Programming Languages"
(and as Frank McCabe recently reminded us), the RIF is a language
space in which specific (rule) languages are to be mapped.
2. Such mappings are typically realized by parsing some RL's specific
*concrete* surface syntax into an *abstract* syntax representation
using elements of a RIF ontology. Such an abstract syntax is the
closest one may come to "syntax" when speaking of the RIF. Indeed,
by *RIF syntax* one should not mean human-readable syntax, but
some representation thereof based on a consensual vocabulary.
Importantly, such an abstract syntax, contrary to usual concrete
syntax,
(a) is non-linear (i.e., it is tree- or graph-based);
(b) is not human-readable (i.e., it will be XML-based);
(c) has well-defined semantics allowing one or several
operational interpretations;
(d) must be consensual (unlike concrete syntaxes that
can be mapped into a RIF-compliant AST).
Thus, any debate regarding a specific concrete human-readable
syntax (such as the RCL or HRL proposed by the REWERSE
folks) is moot at best. The only issues that matter are the
representation vocabulary to be used by RIF and its specification
as an XML-based ontology. The onus of parsing/translating a
concrete syntax to its RIF form is on the client Rule Language,
not the RIF. The RIF is and must be a well-defined *target* AST
representation *formalism* (rather than language) not meant for
human consumption.
3. Finally, the (on-going) debate about the nature of symbols used
for the RIF constructs (the names of elements and attributes, whether
they should be simple identifiers, URIs, IRIs, or whatever) is
(IMHO) pointless and technically vacuous. Basically, the
lexical *signature* of the ontology that will eventually emerge as
that of the RIF, whatever it may be, is irrelevant to the main issue
of specifying a correct and sufficiently complete vocabulary. (It is
not even syntax, it is morphology.)
This being said, it is important that we, in the RIF WG, do not
reinvent ontological constructs that have been proposed in other
related scientific venues that have aimed at defining some XML
vocabularies classifying some forms of rules (e.g., RuleML,
REWERSE, PRR, etc., ...). Still, the choice of words used for the
RIF is "up to isomorphism" of the signature of symbols used for
it. These lexical considerations are, at this point, trivial and
obfuscate the central issue of classifying rule languages.
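As an illustration of point 2 above, the following Python sketch maps
two different concrete syntaxes onto one and the same abstract
representation. Everything in it is an assumption made for the example:
the two toy surface syntaxes, the parser functions, and the Atom/Rule
classes standing in for a RIF-compliant AST (which, as said, would be
XML-based rather than a set of Python objects).

```python
import re
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Atom:
    pred: str
    args: Tuple[str, ...]


@dataclass(frozen=True)
class Rule:
    lhs: Atom
    rhs: Tuple[Atom, ...]


ATOM = re.compile(r"\s*(\w+)\s*\(([^()]*)\)\s*")


def parse_atom(text):
    """Parse one flat atom such as 'p(X, Y)'."""
    match = ATOM.fullmatch(text)
    if match is None:
        raise ValueError(f"not an atom: {text!r}")
    args = tuple(arg.strip() for arg in match.group(2).split(","))
    return Atom(match.group(1), args)


def parse_atoms(text):
    """Parse every flat atom found in text, whatever separates them."""
    return tuple(parse_atom(a) for a in re.findall(r"\w+\s*\([^()]*\)", text))


def parse_prolog_style(text):
    """Concrete syntax 1:  gp(X, Z) :- p(X, Y), p(Y, Z)."""
    left, right = text.rstrip(". ").split(":-")
    return Rule(parse_atom(left), parse_atoms(right))


def parse_if_then_style(text):
    """Concrete syntax 2:  IF p(X, Y) AND p(Y, Z) THEN gp(X, Z)"""
    condition, conclusion = text.split(" THEN ")
    return Rule(parse_atom(conclusion), parse_atoms(condition))


r1 = parse_prolog_style("gp(X, Z) :- p(X, Y), p(Y, Z).")
r2 = parse_if_then_style("IF p(X, Y) AND p(Y, Z) THEN gp(X, Z)")
print(r1 == r2)   # True
```

Each client language owns its parser; only the Atom/Rule vocabulary has
to be agreed upon.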
---------------------------------------------------------------------
APPENDIX -
Here is a rough sketch of (the beginning of) a RL classification scheme
loosely based on the lines I discuss above... ( {Foo} means "set of
Foo")
                   RuleKind
                       |
                       |
      +----------------+-------------+---- ...
      |                |             |
      |                |             |
  Definite        Production      Rewrite ...
                       |
                       |
             +---------+---------+
             |                   |
             |                   |
       BusinessRules         ECA-Rules

                Variable
             (name  : QName,
              scope : Scope)
                    |
                    |
       +------------+-------------+
       |                          |
       |                          |
ProgramVariable            LogicalVariable

                      Scope
                        |
                        |
       +----------------+----------------------------+
       |                |                            |
       |                |                            |
    RuleSet             |                       Expression
(kind : RuleKind)       |
                      Rule(ruleset : RuleSet,
                        |  name    : QName,
                        |  vars    : {Variable},
                        |  lhs     : Expression,
                        |  rhs     : Expression)
                        |
                        |
  +---------------------+-----------------------+
  |                                             |
  |                                             |
  |                                      ProductionRule
CLPClause(ruleset: RuleSet(kind:Definite),
  |       vars   : {LogicalVariable(scope = self.ruleset)},
  |       ... )
  |
DefiniteClause(ruleset : RuleSet(kind:Horn),
               vars    : {LogicalVariable(scope:self.ruleset)},
               lhs     : Atom,
               rhs     : {Atoms},
               ... )
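
For readers who prefer code to diagrams, the sketch above can be
approximated by the following Python classes. This is a loose,
hypothetical transcription and not a proposal: the `text` field of
Expression is a placeholder added only so that expressions can be
built, the elided "..." parts are left out, and the "Horn" kind
mentioned for DefiniteClause is not modelled because it does not appear
in the RuleKind tree.

```python
from dataclasses import dataclass
from typing import FrozenSet, Type


class RuleKind: ...
class Definite(RuleKind): ...
class Production(RuleKind): ...
class Rewrite(RuleKind): ...
class BusinessRules(Production): ...
class ECARules(Production): ...


class Scope:
    """Anything a variable can be scoped to."""


@dataclass(eq=False)
class RuleSet(Scope):
    kind: Type[RuleKind]


@dataclass(eq=False)
class Expression(Scope):
    text: str   # placeholder content, an assumption of this sketch


class Atom(Expression): ...


@dataclass(eq=False)
class Variable:
    name: str
    scope: Scope


class ProgramVariable(Variable): ...
class LogicalVariable(Variable): ...


@dataclass(eq=False)
class Rule(Scope):
    ruleset: RuleSet
    name: str
    vars: FrozenSet[Variable]
    lhs: Expression
    rhs: Expression


@dataclass(eq=False)
class ProductionRule(Rule):
    pass


@dataclass(eq=False)
class CLPClause(Rule):
    def __post_init__(self):
        if not issubclass(self.ruleset.kind, Definite):
            raise TypeError("a CLPClause needs a rule set of kind Definite")
        if not all(isinstance(x, LogicalVariable) for x in self.vars):
            raise TypeError("a CLPClause may only use logical variables")


@dataclass(eq=False)
class DefiniteClause(CLPClause):
    # Here rhs is a set of atoms rather than a single expression.
    def __post_init__(self):
        super().__post_init__()
        if not isinstance(self.lhs, Atom):
            raise TypeError("the lhs of a DefiniteClause is an Atom")
        if not all(isinstance(a, Atom) for a in self.rhs):
            raise TypeError("the rhs of a DefiniteClause is a set of Atoms")


family = RuleSet(kind=Definite)
x = LogicalVariable("X", scope=family)
y = LogicalVariable("Y", scope=family)
clause = DefiniteClause(ruleset=family, name="r1", vars=frozenset({x, y}),
                        lhs=Atom("ancestor(X, Y)"),
                        rhs=frozenset({Atom("parent(X, Y)")}))
```

Building the same clause with a ProgramVariable, or inside a rule set
of kind Production, raises a TypeError in this sketch.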
Attachments
- application/pdf attachment: clp-summary.pdf
Received on Tuesday, 17 October 2006 10:54:36 UTC