- From: Hassan Aït-Kaci <hak@ilog.com>
- Date: Thu, 04 May 2006 03:41:08 -0700
- To: public-rif-wg@w3.org
- Message-ID: <4459DA44.7010103@ilog.com>
Hello,
This is written in reaction to my being "invited" by the RIF co-chair
CSMA to share some thoughts on the "Extensible Design" design document
submitted to the RIF WG (see
http://lists.w3.org/Archives/Public/public-rif-wg/2006Apr/0068.html).
As far as I understand it, this document proposes a sketch of a process
for defining a means to achieve interchange (and, as a consequence,
interoperability) between diverse rule-based idioms. It recognizes that a
RIF language should span a family of languages sharing a substantial
amount of syntactic and semantic concepts, and should provide a means of
extension.
This assumption makes sense to me insofar as rule-based systems
officially abiding by a RIF standard would at least guarantee that the
parsers of their own idiosyncratic concrete syntaxes build Abstract
Syntax Trees (ASTs) serializable in XML using the published XML schemas
defining the RIF standard. Such a RIF-standardized serialization could
then serve as the accepted canonical XML representation of ASTs of
rule-based programs or expressions, easily digestible by any RIF-abiding
system that has the means of interpreting the constructs.
For example, imagine that System A is, say, Alain Colmerauer's Prolog-IV
and System B is SICStus Prolog. The two idioms have very different
concrete syntaxes but share a substantial part of their semantics
(Herbrand terms, unification, Prolog's depth-first left-to-right
resolution, and even some constraint solving such as alldiff, etc.).
Let us pretend that both System A and System B are officially
RIF-abiding. Then each should be able to emit an XML serialization of
its parsed programs' ASTs, faithfully following the RIF standard
annotations. It then becomes a simple matter for anyone using System A
to ingest a RIF-serialized System B program and do something with it, or
with part of it.
The same could be said for System A being ILOG's JRules, say, and System
B being Fair Isaac's Blaze Advisor.
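To make the interchange idea concrete, here is a minimal Python sketch.
The element names (Atom, Rel, Var, Const) are made up for illustration
and are not the actual RIF schema: one system emits its AST as canonical
XML, and another ingests that XML without ever seeing the first system's
concrete syntax.

```python
# Illustration only (hypothetical element names, not the RIF schema):
# System B emits its AST as canonical XML, and System A ingests it
# without ever seeing System B's concrete syntax.
import xml.etree.ElementTree as ET

# System B serializes the condition  q(X, a)  from its own AST.
atom = ET.Element("Atom")
ET.SubElement(atom, "Rel").text = "q"
ET.SubElement(atom, "Var").text = "X"
ET.SubElement(atom, "Const").text = "a"
wire = ET.tostring(atom, encoding="unicode")

# System A ingests the serialized form and rebuilds its own term,
# here rendered back into a Prolog-like concrete syntax.
def to_prolog(elem):
    rel = elem.find("Rel").text
    args = [child.text for child in elem if child.tag != "Rel"]
    return f"{rel}({', '.join(args)})"

print(to_prolog(ET.fromstring(wire)))   # q(X, a)
```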
I basically like the main lines of Boley et al.'s approach for the
following reasons.
1. It starts from a relatively modest and arguably consensual basis (at
   least within this RIF WG) - that of a representation for rule
   conditions. It makes sense to start with this and proceed
   incrementally from there to a more complete collection of languages.
2. It uses a formal linguistic approach - which is the natural (and
right) thing to do if we purport to describe families of formal
languages.
3. It can lead to a natural language classification scheme (such as the
one proposed in the RIF-RAF - see
http://www.w3.org/2005/rules/wg/wiki/Rulesystem_Arrangement_Framework).
   Such a classification scheme can be formalized using Formal Concept
   Analysis (see below).
4. It offers an incremental, layered process for extending whatever we
   can successfully represent.
Anyway, after reading the proposal draft, I took the proposed Rule
Condition Language BNF and built a quick Java application using Jacc
(Just Another Compiler Compiler), a tool that I implemented at SFU and
ILOG (it is now ILOG's property). Jacc has the pleasing feature of
allowing automatic XML serialization from the AST (besides generating a
working parser and hyperlinked HTML documentation) based on a yacc-like
grammar. This shows that RIF-abiding languages implementing their
parsers in Jacc could automatically inherit the XML-serialized RIF
representation. To give an idea, I produced a full parser for the baby
Rule Condition Language (RCL) that simply generates its XML
serialization. See the attached RCLDoc.zip file (unzip and open the file
doc.html) for details. Comments are welcome. The exercise shows that
one can come a long way for free with formal grammars ... :-)
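For readers without Jacc at hand, the exercise can be approximated in a
few lines of Python. The grammar below is a toy simplification of my own
making, not the actual RCL BNF, and the XML element names are invented:
a small recursive-descent parser whose AST serializes to XML essentially
for free.

```python
# Toy version of the exercise (grammar and element names invented for
# illustration, not the actual RCL BNF): parse a condition such as
# "p(X) and q(X, a)" and serialize its AST to XML.
import re
import xml.etree.ElementTree as ET

TOKEN = re.compile(r"\s*(?:(and\b)|([A-Z]\w*)|([a-z]\w*)|([(),]))")

def tokenize(text):
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise SyntaxError(f"bad input at position {pos}")
        kw, var, ident, punct = m.groups()
        if kw:
            tokens.append(("AND", kw))
        elif var:
            tokens.append(("VAR", var))        # uppercase = variable
        elif ident:
            tokens.append(("IDENT", ident))    # lowercase = relation/constant
        else:
            tokens.append(("PUNCT", punct))
        pos = m.end()
    return tokens

def parse(tokens):
    # condition ::= atom ("and" atom)*
    # atom      ::= ident [ "(" term { "," term } ")" ]
    def atom(i):
        kind, name = tokens[i]
        if kind != "IDENT":
            raise SyntaxError(f"expected relation name, got {name!r}")
        node = ET.Element("Atom")
        ET.SubElement(node, "Rel").text = name
        i += 1
        if i < len(tokens) and tokens[i] == ("PUNCT", "("):
            i += 1
            while tokens[i] != ("PUNCT", ")"):
                kind, name = tokens[i]
                ET.SubElement(node, "Var" if kind == "VAR" else "Const").text = name
                i += 1
                if tokens[i] == ("PUNCT", ","):
                    i += 1
            i += 1
        return node, i

    node, i = atom(0)
    conjuncts = [node]
    while i < len(tokens) and tokens[i][0] == "AND":
        node, i = atom(i + 1)
        conjuncts.append(node)
    if len(conjuncts) == 1:
        return conjuncts[0]
    root = ET.Element("And")
    root.extend(conjuncts)
    return root

print(ET.tostring(parse(tokenize("p(X) and q(X, a)")), encoding="unicode"))
```

Jacc of course does this from the grammar alone; the point of the sketch
is only that once the AST exists, the canonical serialization is nearly
mechanical.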
In conclusion, I think that an RCL-like proposal (and extensions) is a
viable concrete way to proceed for defining a RIF modulo agreeing on a
standard XML schema (à la RuleML, OWL, RDF ...).
-hak
Appendix:
Ganter/Wille's Formal Concept Analysis
I still marvel at the simplicity, elegance, and effectiveness of
FCA's basic idea. This methodology ought to be more widely known and
used for automatic ontology extraction. For instance, for the W3C
Semantic Web, ontology representation languages (such as OWL, etc.)
must all start with some form of well-defined ontology before being
of any use. The RIF WG's objective is to define a Rule Interchange
Format. What "rule", "interchange", and "format" mean is yet to be
finalized for there to be a clear consensus, especially in such a
large Working Group. One problem we are facing today is how to
classify the collection of rule systems known and advocated by the WG
members into an ontology of those systems, along a set of features
and attributes spanning several dimensions - semantic, syntactic,
pragmatic, etc.! See for example the "Swiss Knife" example in
[Bernhard Ganter and Rudolf Wille, "Conceptual Scaling", in Fred
Roberts, Ed., Applications of Combinatorics and Graph Theory to the
Biological and Social Sciences, pp. 139-167, Springer Verlag, 1989].
See also the PDF attachment for a simpler example.
Thus, as the FCA Conceptual Scaling (CS) method advocates, we could
follow a bottom-up approach to deriving a Rule System Ontology.
Starting with a set of objects (the rule systems) obtained from all
the WG members describing the systems they know, we could derive the
set of relevant attributes per dimension (i.e., the union of all the
attributes described for each system along this dimension). Then,
thanks to CS, it would be a simple matter to form the boolean (or
perhaps even similarity-measure) matrix (or hyper-matrix for higher
dimensions) classifying Systems vs. Features, from which we may
automatically obtain a faithful and conservative ontology of rule
systems where the lattice of classes is formed by union and
intersection of their attributes.
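To sketch what this bottom-up derivation looks like, here is a toy
Python rendition of the extent/intent closure at the heart of FCA. The
context (systems and attributes) is invented purely for illustration and
is not a claim about any real rule system.

```python
# Toy FCA sketch: derive the formal concepts (extent, intent pairs) of
# a boolean Systems-vs-Features context.  The context is INVENTED for
# illustration; it describes no real system.
from itertools import combinations

context = {
    "SysA": {"unification", "backward-chaining"},
    "SysB": {"unification", "backward-chaining", "constraints"},
    "SysC": {"production-rules"},
}

def extent(attrs):
    """Systems having all the given attributes."""
    return {s for s, a in context.items() if attrs <= a}

def intent(systems):
    """Attributes shared by all the given systems."""
    if not systems:
        return set.union(*context.values())
    return set.intersection(*(context[s] for s in systems))

# A formal concept is a pair (extent, intent) closed under the two maps;
# enumerating all attribute subsets and closing them finds every concept.
all_attrs = set.union(*context.values())
concepts = set()
for r in range(len(all_attrs) + 1):
    for attrs in combinations(sorted(all_attrs), r):
        e = extent(set(attrs))
        concepts.add((frozenset(e), frozenset(intent(e))))

for e, i in sorted(concepts, key=lambda c: (len(c[0]), sorted(c[0]))):
    print(sorted(e), "<->", sorted(i))
```

The printed pairs, ordered by inclusion of their extents, are exactly
the lattice of classes mentioned above: classes merge by intersecting
attributes and split by uniting them.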
Note that the RIF WG preferred a top-down approach, where the set of
attributes is potentially quite large and irrelevant to most actual
systems. Such an ontology has a harder time emerging, since it
requires a global view of all systems for anyone to decide a priori
what the relevant dimensions and attributes are. Further, this often
leads to confusion among attribute dimensions (e.g., semantic vs.
syntactic vs. pragmatic) or to irrelevant attributes.
Be that as it may, there exist systems offering a friendly interactive
graphical environment for users to develop and visualize
multi-dimensional ontologies. I have not used any, but I, for one,
would be most interested in doing so (see, e.g., the Toscana system,
based on the Ganter/Wille method, using Formal Concept Analysis to
build and visualize concept lattices from attributed objects:
http://gdea.informatik.uni-koeln.de/archive/00000166/).
I suggest that we get our inspiration from the Ganter/Wille
methodology starting from the RIF-RAF classification scheme to derive
an adequate RIF representation along the line of what is proposed in
http://lists.w3.org/Archives/Public/public-rif-wg/2006Apr/0068.html.
--
Hassan Aït-Kaci
ILOG, Inc. - Product Division R&D
tel/fax: +1 (604) 930-5603 - email: hak @ ilog . com
Attachments
- application/x-zip-compressed attachment: RCLDoc.zip
- application/pdf attachment: LogicalScaling.pdf
Received on Thursday, 4 May 2006 10:40:05 UTC