- From: Hassan Aït-Kaci <hak@ilog.com>
- Date: Thu, 04 May 2006 03:41:08 -0700
- To: public-rif-wg@w3.org
- Message-ID: <4459DA44.7010103@ilog.com>
Hello,

This is written in reaction to my being "invited" by the RIF co-chair CSMA to share some thoughts on the "Extensible Design" document submitted to the RIF WG (see http://lists.w3.org/Archives/Public/public-rif-wg/2006Apr/0068.html). As far as I understand it, this document proposes a sketch of a process for defining a means of interchange (and, as a consequence, interoperability) between diverse rule-based idioms. It recognizes that a RIF language should span a family of languages sharing a substantial amount of syntactic and semantic concepts, and should provide a means of extension.

The assumption this document makes is sensible to me insofar as rule-based systems officially abiding by a RIF standard would at least guarantee that the parsers for their own idiosyncratic concrete syntaxes build Abstract Syntax Trees (ASTs) serializable in XML form using the published XML schemas defining the RIF standard. Such a RIF-standardized serialization could then be used as the accepted canonical XML representation for the ASTs of rule-based programs or expressions, easily digestible by any RIF-abiding system that has the means of interpreting the constructs.

For example, imagine that System A is, say, Alain Colmerauer's Prolog-IV and System B is SICStus Prolog. The two idioms have very different concrete syntaxes but share a substantial part of their semantics (Herbrand terms, unification, Prolog's depth-first left-to-right resolution, and even some constraint solving such as alldiff, etc.). Let us pretend that both System A and System B are officially RIF-abiding. Then each should be able to emit an XML serialization of its parsed programs' ASTs that faithfully follows the RIF annotation standard. It then becomes a simple matter for anyone using System A to swallow a RIF-serialized System B program and do something with it, or part of it. The same could be said with System A being ILOG's JRules, say, and System B being Fair Isaac's Blaze Advisor.
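To make the interchange scenario concrete, here is a minimal sketch (in Python, for illustration only) of how two systems with different concrete syntaxes could share one canonical XML serialization of their ASTs. The element names (Implies, Atom, Rel, Var) are hypothetical placeholders of my own, not an actual RIF schema:

```python
# Sketch: serialize a nested-tuple AST into XML, the way two RIF-abiding
# systems might exchange a parsed rule. The vocabulary (Implies, Atom,
# Rel, Var) is a made-up stand-in for a future RIF XML schema.
import xml.etree.ElementTree as ET

def serialize(ast):
    """Recursively turn a nested-tuple AST into an XML element."""
    tag, *children = ast
    elem = ET.Element(tag)
    for child in children:
        if isinstance(child, tuple):
            elem.append(serialize(child))
        else:
            elem.text = child  # leaf: a symbol or variable name
    return elem

# AST for the rule  ancestor(X, Y) :- parent(X, Y).
# as either Prolog-IV or SICStus Prolog might parse it:
rule = ("Implies",
        ("Atom", ("Rel", "ancestor"), ("Var", "X"), ("Var", "Y")),
        ("Atom", ("Rel", "parent"),   ("Var", "X"), ("Var", "Y")))

xml_text = ET.tostring(serialize(rule), encoding="unicode")
print(xml_text)
```

Either system could parse its own concrete syntax into such an AST, emit this XML, and ingest the other's output, which is all the interchange scenario requires.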
I basically like the main lines of Boley et al.'s approach for the following reasons.

1. It starts from a relatively modest and arguably consensual (at least within this RIF WG) target - that of a representation for rule conditions. It makes sense to start with this and proceed incrementally from there to a more complete collection of languages.
2. It uses a formal linguistic approach - which is the natural (and right) thing to do if we purport to describe families of formal languages.
3. It can lead to a natural language-classification scheme (such as the one proposed in the RIF-RAF - see http://www.w3.org/2005/rules/wg/wiki/Rulesystem_Arrangement_Framework). Such a classification scheme can be formalized using Formal Concept Analysis (see below).
4. It offers an incremental, layered process for extending whatever we can successfully represent.

Anyway, after reading the proposal draft, I took the proposed Rule Condition Language BNF and built a quick Java application using Jacc (Just Another Compiler Compiler), a tool that I implemented at SFU and ILOG (it is now ILOG's property). Jacc has the pleasing feature of allowing automatic XML serialization from the AST (besides generating a working parser and hyperlinked HTML documentation) based on a yacc-like grammar. This shows that RIF-abiding languages implementing their parsers in Jacc could automatically inherit the XML-serialized RIF representation. To give an idea, I produced a full parser for the baby Rule Condition Language (RCL) that simply generates its XML serialization. See the attached RCLDoc.zip file (unzip and open the file doc.html) for details. Comments are welcome. The exercise shows that one can come a long way for free with formal grammars ... :-) In conclusion, I think that an RCL-like proposal (and extensions) is a viable, concrete way to proceed for defining a RIF, modulo agreeing on a standard XML schema (à la RuleML, OWL, RDF ...).
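For readers without access to Jacc, the flavor of the exercise can be sketched in a few lines of Python: a recursive-descent parser for a toy condition language that emits XML directly as it parses. The grammar below is my own drastic simplification (cond ::= 'And' '(' cond* ')' | 'Or' '(' cond* ')' | atom), not the actual RCL BNF from the draft:

```python
# Toy parser for a condition language loosely in the spirit of RCL,
# emitting an XML serialization as it goes - a hand-rolled analogue of
# what a Jacc-generated parser does from its yacc-like grammar.
# Grammar (my own simplification, not the draft's BNF):
#   cond ::= 'And' '(' cond* ')' | 'Or' '(' cond* ')' | atom
#   atom ::= name '(' name (',' name)* ')'
import re

TOKEN = re.compile(r"\s*([A-Za-z_]\w*|[(),])")

def tokenize(s):
    pos, toks = 0, []
    while pos < len(s):
        m = TOKEN.match(s, pos)
        if not m:
            raise SyntaxError(f"bad input at position {pos}")
        toks.append(m.group(1))
        pos = m.end()
    return toks

def parse(toks):
    cond, rest = parse_cond(toks)
    assert not rest, "trailing tokens"
    return cond

def parse_cond(toks):
    head, rest = toks[0], toks[1:]
    assert rest and rest[0] == "(", "expected '('"
    if head in ("And", "Or"):
        rest = rest[1:]
        parts = []
        while rest[0] != ")":
            part, rest = parse_cond(rest)
            parts.append(part)
        return f"<{head}>{''.join(parts)}</{head}>", rest[1:]
    # atom: relation name applied to a flat list of argument names
    close = rest.index(")")
    args = [t for t in rest[1:close] if t != ","]
    xml = (f"<Atom><Rel>{head}</Rel>"
           + "".join(f"<Var>{a}</Var>" for a in args) + "</Atom>")
    return xml, rest[close + 1:]

result = parse(tokenize("And(p(x,y) Or(q(x) r(y)))"))
print(result)
```

Even at this scale, the point of the exercise carries over: once the grammar is written, the XML serialization comes essentially for free.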
-hak

Appendix: Ganter/Wille's Formal Concept Analysis

I still marvel at the simplicity, elegance, and effectiveness of FCA's basic idea. This methodology ought to be more widely known and used for automatic ontology extraction. For instance, for the W3C Semantic Web, ontology representation languages (such as OWL, etc.) must all start from some form of well-defined ontology before being of any use. The RIF WG's objective is to define a Rule Interchange Format. What "rule", "interchange", and "format" mean is yet to be finalized for there to be a clear consensus, especially in such a large Working Group. One problem we face today is how to classify the collection of rule systems known and advocated by the WG members into an ontology of those systems along a set of features and attributes spanning several dimensions - semantic, syntactic, pragmatic, etc.! See for example the "Swiss Knife" example in [Bernhard Ganter and Rudolf Wille, "Conceptual Scaling", in Fred Roberts, Ed., Applications of Combinatorics and Graph Theory to the Biological and Social Sciences, pp. 139-167, Springer Verlag, 1989]. See also the PDF attachment for a simpler example. Thus, as the FCA Conceptual Scaling (CS) method advocates, we could follow a bottom-up approach to deriving a Rule System Ontology. Starting with a set of objects (the rule systems) obtained from all the WG members describing their own known systems, we could derive the set of relevant attributes per dimension (i.e., the union of all the attributes described for each system in this dimension). Then, thanks to CS, it would be a simple matter to form the boolean (or perhaps even similarity-measure) matrix (or hyper-matrix for higher dimensions) classifying Systems vs. Features, from which we may automatically obtain a faithful and conservative ontology of rule systems in which the lattice of classes is formed by union and intersection of their attributes.
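The bottom-up construction above can be sketched concretely. The following Python fragment derives all formal concepts (extent, intent pairs) from a tiny boolean context of rule systems vs. features; the context itself is a toy example of my own, not the WG's actual data:

```python
# Minimal FCA sketch: derive the formal concepts of a boolean context
# (rule systems vs. features) by closing attribute intersections.
# The context is an illustrative toy, not the WG's real inventory.
from itertools import combinations

context = {
    "Prolog-IV":     {"unification", "backtracking", "constraints"},
    "SICStus":       {"unification", "backtracking", "constraints"},
    "JRules":        {"production-rules", "rete"},
    "Blaze Advisor": {"production-rules", "rete"},
}
objects = list(context)
attributes = set().union(*context.values())

def extent(intent):
    """Objects possessing every attribute in the intent."""
    return frozenset(o for o in objects if intent <= context[o])

def intent(ext):
    """Attributes shared by every object in the extent."""
    if not ext:
        return frozenset(attributes)
    return frozenset.intersection(*(frozenset(context[o]) for o in ext))

# Enumerate concepts by closing every subset of objects
# (brute force, but perfectly fine at this scale).
concepts = set()
for r in range(len(objects) + 1):
    for combo in combinations(objects, r):
        i = intent(frozenset(combo))
        concepts.add((extent(i), i))

for ext, i in sorted(concepts, key=lambda c: len(c[0])):
    print(sorted(ext), "->", sorted(i))
```

On this toy context the construction yields four concepts - the two natural classes (the Prolog pair and the production-rule pair) plus the top and bottom of the lattice - which is exactly the "faithful and conservative" classification the text describes, obtained mechanically from the Systems-vs-Features matrix.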
Note that the RIF WG preferred a top-down approach, in which the set of attributes is potentially quite large and irrelevant to most actual systems. Such an ontology has a harder time emerging, since it requires a global view of all systems for anyone to decide a priori what the relevant dimensions and attributes are. Further, this often leads to confusing attribute dimensions (e.g., semantic vs. syntactic vs. pragmatic) or to irrelevant attributes. Be that as it may, there are systems offering a friendly interactive graphical environment for users to develop and visualize multi-dimensional ontologies. I have not used any, but I, for one, would be most interested in doing so (see, e.g., the Toscana system, based on the Ganter/Wille method, which uses Formal Concept Analysis for building and visualizing concept lattices from attributed objects: http://gdea.informatik.uni-koeln.de/archive/00000166/). I suggest that we take our inspiration from the Ganter/Wille methodology, starting from the RIF-RAF classification scheme, to derive an adequate RIF representation along the lines of what is proposed in http://lists.w3.org/Archives/Public/public-rif-wg/2006Apr/0068.html.

-- Hassan Aït-Kaci ILOG, Inc. - Product Division R&D tel/fax: +1 (604) 930-5603 - email: hak @ ilog . com
Attachments
- application/x-zip-compressed attachment: RCLDoc.zip
- application/pdf attachment: LogicalScaling.pdf
Received on Thursday, 4 May 2006 10:40:05 UTC