Re: RIF and QL from Ed Barkmeyer on 2006-01-25 (public-rif-wg@w3.org from January 2006)

From: Ed Barkmeyer <edbark@nist.gov>
Date: Wed, 25 Jan 2006 15:11:00 -0500
To: public-rif-wg@w3.org
Message-ID: <43D7DB54.1090906@nist.gov>

François Bry wrote:

> I would like to submit the following views concerning the declarative
> fragment of RIF (ie the following is not about updates, actions,
> (re)active rules, etc. that RIF might have to offer):
> 
> 1. RIF should be a "lingua franca" (literally "French speech", meaning a
> language used as common or commercial tongue among peoples of diverse
> speech).

Agreed.
[Aside: For the record, "lingua franca" was originally a "pidgin" commercial 
language of the Mediterranean basin in the 13th-17th centuries.  It meant 
"language of the Franks", where "Franks" was (in 1300) a general term for 
Western Europeans  (most of whom were, or were ruled by, descendants of the 
German "Frankish" tribes that overran France, Spain and Italy in the 5th-7th 
centuries).  I understand that it was simplified medieval Italian, with words 
from German, Provençal, Catalan and Arabic.  Regrettably, François, not 
French, or at least not the langue d'oil.]

> 2. The RIF lingua franca should abstract out "the diverse speech" by
> dis-considering (a) idiosyncrasies, (b) computational choices, as well
> as (c) some semantic choices of query languages and rule
> languages/engines. These three points are motivated below.

I agree in principle, but I think the details of this are debatable.

> First, let us draw a line between "rule format" and "rule language", both
> being formal languages (ie something else than human/natural
> languages):
> 
> - A rule format has a (unambiguous) syntax and a declarative semantics
> (preferably in the form of a model theory) but does not necessarily
> have a computational semantics.
> 
> - A rule language also has a (unambiguous) syntax and a declarative 
> semantics but also a computational semantics (and preferably a
> prototype implementing this computational semantics).

1.  I agree that it is important to distinguish these ideas by distinct terms.
I have a different understanding of "rule format", but I am willing to use
François' terms if everyone agrees.  The important thing is that we distinguish
between "declarative" and "computational" models.

2.  I think there is a 3rd distinct idea that we also need a term for.  It is 
the one I would have called "rule format":
  - A "rule xxx" is an unambiguous syntax for the expression of rules.
That is, this is just the syntax, to which one or more model theories must be 
applied to get a declarative "rule format" (to use François' term).  If 
François has taken "format", then we need another word for "xxx".

3.  I don't think it is "preferable" to have a prototype as the definition of 
the computational semantics, even though that has been the traditional 
experience, e.g. with Prolog and Jess.  It is possible to provide a formal 
definition of "computational semantics" at the level needed to describe 
"procedural", "stratification" and "meta-rule" behaviors, namely the explicit 
computational behavior of each such element.  At the same time, the "effective 
semantics" of a set of these computational elements applied to a given ruleset 
is not specified, and may in some cases be determinable only by 
simulation/implementation.

> Second, let us recall the difference between declarative and
> computational semantics. The difference is that the former might
> ignore computational aspects such as an evaluation order for
> conjunctions and disjunctions, ensuring termination through memoing,
> and optimizations.

Agreed.

> Third, let us discuss the three points (a), (b), and (c):
> 
> (a) Idiosyncrasies (like writing "\+" instead of "not" in Prolog) should
> obviously be dis-considered for otherwise we shall end up with a
> dictionary of notations. Such a dictionary might be useful but should
> not be an objective of the RIF WG.

Yes.  The whole point of the RIF is to standardize one XML representation of 
(FOL) "not".  Note that we may need more than one "not" operator if a single 
ruleset can contain monotonic "not" and non-monotonic (negation-as-failure) "not".

> (b) Computational choices must be dis-considered for otherwise we shall
> end up with defining a rule language, not only a rule format, and
> furthermore a rule language extending several other rule languages. It
> is doubtful whether this is at all possible. It is undoubtedly
> impossible within the time scale the RIF WG has.

The real issue here is: How much and what kind of interoperability do we want 
to guarantee?  Some computational elements may be needed to define the 
semantics of a ruleset in some model theories, namely those that attribute a 
notion of "time" (distinct points in time) to evaluation of rules.

> (c) More subtle is that some (not all!) semantic choices of query
> languages and rule languages/engines should be dis-considered. Let us
> consider three examples.
> 
> First, consider a rule with fuzzy truth values like:
> 
> R1 = if a:0.3 and b:0.2 then c:0.25
> 
> Obviously, how the fuzzy truth values are combined is an important
> part of the rule's semantics. As it is well-known, how fuzzy truth
> values are combined is one of the essential aspects that distinguish
> fuzzy rule languages. Therefore, it is desirable that RIF gives a way
> to express that rule R1 has been specified after fuzzy truth
> combination C.
> 
> However, rule R1 can also be used after the following (weaker) meaning:
> 
> R1' = if a is possible and b is possible then c is possible.
> 
> Arguably, RIF is needed precisely for conveying R1 making it possible for
> the recipient to interpret it as R1'.

To put it bluntly, I don't think the RIFWG should get within the blast radius 
of fuzzy logics.

> Second, consider the following rules with non-monotonic negation:
> 
> R2 = if not a then b
> R3 = if not b then a
> 
> The ruleset {R2, R3} might have been specified in a context where it is
> used under the well-founded semantics. Under the well-founded semantics,
> in the absence of other a- and b-headed rules, {R2, R3} means that a and
> b are unknown.
> 
> However, the ruleset {R2, R3} can also be used after the stable model
> semantics where it means a or b is true.

This is somewhat inaccurate, but the point is well-taken: The meaning of the
ruleset is highly dependent on what François calls "computational semantics".
I would have said, however, that the distinction between stable-model
semantics and "well-founded" semantics ("stratified dependency"?) is so
significant as to be reflected in the model theory.
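
To make the two readings of {R2, R3} concrete, here is a minimal sketch in
Python.  It is an illustration only, not anything proposed for RIF: the rule
encoding and the function names are assumptions made up for this example.  It
enumerates stable models by brute force over the Gelfond-Lifschitz reduct and
computes the well-founded model by the alternating fixpoint.

```python
from itertools import chain, combinations

# Hypothetical encoding: a rule is (head, positive_body, negative_body).
# R2 = if not a then b ; R3 = if not b then a
RULES = [
    ("b", frozenset(), frozenset({"a"})),
    ("a", frozenset(), frozenset({"b"})),
]
ATOMS = {"a", "b"}


def least_model(definite_rules):
    """Least fixpoint of a negation-free program given as (head, positive_body)."""
    model = set()
    changed = True
    while changed:
        changed = False
        for head, pos in definite_rules:
            if pos <= model and head not in model:
                model.add(head)
                changed = True
    return model


def gamma(rules, interp):
    """Least model of the Gelfond-Lifschitz reduct of `rules` w.r.t. `interp`."""
    # Drop every rule whose negative body is contradicted by interp,
    # then drop the negative bodies of the remaining rules.
    reduct = [(head, pos) for head, pos, neg in rules if not (neg & interp)]
    return least_model(reduct)


def stable_models(rules, atoms):
    """All sets of atoms M with gamma(rules, M) == M (brute force)."""
    ordered = sorted(atoms)
    subsets = chain.from_iterable(
        combinations(ordered, n) for n in range(len(ordered) + 1)
    )
    return [set(s) for s in subsets if gamma(rules, set(s)) == set(s)]


def well_founded(rules, atoms):
    """Alternating fixpoint; returns (true, false, unknown) sets of atoms."""
    true = set()
    while True:
        possible = gamma(rules, true)      # overestimate of what may hold
        new_true = gamma(rules, possible)  # underestimate of what must hold
        if new_true == true:
            break
        true = new_true
    return true, atoms - possible, possible - true


if __name__ == "__main__":
    print("stable models:", stable_models(RULES, ATOMS))
    print("well-founded (true, false, unknown):", well_founded(RULES, ATOMS))
```

Under these assumptions the ruleset has two stable models, one containing
only a and one containing only b, while the well-founded model leaves both a
and b unknown.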

> Third, consider a rule stating that a computer science student not 
> attending the lecture on RIF must attend the lecture on Prolog.
> 
> This rule can be used for making a list of those students who should 
> attend the Prolog lecture. In this case, the rule negation is 
> non-monotonic negation.
> 
> This rule can also be used  in determining consequences of the 
> regulation without considering students and student registrations to 
> lectures.  In this case, the rule negation is monotonic negation.

We...ell, the rule is given in somewhat ambiguous natural language.
What exactly is meant by "must attend"?  One possibility is:
   (ALL ?s)(IMPLIES (NOT (attends-RIF ?s)) (attends-Prolog ?s) )
Another is:
   IF (NOT (attends-RIF ?s)) THEN (Attend-Prolog ?s);
where attends-RIF is a query, and Attend-Prolog is an action.
And still another is:
   (IMPLIES (NOT (attends-RIF ?s)) (Obligation (attends-Prolog ?s)))
And the interpretation of the deontic "Obligation" modal may be:
   (IMPLIES (NOT (attends-Prolog ?s)) (InvalidState ?s))
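
As a rough illustration of the second reading (query in the condition, action
in the consequent), here is a minimal Python sketch.  The student names, the
registration table and the function names are all hypothetical.  The negation
is evaluated "by failure" against a closed registration table, which is what
makes it non-monotonic: adding a registration can remove a conclusion.

```python
def attends_rif(student, rif_registrations):
    """Query: is the student registered for the RIF lecture?"""
    return student in rif_registrations


def apply_rule(students, rif_registrations):
    """IF (NOT (attends-RIF ?s)) THEN (Attend-Prolog ?s), as query plus action."""
    prolog_list = []
    for student in students:
        # Negation as failure: absence from the table counts as "not attending".
        if not attends_rif(student, rif_registrations):
            prolog_list.append(student)  # the "Attend-Prolog" action
    return prolog_list


if __name__ == "__main__":
    students = ["alice", "bob", "carol"]
    print(apply_rule(students, {"alice"}))
    # Learning one more fact withdraws a previous conclusion about bob.
    print(apply_rule(students, {"alice", "bob"}))
```

The first reading, by contrast, says nothing about any table: it only
constrains which models are admissible, so new facts never retract a
consequence.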

> These examples show that there might be cases where a Web actor needs the
> one semantics, another another semantics of non-monotonic negation.

Perhaps I am being thick-headed, but I cannot draw that conclusion from these 
examples.  All I see is that identical syntax can lead to radically different 
interpretation if the model theory underlying the sentence isn't somehow made 
explicit.

> This
> would be the case if the first actor, using the well-founded semantics,
> is interested in derivable facts (ie investigates only necessities), while
> the other actor is also interested in derivable disjunctions (eg for
> investigating possibilities).

IMO if the first actor intended a particular interpretation, and the second 
actor presumes another interpretation, the second interpretation can only be 
consistent by dumb luck.  In general, if the second actor has no idea what 
model theory underlies the ruleset given by the first actor, the second actor 
may impose his own model theory on it, but any inferences he makes are likely 
to be inconsistent with the original.  And the ambiguous third example 
supports this.

Conversely, if the second actor knows what model theory underlies the ruleset 
given by the first actor, the second actor may know how to interpret that 
model theory (or some subset of it), and thus the ruleset, to give meaningful 
results in the second actor's preferred model theory.

So I don't deny the value of having the possibility of different model 
theories.  I just want the ruleset to be explicit about the model theory that 
would give it the intended interpretation.

> Of course, RIF cannot be free of declarative semantics. A RIF "and"
> must mean a logical "and", a RIF fuzzy truth value must have a meaning
> and a RIF "non-monotonic not" must mean a "non-monotonic not". But
> these meanings must leave ways open to re-interpretations --
> admittedly within a yet to define reasonable scope.

(It may be that François and I agree on this.  I just don't understand his way 
of presenting this concept.)

> 3. The RIF lingua franca should offer a way to procedurally attach
> SPARQL queries, and queries expressed in other query languages, in rule
> bodies.

s/bodies/antecedents/ and I will agree.  (François later says "bodies" = 
"condition parts".)

I understand "procedurally attach" here to be the common term in rule engine 
land for what Harold called "external functions".  What François describes 
below is about "external functions" in general, but there is a special problem 
with the syntax of query languages vs the syntactic elements of the RIF.

> RIF could support such procedural attachments under the following
> assumptions:
> 
> - Expressions in a query language may appear in rule bodies
> (= condition parts) with an interface stating sets of bindings they
> provide for logical variables (occurring elsewhere in the rule).

I need to understand what François has in mind by "expression in a query 
language".  I can understand how something like:
  (SPARQL 'RDFbase '<query text> <rule-expression> ... <rule-expression>)
could work as an "external query", where:
- SPARQL designates a particular query service (or a class of query service 
for which the rule engine is to find a server in conjunction with the 
"RDFbase" parameter)
- RDFbase designates the KB on which the service is to operate for this query
- <query text> is a query stated in the language of the designated service 
that may contain references to "external parameters" using the "external 
parameter syntax" *for that query language*
- <rule-expression> is an operand expression in the rule language (RIF) 
syntax.  The expression is evaluated by the rule engine before invocation of 
the external query service, and the result of the evaluation is passed in the 
position in which the expression occurs, i.e. an "actual parameter".

If the query language provides for references to external parameters 
"by-position", the actual parameter substitution rule is obvious.  If the 
references are "by-name", then the sequence of rule-expression parameters 
becomes a sequence of '<name> <rule-expression>, but the rule engine need not 
know that.
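
The shape of such an "external query" can be sketched as follows.  This is a
minimal Python illustration under stated assumptions, not a proposed RIF
construct: the class name, its fields and the use of Python callables as
stand-ins for rule expressions are all invented for the example, and the
query text is left as an opaque placeholder.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

Bindings = Dict[str, object]              # rule variable -> current value
RuleExpr = Callable[[Bindings], object]   # stand-in for a RIF operand expression


@dataclass
class ExternalQuery:
    service: str      # designates the query service, e.g. "SPARQL"
    kb: str           # designates the KB the service operates on, e.g. "RDFbase"
    query_text: str   # opaque to the rule engine; in the service's own language
    # Each parameter is (name, expression); name is None for by-position.
    params: List[Tuple[Optional[str], RuleExpr]] = field(default_factory=list)

    def prepare(self, bindings: Bindings):
        """Evaluate every rule expression before the service is invoked.

        Returns (positional, named) actual parameters, in source order.
        """
        positional: List[object] = []
        named: Dict[str, object] = {}
        for name, expr in self.params:
            value = expr(bindings)
            if name is None:
                positional.append(value)
            else:
                named[name] = value
        return positional, named


if __name__ == "__main__":
    query = ExternalQuery(
        service="SPARQL",
        kb="RDFbase",
        query_text="<query text>",
        params=[
            (None, lambda b: b["s"]),          # by-position parameter
            ("limit", lambda b: b["n"] * 2),   # by-name parameter
        ],
    )
    print(query.prepare({"s": "student-17", "n": 5}))
```

The engine only evaluates the expressions and hands over the values; it never
looks inside the query text.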

What François describes below is the assignment of responsibility for the RIF 
service invocation syntax, the query service (WSDL) definitions, and the 
<query text> syntax (I think).

> - It is the responsibility of the query language
> designers/implementers to provide such an interface.

That is, the SPARQLers (and their competitors) have to define a webservice 
interface for their query engines.

> - It is the responsibility of the RIF designer to specify the format
> of the interface.

That is, RIF defines the form of a query invocation *as it appears in a rule*.

What is not stated is that the RIF-compliant engine has to know how to map the 
RIF query invocation to the SPARQL webservice invocation, and to the SQL/CLI 
invocation, and to the KQML webservice invocation, etc.  Each of those 
mappings will probably be somewhat different.  So an engine cannot really be 
blind to the nature of the Query Service.  It can be blind to the syntax of 
the query language.  (The engine could, of course, define its own "standard 
mapping for unknown query services", and expect the user to provide an 
intercept routine that converts that invocation into the proper invocation for 
the user's favorite query service.)
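
One way to picture this is a table of per-service mappings with a
user-supplied intercept routine as the fallback.  The Python sketch below is
an assumption-laden illustration: the class, the method names and the
invoker signature are hypothetical, and the registered adapters are stubs
rather than real SPARQL or SQL/CLI bindings.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical adapter signature: (kb, query_text, actual_parameters) -> rows.
Invoker = Callable[[str, str, List[object]], List[dict]]


class QueryDispatcher:
    """Maps a RIF-style query invocation onto a service-specific invocation."""

    def __init__(self, fallback: Optional[Invoker] = None):
        self._adapters: Dict[str, Invoker] = {}
        # "Standard mapping for unknown query services": a user-supplied
        # intercept routine that performs the real invocation.
        self._fallback = fallback

    def register(self, service: str, invoker: Invoker) -> None:
        self._adapters[service] = invoker

    def invoke(self, service: str, kb: str, query_text: str,
               args: List[object]) -> List[dict]:
        adapter = self._adapters.get(service, self._fallback)
        if adapter is None:
            raise LookupError(f"no mapping known for query service {service!r}")
        # The dispatcher must know the service, but never parses query_text.
        return adapter(kb, query_text, args)


if __name__ == "__main__":
    dispatcher = QueryDispatcher(
        fallback=lambda kb, text, args: [{"via": "intercept", "kb": kb}]
    )
    dispatcher.register(
        "SPARQL", lambda kb, text, args: [{"via": "sparql-stub", "args": args}]
    )
    print(dispatcher.invoke("SPARQL", "RDFbase", "<query text>", ["x"]))
    print(dispatcher.invoke("KQML", "OtherKB", "<query text>", []))
```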

> - It is the responsibility of users of RIF, ie RIF programmers, to
> ensure that the procedurally attached queries make sense. Especially,
> RIF does not have to provide a declarative semantics
> considering/encompassing that of query languages.

Of course.  But there is also a serious constraint on "external queries":
What is the relationship between the knowledge base for the "external query" 
and the knowledge base for the rule engine?

If the "external query" interrogates an *external* KB, then the "external 
query function" has encapsulated (and therefore probably well-defined) 
behavior.  But if the "external query" operates on the same KB *concurrently 
with* the rule engine, and the "action"/"consequent" part of a rule can modify 
the KB, the issue of *timing* becomes a part of the interpretation model.  In 
database land, this leads to all kinds of meta-rules and "quarantining" and 
"transaction semantics" and other "things too fierce to mention".

IMO, RIF should define a query invocation syntax and assume that the 
referenced KB is entirely external, or at least not modified in any relevant 
way by the actions of the ruleset.  If this is not the case in some instance, 
in the words of Algol68, "the further elaboration of [ruleset inferencing] is 
not defined by this standard."
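
A small Python sketch of why timing enters the picture when queries and
actions share a KB.  The two rules and the `run` function are hypothetical,
invented for this illustration.  When conditions are evaluated against the
live KB, the outcome depends on rule order; when they are evaluated against a
snapshot (the KB treated as effectively external to the actions), it does not.

```python
def retract_p_assert_q(kb):
    kb.discard("p")
    kb.add("q")


def assert_r(kb):
    kb.add("r")


# Each rule is (condition, action); both conditions query for the fact "p".
RULE_1 = (lambda facts: "p" in facts, retract_p_assert_q)
RULE_2 = (lambda facts: "p" in facts, assert_r)


def run(rules, initial_kb, snapshot):
    """Fire each rule once, in order, and return the resulting KB.

    snapshot=True  : conditions see the KB as it was before any action ran.
    snapshot=False : conditions see the KB as modified by earlier actions.
    """
    kb = set(initial_kb)
    view = frozenset(kb) if snapshot else kb
    for condition, action in rules:
        if condition(view):
            action(kb)
    return kb


if __name__ == "__main__":
    for snapshot in (False, True):
        for rules in ([RULE_1, RULE_2], [RULE_2, RULE_1]):
            print(snapshot, sorted(run(rules, {"p"}, snapshot)))
```

With the live KB, firing the first rule before the second yields only q,
while the reverse order yields q and r; with the snapshot, both orders yield
q and r.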

-Ed

P.S.  This is an area in which we have to deal with ?'s Corollary to the Gödel 
Theorem:  "You can't allow everything you want without allowing things you 
don't want."

-- 
Edward J. Barkmeyer                        Email: edbark@nist.gov
National Institute of Standards & Technology
Manufacturing Systems Integration Division
100 Bureau Drive, Stop 8263                Tel: +1 301-975-3528
Gaithersburg, MD 20899-8263                FAX: +1 301-975-4482

"The opinions expressed above do not reflect consensus of NIST,
  and have not been reviewed by any Government authority."
Received on Wednesday, 25 January 2006 20:11:14 UTC