Constraints explained (was: RIF needs different reasoning methods) from Michael Kifer on 2006-03-13 (public-rif-wg@w3.org from March 2006)

From: Michael Kifer <kifer@cs.sunysb.edu>
Date: Mon, 13 Mar 2006 01:39:44 -0500
To: Sandro Hawke <sandro@w3.org>
Cc: public-rif-wg@w3.org
Message-ID: <12499.1142231984@kiferserv.kiferhome.com>
> My understanding is that derivation rules and integrity constraints are
> just a bifurcation of Horn rules (or sometimes the two halves of
> something beyond Horn).  As such they fit easily into the same
> semantics.
> 
>    - sandro

Depends what you call "semantics". They have the same first-order
semantics in the sense that if you give me a first-order interpretation
then the definition of what it means for each such rule to hold in that
interpretation is the same in both cases.

But Horn-rules-as-constraints are reasoned with differently from
Horn-rules-as-derivation-rules.

One important thing to understand is that when we deal with constraints (in
the database/LP sense) then we are NO LONGER dealing with first-order logic.
Here is why.

Let D be a set of rules (not necessarily Horn) and C be a set of constraints.
D determines the set of first-order interpretations, S, which are of
interest to the user (who constructed D).

In *pure* first-order,
      S = {M| M|=D}.
i.e., S is the set of all first-order interpretations that satisfy D.

Now, when constraints in C come into play, their role is to *test* if the
interpretations of interest (i.e., the members of S) satisfy every formula
in C.

If S = {M| M|=D} (as in FOL) then the only C's that will be satisfied in
every interpretation in S are precisely such that D |= C. This is not
interesting (for DB/LP people). In practice, the constraints that are used
in databases and LP are NOT implied by the set of derivation rules. For
instance, I may have something like

parent(X,Y) :- mother(X,Y).
parent(X,Y) :- father(X,Y).
mother(a,b).                           (*)
father(c,b).
married(c,a).

and my constraint could be

   parent(X,Y), parent(Z,Y) => married(X,Z).             (**)

This constraint is NOT entailed by the above rules and facts because
there are interpretations that satisfy (*), but not (**).  However, people
who are indoctrinated in the database and/or LP school of thought will say
that (*) satisfies the constraint (**).

Why? Because when people talk about constraints they have a non-first-order
logic in mind. For them, the set S of D's models of interest is not the set
of ALL models of D, but the set of INTENDED models only. For Horn, this set
consists of just the minimal model of D. For more general D's, this could
be the set of "perfect" models, "stable" models, "well-founded" models,
etc.

So, when you say that something is a constraint, you are talking about a
formula that is intended to be *checked* in a set of models determined by
some other set of formulas (the rules in D) **AND** this set of models is
NOT the set of all models of D.

So, in sum, the reasoning methods that are applied to D are quite different
from those applied to C.

I suppose that this is what Francois wanted to convey when he proposed
annotating rules with reasoning methods, but the discussion got muddled
because particular implementation strategies of those reasoning methods got
mixed in.

So, reasoning methods (e.g., derivation rules vs. constraints, query
answering vs. query containment, etc.) are kosher annotations and should be
standardized to the extent possible (and reasonable) for
RIFWG. Implementation strategies (e.g., top-down vs bottom-up) are
non-kosher and should NOT be standardized.

That said, it doesn't mean that RIF should not have a mechanism for
providing implementation hints (in an ad hoc way). Here is why this is
useful.

If RIF will support sufficiently expressive sets of rules, then important
reasoning problems become undecidable. We already agreed to have Horn with
function symbols, and this is enough to have query answering (the most
important form of reasoning with rules, IMO) undecidable. So, when I am
sending a bunch of rules, I may know that some particular clever
combination of top-down, bottom-up, and sideways information passing
strategies can achieve termination for certain queries (there is a whole
literature on that). But the engine on the other end of the pipe may not
know and may not be able to figure this out on its own. The undecidability
result means that no engine will be able to always evaluate every query
that a smart guy on the other end of the pipe has a strategy for. But the
engine might be able to do so, if the guy communicates the evaluation
strategy along with the query. (Note: the eval strategy won't change the
semantics of the query -- it will just enable the engine to compute the
answer vs. not being able to compute the answer.)

So, if two parties come up with a way to communicate proof strategies among
themselves -- all the power to them. But we don't know what is THE right
way to represent and communicate such strategies (there isn't just one way,
clearly) so we shouldn't try to standardize this. We should just ENABLE it.


	--michael
Received on Monday, 13 March 2006 06:39:55 UTC