Re: safety and external predicates from Axel Polleres on 2008-08-14 (public-rif-wg@w3.org from August 2008)

From: Axel Polleres <axel.polleres@deri.org>
Date: Thu, 14 Aug 2008 16:28:54 +0100
To: Chris Welty <cawelty@gmail.com>
CC: "Public-Rif-Wg (E-mail)" <public-rif-wg@w3.org>
Message-ID: <48A44F36.9030804@deri.org>
Chris Welty wrote:
> 
> I still am not convinced that safeness is anything more than an academic 
> requirement for CORE. 

Even if my background is academic, let me try to roll up my previous 
arguments made so far from what I'd consider an implementation-oriented 
viewpoint of CORE.

> I would like to hear from someone who is a) 
> interested in CORE and b) has some idea what an implementation is and c) 
> has some idea what users of CORE would need, to let us know if these 
> requirements matter:

I am basing my statements on the dlv [1] implementation, which is a 
deductive database system.  For the fragment it overlaps with BLD, dlv 
covers datalog, i.e. function-free Horn rules. Some simple  built-ins 
are allowed in native dlv.

Roughly, dlv requires safety of ruleheads and for built-ins (which 
correspnd to our external predicates), in the sense that each variable 
in a rulehead or in a built-in is required to also appear in a 
(non-external) body atom.

Safety is not strictly necessary for Datalog, since you could always 
replace it with adding a predicate HU(X) to each rule for each 
predicate, which ranges over all elements of the Herbrand Universe,
i.e. for implementing a fwd- or bwd- chaining solution that returns
all ground entailed answers, you just need to parse the program, and 
collect the Herbrand universe, ie. all constants appearing in the 
program in relation HU(), and you can rewrite each non-safe rule:

a(X) :- b(Y).

to

a(X) :- b(Y), HU(X), HU(Y).

and you'd still get the same ground entailments. no problem, safety not 
required for implementing this.

Note however, that this work-around would not have the expected behavior
if external atoms come into play which have an infinite extension, e.g.

  a(Z) :- External ( add(X,Y,Z) ).

What should be returned here? for

?- a(X).

(forgive me prolog notation, I could have used an exists condition in 
RIF of course.)

a) If you say: "All the possible results of additions"
then that's nasty, because that would lead to non-termination.
b) If you say: elements of HU that possibly appear as results of an 
addition, that might be feasible to implement, but I am neither sure 
whether this is what we want nor whether this is at all intuitive, since 
this view would, on the single line program P

  a(Z) :- External ( add(1,1,Z) ).

not return any answer.

The "intuitive" result would be that you get a(2), but 2 is not an 
element of HU.

Obviously the last example shows that safety is two things:
a) the HU "trick" doesn't work for External predicates
b) safety in the strict sense is too strict for External predicates.

Actually, probably most people would agree that P has a finite set of 
ground entailments, and that the answer a(2) should be found.

  If you know binding patterns however that guarantee that an external 
predicate has finitely many ground entailments if certain parameters of 
it are bound to fixed values, than you could allow unsafe use of 
variables, as long as finitely many ground entailments of programs are 
still guaranteed. Obviously the binding patterns for add(X,Y,Z)
would be:  bbb, bbf,bfb, fbb (where b means "bound" and f means "free")

So, since in the program P above add() is used with safe binding bbf, 
everything would be fine.

However, another complication comes from recursion, since recursive use 
doesn't guarantee finiteness, even if finite bindings are enforced.
here goes the example from the RIF IRC chat:

  a(Z) :- External ( add(X,1,Z) ), a(X).
  a(1) .

Again, that would mean an infinite extension of predicate a(),
i.e. the ground entailments can't be finitely computed.

So, my desired restriction (and this is basically what can be 
implemented (and is implemented in an extension of dlv), would be that
External predicates can be used non-recursively with safe bindings.

  This is the safety condition that was discussed in earlier mails
and which guarantees finiteness of the result while still allowing 
examples like P.

Summary: I'd opt for CORE being Datalog programs with

OPTION 1: strictly safe use of external predicates
(if we *can't* agree on the need for binding patterns for core)

OPTION 2: non-recursive safe bindings use of external predicates
(if we *can* agree on the need for binding patterns for core)


Both these subsets are implementable, and the latter seems to
be the upper limit of what is reasonable implementable fwd-chaining 
based rules engines, which IMO makes either of the two a reasonable CORE 
from an implementation perspective, i.e. something that most engines 
could reasonably implement.

At least [1] does not implement more (it implements obviously a lot of 
other features, outside BLD) than that and it likely will not, even if 
we define CORE to be more. If the rest of the group considers these 
considerations purely academic and claims are that all the typical 
non-acedemic systems are less restrictive, (not that I'd personally 
count on that) fair enough.

HTH,
Axel

1. http://www.dlvsystem.com

> 1) Decidability: is is important that RIF-Core have decidable reasoning? 
> That is, any compliant RIF-Core reasoner (implementation) will be 
> guaranteed to terminate on any rule-set?

I would draw the line tighter: I would require finiteness of ground 
entailments rather than only decidability (Decidability of WHAT exact 
problem are you after, btw?)



> 2) If decidability is a requirement, is tractability?  That is, any 
> implementation will terminate in worst-case polynomial time (or better?)
> 
> My general impression from talking to some potential RIF implementors is 
> that they treat rule-bases like programs - if your programs don't work 
> its your fault, go fix them.  However one important difference between 
> rule/logic "programs" and procedural programs is the amount of control 
> you have over the search strategy.  I think this is (a practical reason) 
> why decidability is considered by some to be important for these 
> languages and not for e.g. Java.
> 
> -Chris
> 
> Axel Polleres wrote:
>>
>> Two pointers here... the notion of strong safety in hex-programs [1,2] 
>> and Topor's considerations on  safe database queries with arithmetics 
>> [3] (cudos jos for the latter one)
>>
>>
>> 1. R. Schindlauer. Answer-Set Programming for the Semantic Web. PhD 
>> thesis, Vienna University of Technology, Dec. 2006.
>> http://www.kr.tuwien.ac.at/staff/roman/papers/thesis.pdf
>>
>> 2.  Thomas Eiter, Giovambattista Ianni, Roman Schindlauer, and Hans 
>> Tompits. Effective Integration of Declarative Rules with External 
>> Evaluations for Semantic Web Reasoning. In York Sure and John 
>> Domingue, editors, Proceedings of the 3rd European Conference on 
>> Semantic Web (ESWC 2006), Budva, Montenegro, number 4011 in Lecture 
>> Notes in Computer Science (LNCS), pages 273-287. Springer, June 2006.
>> http://www.springerlink.com/content/f0x23wx142141v44/
>>
>> 3. R. Topor. Safe database queries with arithmetic relations (1991)
>> Proc. 14th Australian Computer Science Conf 
>> http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.48.4845


-- 
Dr. Axel Polleres, Digital Enterprise Research Institute (DERI)
email: axel.polleres@deri.org  url: http://www.polleres.net/

Everything is possible:
rdfs:subClassOf rdfs:subPropertyOf rdfs:Resource.
rdfs:subClassOf rdfs:subPropertyOf rdfs:subPropertyOf.
rdf:type rdfs:subPropertyOf rdfs:subClassOf.
rdfs:subClassOf rdf:type owl:SymmetricProperty.
Received on Thursday, 14 August 2008 15:29:40 UTC