- From: Peter F. Patel-Schneider <pfps@research.bell-labs.com>
- Date: Mon, 17 Sep 2001 13:00:37 -0400
- To: sandro@w3.org
- Cc: www-rdf-rules@w3.org
From: Sandro Hawke <sandro@w3.org> Subject: Re: What is an RDF Query? Date: Mon, 17 Sep 2001 10:56:52 -0400 > [...] > I think we ought to seriously explore the possibility of using an RDF > graph not only as the assertion language, but also as the query, rule- > premise, and rule-conclusion languages. I'm sure it wont be perfect, > but this seems like an important idea to consider and a strawman > against which to compare alternatives. From here, I'd be interested > to hear what you (Peter) think is still needed. Yes, sure you can do this, and, in particular, you provide two ways of interpreting such rules built in this way. > Since I have trouble thinking in RDF graphs sometimes, I translate > them to a small subset of FOL for this kind of work. Specifically, I > use the subset with only constants, existential variables, 2-ary > predicates, and conjunction. (No universal variables, no negation, no > disjunction, no equality, and no logic functions.) Again, this sort of thing is possible, and provides good insights into the meaning of RDF(S). However RDFS is not captured by this subset of FOL, as RDFS requires that rdfs:subPropertyOf be transitive. To make this intuition precise, you have to capture all of whatever you are considering, as even very small changes in the formalism have large consequence. > Using this language for assertions is obvious. Well, fairly obvious, see above. > Using it for queries > nearly obvious: you ask for a proof of the formula. I think that you don't want proofs, instead you want something more akin to bindings. > Using it for a > rule-premise or rule-conclusion gets a little more tricky, because we > want to sort-of join the scopes of the existential variables in the > two parts. > I think there are two fairly natural ways to do this: one > seems to give us datalog rules and the other seems to give us Horn > rules. You have to be quite careful about what you are stating here, as there are several versions of Horn rules, including: [Binary] Horn 1: Quantifier-free formula in conjunctive normal form with each conjunction having at most one positive literal, [and all predicates binary] , with variables read universally. This is not what you want, as its assertional component is strictly more powerful than RDFS. [Binary] Horn 2: Conjunction of positive ground formula and Horn rules, [with all predicates binary]. A Horn rule is a quantifier-free conjunction of literals, with exactly one positive conjunct and at least one negative conjunct, with the variables being read universally. This is closer, as the assersional component is closer to RDF, but you seem to want to have existentials in your non-rules. [Binary] Horn 3: Conjunction of assertions and Horn rules, [with all predicates binary]. The assertions are positive literals, with free variables read existentially. The Horn rules are as in 2. This is even closer, but not exactly what you seem to want. > Style 1: > > Simply combine the scope of variables in the premise and conclusion. > So if we have two RDF graphs (in FOL notation [2]): > > (1) EXISTS x In(Spot, x) AND In(x, TheYard) AND Material(x, Wood) > (2) EXISTS x Isa(x, DogHouse) > > we can use these in a rule by turning the existential variables into > universal variables in a scope outside of both formulas: > > FORALL x > In(Spot, x) AND In(x, TheYard) AND Material(x, Wood) > => > Isa(x, DogHouse) > > I think this is as expressive as datalog. There is a question as > to what it means if a variable occurs only in the second clause. In > this style, I think we simply disallow that case. So rule conclusions are not arbitrary RDF, in other words, so you have lost your primary goal. > Style 2: > > We say that any variables in the second formula which are not also > in the first formula remain as local existential variables. > > (1) EXISTS g1,g3 Grandparent(g1,g3) > (2) EXISTS g1,g2,g3 Parent(g1,g2) AND Parent(g2,g3) > > would become a rule like this: > > FORALL g1,g3 > Grandparent(g1,g3) > => > EXISTS g2 Parent(g1,g2) AND Parent(g2,g3) > > (This rule basically says that if two things have a grandparent > relationship, there is a third thing and they have a transitive > parent relationship through that third thing.) > > This may well not be full [binary] Horn logic, but I believe it's more > expressive than Style 1, since I can't figure out how to express > this example in Style 1. [Note that although binary FOL can emulate FOL, the relationship between binary first-order Horn and full first-order Horn may be different. I can't remember the relationship off the top of my head.] I also believe Style 2 is strictly more powerful than Style 1. However, the situation is not obvious, at least for RDFS, as RDFS may not be Horn. > I also can't think of any [binary] Horn examples > that I can't express in Style 2. (While Style 2 doesn't have the > logic functions which normally separate datalog and Horn logic, the > nested existential corresponds to Skolem functions, and in my > playing around that seems to be enough. I don't have the skills > to approach this formally.) > (I have implemented Style 2, and I've talked about this before [1] but > that discussion went in a somewhat different direction.) > > -- sandro > > [1] http://lists.w3.org/Archives/Public/www-rdf-logic/2001Mar/0075.html > [2] I used an FOL ascii syntax found on this simple introduction: > http://www.sdsc.edu/~tbailey/teaching/cse151/lectures/chap07a.html The above is a reasonable approach to rules in RDFS, but there are lots of issues that have to be addressed. Let me make a stab at it, while still staying close to what you seem to want. 1/ An RDF assertional knowledge base is a collection of binary atomic formulae, with existential formulae read existentially. That is, an RDF knowledge base is E x1, ..., xn B1(t11,t12) ^ ... ^ Bm(tm1,tm2) for n>=0, m>0, where tij is one of the xi or some constant. [Note that this ignores lots of the stuff in RDF (including reification and containers).] 2/ An RDFS knowledge base is an RDF knowledge base plus the following formulae .... [ lots of stuff here, some Horn, some maybe not Horn, maybe even some not first-order ] ... 3/ An RDF rule is a formula of the form A x1, ..., xn E y1, ..., yo P1(t11,t12) ^ ... ^ Pm(tm1,tm2) -> Pm+1(tm+11,tm+12) ^ ... ^ Pl(tl1,tl2) for n>=0, m>0, l>m, where tij is one of the xi or one of the yi or some constant and each xi that appears as one of the tjk for j>m must also appear as at least one of the tjk for j<=m. 4/ An RDF(S) rule knowledge base is an RDF(S) assertional knowledge base plus a collection of RDF rules. OK, now we have the syntax. Note that neither rule premises nor conclustions are precisely the same as assertions. However, we do not yet have semantics. Fortunately we can just appeal to FOL semantics. A model of an RDF(S) rule knowledge base, KB, is precisely an FOL model of KB. Finally, what do we want to do with this? Well I don't really know, but there are several things that can be done. 1/ Is there a model of an RDF(S) rule knowledge base. Note that this is not a trivial question! (Suppose that the rule conclusion introduces a subclass loop.) 2/ Does an RDF(S) rule knowledge base entail (in the standard sense) an RDF(S) assertional knowledge base. 3/ Given an RDF(S) rule knowledge base, R, and a conjunction of binary atomic formulae Q = B1(t11,t12) ^ ... ^ Bm(tm1,tm2) where tij is either a variable or a constant, determine all the mappings M from the free variables of Q to constants in R or Q such that R entails Q/M (i.e., Q with its free variables mapped by M). Note that there are several other possibilities for this retrieval operation. Hopefully you see some of the complications in defining a rule system on top of RDF(S). If you make rules be first-order implications then some of the issues that could be ignored in RDF(S) can no longer be ignored. If you make rules some other sort of beast, then you can no longer use first-order logic as an intended meaning, and you have to make sure that your intuitions are still valid. Peter F. Patel-Schneider Bell Labs Research
Received on Monday, 17 September 2001 13:01:38 UTC