[Arch] RDF in RIF (II) from Jos de Bruijn on 2007-06-01 (public-rif-wg@w3.org from June 2007)

From: Jos de Bruijn <debruijn@inf.unibz.it>
Date: Fri, 01 Jun 2007 15:19:42 +0200
To: RIF <public-rif-wg@w3.org>
Message-ID: <46601CEE.7070303@inf.unibz.it>

Dear all,

In my earlier e-mail [1], and the following thread, we were discussing
the technical means of interoperating between RDF and RIF.  The proposed
means was an embedding of RDF graphs and (optionally) RDF(S) semantics
in RIF rules, skolemizing blank nodes.

In this e-mail I propose ways to use RIF to exchange rules about RDF
data. A text like this could eventually become part of the architecture
document [2]; it is highly related to the current section on data sets [3].

I distinguish between two usage patterns:
(a) using RDF graphs and RDFS ontologies as background knowledge for RIF
rules
(b) exchanging N3-like rules, which have generalized RDF triples (i.e.
RDF triples with variables) in the heads and bodies


== Using RDF(S) as background knowledge ==

RDF graphs and RDFS ontologies may be used as background knowledge
for RIF rules.  This means that the RIF rules apply to the data in RDF
graphs, as well as additional information which may be inferred from the
RDFS ontologies.  Furthermore, the RDFS ontologies apply to conclusions
drawn from RIF rules.

A RIF rule set refers to an RDF graph as a data set which has the RDF
data model. Furthermore, this data set has a particular entailment
regime associated with it (e.g. simple, RDF, RDFS).
[This entailment regime could be an extension point for other languages
such as OWL; it would be worthwhile to investigate the relationship with
the notion of "entailment regime" in SPARQL [4].]

The reference to the RDF graph would be part of the metadata of the RIF
rule set. The vocabulary for this metadata could be an extension of the
one proposed in [3].
For the issue of data set identification, we can use the named graph
facility which was described in SPARQL, and have some kind of default
behavior for unnamed graphs, again following SPARQL.
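
To make the idea of such metadata concrete, here is a minimal sketch of what
a data set reference might carry. Everything in it is an assumption made for
illustration: the class name DataSetReference, its field names, and the
choice of "simple" as the default regime are hypothetical and are not taken
from [3] or from SPARQL. Only the split into default and named graphs and the
three regime names come from the text above.

```python
from dataclasses import dataclass, field

# Entailment regimes named in this e-mail. Other languages (e.g. OWL)
# would extend this set; that extension mechanism is not modelled here.
KNOWN_REGIMES = {"simple", "rdf", "rdfs"}


@dataclass
class DataSetReference:
    """Hypothetical metadata record attached to a RIF rule set."""
    default_graphs: list = field(default_factory=list)  # URIs of unnamed graphs
    named_graphs: list = field(default_factory=list)    # URIs of named graphs
    entailment: str = "simple"                          # assumed default

    def __post_init__(self):
        if self.entailment not in KNOWN_REGIMES:
            raise ValueError(f"unknown entailment regime: {self.entailment!r}")

    def graphs_in_scope(self):
        """All graph URIs the rule set refers to, default graphs first."""
        return list(self.default_graphs) + list(self.named_graphs)
```

A rule set would then ship only such a record, for example
DataSetReference(named_graphs=["http://example.org/g1"], entailment="rdfs"),
rather than the graph itself.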

The semantics associated with this reference is that the RDF graph is
embedded as a set of RIF facts, and the mentioned entailment regime is
axiomatized as a set of RIF rules, as proposed in [1]; these facts and
rules are virtually part of the rule set.  However, they would not
actually be encoded as facts and rules; only references to the RDF
graph and the desired entailment regime are sent over the wire.
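
The "virtual" facts and rules described above can be illustrated with a small
sketch. It assumes triples are plain string tuples, that blank nodes are
written "_:label", and that skolem constants are named sk1, sk2, ...; all of
these conventions are assumptions for illustration. Only one RDFS rule (type
propagation along rdfs:subClassOf) is shown, standing in for the full
axiomatization proposed in [1].

```python
from itertools import count


def skolemize(graph, prefix="sk"):
    """Replace each blank node by a fresh constant (hypothetical naming)."""
    fresh = count(1)
    mapping = {}

    def term(t):
        if t.startswith("_:"):
            if t not in mapping:
                mapping[t] = f"{prefix}{next(fresh)}"
            return mapping[t]
        return t

    return [(term(s), term(p), term(o)) for s, p, o in graph]


def as_frame_facts(graph):
    """Render each triple (s, p, o) as a frame fact s[p->o]."""
    return [f"{s}[{p}->{o}]" for s, p, o in graph]


def rdfs_subclass_closure(facts):
    """Apply one RDFS rule to a fixpoint:
    x[rdf:type->d] :- x[rdf:type->c] and c[rdfs:subClassOf->d]."""
    closed = set(facts)
    while True:
        derived = {
            (x, "rdf:type", d)
            for (x, p1, c) in closed if p1 == "rdf:type"
            for (c2, p2, d) in closed if p2 == "rdfs:subClassOf" and c2 == c
        }
        if derived <= closed:
            return closed
        closed |= derived
```

For a graph containing (_:b, rdf:type, ex:Person) and
(ex:Person, rdfs:subClassOf, ex:Agent), the closure of the skolemized graph
also contains the type ex:Agent for the skolem constant that replaced _:b.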


== Exchanging N3-like rules ==

By an N3-like rule I mean a rule whose body consists of a conjunction
of generalized RDF triples and whose head is a generalized RDF triple.

We will need suitable metadata to identify the fact that the rule set
corresponds to such rules.  References to external RDF graphs can be
dealt with using the means proposed above.

Whereas round-tripping was not an issue in the previously described
scenario, it is an important issue here.  Specifically, it is important
to be able to distinguish between blank nodes, literals, variables, and
URIs in the rules which are exchanged using RIF.

An N3-like rule is of the form:

(s0,p0,o0) :- (s1,p1,o1) and ... and (sn,pn,on).

where each si, pi, oi is a blank node, literal, variable, or URI.

A generalized triple can be embedded in a way similar to the proposed
embedding in [1]:

tr((s,p,o)) = tr(s)[tr(p)->tr(o)]

Literals, variables, and URIs can be embedded as follows:

tr("literal") = "literal"^^rdfs:Literal
[for a discussion of typed literals see [1]]

tr(?variable) = ?variable^^rdfs:Resource

tr(URI) = "URI"^^rif:iri

where rdfs:Literal and rif:iri are sub-sorts of rdfs:Resource.
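
The translation tr for literals, variables, URIs, and triples can be sketched
as a few string-building functions. The Term class and the function names
tr_term, tr_triple, and tr_rule are hypothetical helpers introduced only for
this illustration; the output notation follows the formulas above, and blank
nodes are deliberately left out here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    kind: str   # one of "literal", "variable", "uri"
    value: str  # lexical form, without quotes or the leading "?"


def tr_term(t):
    """Embed a single term, annotating it with its sort."""
    if t.kind == "literal":
        return f'"{t.value}"^^rdfs:Literal'
    if t.kind == "variable":
        return f"?{t.value}^^rdfs:Resource"
    if t.kind == "uri":
        return f'"{t.value}"^^rif:iri'
    raise ValueError(f"unsupported term kind: {t.kind!r}")


def tr_triple(s, p, o):
    """tr((s,p,o)) = tr(s)[tr(p)->tr(o)]"""
    return f"{tr_term(s)}[{tr_term(p)}->{tr_term(o)}]"


def tr_rule(head, body):
    """Embed a rule: one head triple and a non-empty list of body triples."""
    conjunction = " and ".join(tr_triple(*t) for t in body)
    return f"{tr_triple(*head)} :- {conjunction}."
```

Because every embedded term carries its sort, a receiver can tell literals,
variables, and URIs apart again by looking at the part after "^^".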

For the embedding of blank nodes there are a number of issues:

- Blank nodes in the body of a rule obviously correspond to
existentially quantified variables in the body, so we can use the
following translation:
tr(bNode-in-rule-body) = ?variable^^rdfs:Resource
However, this might cause problems for round-tripping, where one might
want to distinguish blank nodes from variables. One possible solution is
to use a specific sort for that, e.g.
tr(bNode-in-rule-body) = ?variable^^rif:bnode

where rif:bnode is a sub-sort of rdfs:Resource.

- Using real blank nodes in the head of a rule poses problems, since
not every rule language allows existentially quantified variables in
the head of a rule.

However, I believe that most N3-like rules languages do not allow blank
nodes in the head, but rather have some notion of "rigid bnode".  Such a
rigid bnode can be encoded using a new constant of the sort
rdfs:Resource. However, in this case there are round-tripping problems
similar to those with real bnodes in the body of a rule.
Therefore, we might use a sort rif:rigid-bnode, which is a sub-sort of
rdfs:Resource, for these rigid bnodes:
tr(rigid-bnode-in-head) = rigid-bnode^^rif:rigid-bnode

We have to be very clear, of course, that we do not allow real bnodes in
the heads of rules, and we have to make a very clear syntactic
distinction between real bnodes and rigid bnodes.
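
The blank node handling discussed above can be sketched as follows. The
function names and the SORT_TO_KIND table are hypothetical; the sorts
rif:bnode and rif:rigid-bnode are the ones suggested in this e-mail, not
established RIF vocabulary. The sketch also assumes that, once the dedicated
sorts are used, rdfs:Resource is attached only to ordinary variables.

```python
# Maps the sort annotation after "^^" back to the kind of RDF term,
# which is what round-tripping needs.
SORT_TO_KIND = {
    "rdfs:Literal": "literal",
    "rif:iri": "URI",
    "rdfs:Resource": "variable",
    "rif:bnode": "blank node",
    "rif:rigid-bnode": "rigid blank node",
}


def tr_body_bnode(label):
    """A blank node in a rule body becomes a variable of sort rif:bnode."""
    return f"?{label}^^rif:bnode"


def tr_head_bnode(label, rigid):
    """Only rigid bnodes may occur in a head; they become constants."""
    if not rigid:
        raise ValueError("real blank nodes are not allowed in rule heads")
    return f"{label}^^rif:rigid-bnode"


def kind_of(encoded):
    """Recover the kind of the original term from its sort annotation."""
    _, sep, sort = encoded.rpartition("^^")
    if not sep or sort not in SORT_TO_KIND:
        raise ValueError(f"no known sort annotation in {encoded!r}")
    return SORT_TO_KIND[sort]
```

With the dedicated sorts, kind_of can tell "?b1^^rif:bnode" apart from
"?x^^rdfs:Resource", which would not be possible if both were embedded as
plain variables of sort rdfs:Resource.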


Best, Jos


[1] http://lists.w3.org/Archives/Public/public-rif-wg/2007May/0077.html

[2] http://www.w3.org/2005/rules/wg/wiki/Arch

[3] http://www.w3.org/2005/rules/wg/wiki/Arch/Data_Sets

[4] http://www.w3.org/TR/rdf-sparql-query/#sparqlBGPExtend
-- 
Please note my new email address:
                         debruijn@inf.unibz.it

Jos de Bruijn,        http://www.debruijn.net/
----------------------------------------------
In heaven all the interesting people are
missing.
  - Friedrich Nietzsche
Received on Friday, 1 June 2007 13:19:59 UTC