# Re: model theory for RDF/S

From: Peter F. Patel-Schneider <pfps@research.bell-labs.com>
Date: Thu, 27 Sep 2001 10:37:29 -0400

Message-Id: <20010927103729F.pfps@research.bell-labs.com>
```I thought I might give a stab at cleaning up some of the problems that I
see with Pat's model theory.  Here are the results for RDF (without
reification or containers).  Note that this should be considered a
derivative work from Pat's work, and not a completely separate model
theory.

I have several themes here.  First, I wanted to be very formal, so no
misconceptions can arise.  To this end, I have produced a very sparse
document.  (However, I have had to use some non-standard symbology for the
set-theoretic relationships.  Make the obvious fixes.)  Second, I wanted
to stick as close as possible to the ``Formal Model for RDF''.  This drove
some of the changes from Pat's model theory.  Third, I wanted to annotate
some of the issues and choices in the model theory.

Syntax:

URI is the collection of URI names, i.e., some collection of strings

[NB:  This ignores all aspects of the structure of URIs.]

L is the collection of literals, i.e., some collection of strings,
disjoint from URI

[NB:  This ignores non-string literals, but they wouldn't cause
any more problems.]

[NB: I'm sticking with graphs, even though they have some problems for
RDF.  Triples have their own problems with respect to RDF.]

A basic untidy RDF graph, R, is a four-tuple (that can be considered to be a
partially node labeled, totally edge labeled, directed graph)
< N, E, LN, LE >
where N is the set of nodes in the graph
LN :(partial) N -> URI u L gives labels for nodes
LE :(total) E -> URI gives labels for edges
E <= N' x N is the set of edges in the graph
where N' = { n : LN(n) in URI }

[NB: This accounts for literals not being allowed as labels of edges, nor
as the labels of nodes that are heads of edges, but does not
account for edge labels being properties.  It would be possible to
make the latter distinction here.]

[NB: I don't think that Pat's definition of RDF graphs has the
requirement that heads of edges cannot be labeled with literals.]

An basic untidy RDF graph is ground if LN is a total function on N.
An basic untidy RDF graph is tidy (also called a basic RDF graph) if LN is
injective (not necessarily total).

Literal Values:

LV is a non-empty set, currently the set of strings.
XL : L -> LV, giving meaning to literals

[NB: This ignores many aspects of the intended meaning of strings,
such as whether the mapping is surjective or injective.]

Interpretations:

Let V be a subset of URI.

[NB: Does V have to be non-empty, does it have to be finite?  So far
there are no such requirements, but we have to be careful that this
is OK, and that we don't make unnecessary requirements on V.]

An interpretation, I, on a vocabulary V, is a four-tuple < IR, IP, IEXT, IS>
where IR is a non-empty set, called resources
IP <= IR, called properties
IEXT : IP -> powerset ( IR x (IR u LV) )
IS : V -> IR

[NB: There is no requirement that IR and LV be disjoint nor is there any
requirement that they be subsets of one another.   We have to be
careful that this is OK, and that we don't accidentally disturb this
(non-)relationship.]

Interpretations of ground basic untidy RDF graphs:

Let R = < N, E, LN, LE > be a ground basic untidy RDF graph.

[NB: There appears to be no harm in allowing untidy graphs here, but this
could easily be restricted to tidy graphs.]

An interpretation I on the vocabulary V with
{ LN(n) | n in N } ^ URI <= V
{ LN(e) | e in E } <= V
is extended to R as follows

[NB: I am being a bit vague here in just what is the domain of I.]

[NB: It is explicit here that the vocabulary of the interpretation can
have ``names'' that do not appear in the graph.  Pat's theory is
vague on this point.]

1. if LN(n) is defined and in L then I(n) = XL(LN(n))

2. if LN(n) is defined and in V then I(n) = IS(LN(n))

[NB: These two give an interpretation for all nodes in R.]
[NB: The ``defined'' part is used later.]

3. if <f,g> is in E
then I(E) = true if IS(LE(<f,g>)) in IP and <I(f),I(g)> in IEXT(IS(LE(<f,g>))),
I(E) = false otherwise

[NB: This cleans up the meaning of edges whose labels are not mapped to
properties.  It might also be possible to make graphs with such
edges ill-formed in some way.]

4. I(R) = false if I(e) = false for some e in E,
I(R) = true otherwise

Interpretations of basic untidy RDF graphs:

[NB: I'm writing out extended interpretations completely for purposes of
being completely pedantic.

An extended interpretation on a vocabularly V is a two-tuple
<<IR,IP,IEXT,IS>,A'>
where <IR,IP,IEXT,IS> is an interpretation on V
and A' is a mapping to IR.

[NB: This means that unnamed nodes have to be resources, not literals
that are not resources.  This may have consequences!]

Let R = < N, E, LN, LE > be a basic untidy RDF graph and
let U = { n in N | LN(n) is not defined }

An extended interpretation I = <<IR,IP,IEXT,IS>,A'> on the vocabulary V with
{ LN(n) | n in N } ^ URI <= V
{ LN(e) | e in E } <= V
and the domain of A' = U
is extended to R as follows

0. if LN(n) is not defined I(n) = A'(n)

1. if LN(n) is defined and in L then I(n) = XL(LN(n))

2. if LN(n) is defined and in V then I(n) = IS(LN(n))

3. if <f,g> is in E
then I(E) = true if IS(LE(<f,g>)) in IP and <I(f),I(g)> in IEXT(IS(LE(<f,g>))),
I(E) = false otherwise

4. I(R) = false if I(e) = false for some e in E,
I(R) = true otherwise

Let R = < N, E, LN, LE > be a basic untidy RDF graph and
let U = { n in N | LN(n) is not defined }

An interpretation I = <IR,IP,IEXT,IS> on the vocabulary V with
{ LN(n) | n in N } ^ URI <= V
{ LN(e) | e in E } <= V
is extended to R as follows:

I(R) = true if there is some mapping A' from U to IR such that <I,A'>(R) = true,
I(R) = false otherwise

An interpretation I with vocabulary V is called a model of a
basic untidy RDF graph R = <N,E,LN,LE> iff
1. 	{ LN(n) | n in N } ^ URI <= V
{ LN(e) | e in E } <= V, and
2. I(R) = true

Taking care of rdf:type:

A core RDF interpretation, i.e., RDF without reification or containers, is
an interpretation over a vocabularly that includes rdf:type with the
following extra conditions

1. IS(rdf:type) is in IP
2. IEXT(IS(rdf:type)) <= IR x IR

Now for the claims:

All these sorts of claims have to be backed up with theorems and lemmas
like the ones that Pat has.  I would go even further than Pat has,
extending to RDF query languages, at least whenever such become
well-defined.

Claim 1:

A core RDF interpretation that is a model of a basic untidy RDF graph R
contains everything (and more) that is in the intended core RDF (i.e., RDF
without reification or containers) meaning of R.  In other words, a model
contains all the information implicitly (or explicity) in the graph, and
maybe more, and contradicts nothing that is implicitly (or explicitly) in
the graph.

Claim 2:

A core RDF interpretation that is not a model of a basic untidy RDF graph R
has something that is not compatible the intended core RDF meaning of R.
In other words, a non-model is missing or incorrect on something that is
implicitly (or explicitly) in the graph.

Claim 3:

For every basic untidy RDF graph R there is a core RDF interpretation that
captures exactly the closure of the intended core RDF meaning of R and that
is a model for R.  That is, roughly, that there is a model that makes
everything implicitly (or explicitly) in the graph true, and everything
else false.

[NB:  This claim is not true for more-powerful representation systems.]
```
Received on Thursday, 27 September 2001 10:35:53 UTC

This archive was generated by hypermail 2.3.1 : Wednesday, 2 March 2016 11:10:36 UTC