Re: model theory for RDF/S from Peter F. Patel-Schneider on 2001-09-27 (www-rdf-logic@w3.org from September 2001)

From: Peter F. Patel-Schneider <pfps@research.bell-labs.com>
Date: Thu, 27 Sep 2001 10:37:29 -0400
To: phayes@ai.uwf.edu
Cc: www-rdf-logic@w3.org
Message-Id: <20010927103729F.pfps@research.bell-labs.com>
I thought I might give a stab at cleaning up some of the problems that I
see with Pat's model theory.  Here are the results for RDF (without
reification or containers).  Note that this should be considered a
derivative work from Pat's work, and not a completely separate model
theory.

I have several themes here.  First, I wanted to be very formal, so no
misconceptions can arise.  To this end, I have produced a very sparse
document.  (However, I have had to use some non-standard symbology for the
set-theoretic relationships.  Make the obvious fixes.)  Second, I wanted
to stick as close as possible to the ``Formal Model for RDF''.  This drove
some of the changes from Pat's model theory.  Third, I wanted to annotate
some of the issues and choices in the model theory.


Syntax:

URI is the collection of URI names, i.e., some collection of strings

  [NB:  This ignores all aspects of the structure of URIs.]

L is the collection of literals, i.e., some collection of strings, 
disjoint from URI

  [NB:  This ignores non-string literals, but they wouldn't cause
	any more problems.] 

  [NB: I'm sticking with graphs, even though they have some problems for
       RDF.  Triples have their own problems with respect to RDF.]

A basic untidy RDF graph, R, is a four-tuple (that can be considered to be a
partially node labeled, totally edge labeled, directed graph) 
		< N, E, LN, LE >
	where N is the set of nodes in the graph
	      LN :(partial) N -> URI u L gives labels for nodes
	      LE :(total) E -> URI gives labels for edges
	      E <= N' x N is the set of edges in the graph
		where N' = { n : LN(n) in URI }

  [NB: This accounts for literals not being allowed as labels of edges, nor
	as the labels of nodes that are heads of edges, but does not
	account for edge labels being properties.  It would be possible to
	make the latter distinction here.]

  [NB: I don't think that Pat's definition of RDF graphs has the
       requirement that heads of edges cannot be labeled with literals.]

An basic untidy RDF graph is ground if LN is a total function on N.
An basic untidy RDF graph is tidy (also called a basic RDF graph) if LN is
injective (not necessarily total).


Literal Values:

LV is a non-empty set, currently the set of strings.  
XL : L -> LV, giving meaning to literals

  [NB: This ignores many aspects of the intended meaning of strings,
       such as whether the mapping is surjective or injective.]


Interpretations:

Let V be a subset of URI.
	
  [NB: Does V have to be non-empty, does it have to be finite?  So far
       there are no such requirements, but we have to be careful that this
       is OK, and that we don't make unnecessary requirements on V.]

An interpretation, I, on a vocabulary V, is a four-tuple < IR, IP, IEXT, IS>
	where IR is a non-empty set, called resources
	      IP <= IR, called properties
	      IEXT : IP -> powerset ( IR x (IR u LV) )
	      IS : V -> IR

  [NB: There is no requirement that IR and LV be disjoint nor is there any
       requirement that they be subsets of one another.   We have to be
       careful that this is OK, and that we don't accidentally disturb this
       (non-)relationship.]


Interpretations of ground basic untidy RDF graphs:

Let R = < N, E, LN, LE > be a ground basic untidy RDF graph.

  [NB: There appears to be no harm in allowing untidy graphs here, but this
       could easily be restricted to tidy graphs.]

An interpretation I on the vocabulary V with 
	{ LN(n) | n in N } ^ URI <= V
	{ LN(e) | e in E } <= V
is extended to R as follows

  [NB: I am being a bit vague here in just what is the domain of I.]

  [NB: It is explicit here that the vocabulary of the interpretation can
       have ``names'' that do not appear in the graph.  Pat's theory is
       vague on this point.] 

1. if LN(n) is defined and in L then I(n) = XL(LN(n))

2. if LN(n) is defined and in V then I(n) = IS(LN(n))

   [NB: These two give an interpretation for all nodes in R.]
   [NB: The ``defined'' part is used later.]

3. if <f,g> is in E
   then I(E) = true if IS(LE(<f,g>)) in IP and <I(f),I(g)> in IEXT(IS(LE(<f,g>))),
        I(E) = false otherwise

   [NB: This cleans up the meaning of edges whose labels are not mapped to
	properties.  It might also be possible to make graphs with such
	edges ill-formed in some way.]

4. I(R) = false if I(e) = false for some e in E,
   I(R) = true otherwise


Interpretations of basic untidy RDF graphs:

  [NB: I'm writing out extended interpretations completely for purposes of
       being completely pedantic.  

An extended interpretation on a vocabularly V is a two-tuple
	<<IR,IP,IEXT,IS>,A'>  
where <IR,IP,IEXT,IS> is an interpretation on V 
  and A' is a mapping to IR. 

  [NB: This means that unnamed nodes have to be resources, not literals
       that are not resources.  This may have consequences!]

Let R = < N, E, LN, LE > be a basic untidy RDF graph and 
let U = { n in N | LN(n) is not defined }

An extended interpretation I = <<IR,IP,IEXT,IS>,A'> on the vocabulary V with 
	{ LN(n) | n in N } ^ URI <= V
	{ LN(e) | e in E } <= V
	and the domain of A' = U
is extended to R as follows

0. if LN(n) is not defined I(n) = A'(n)

1. if LN(n) is defined and in L then I(n) = XL(LN(n))

2. if LN(n) is defined and in V then I(n) = IS(LN(n))

3. if <f,g> is in E
   then I(E) = true if IS(LE(<f,g>)) in IP and <I(f),I(g)> in IEXT(IS(LE(<f,g>))),
        I(E) = false otherwise

4. I(R) = false if I(e) = false for some e in E,
   I(R) = true otherwise


Let R = < N, E, LN, LE > be a basic untidy RDF graph and 
let U = { n in N | LN(n) is not defined }

An interpretation I = <IR,IP,IEXT,IS> on the vocabulary V with 
	{ LN(n) | n in N } ^ URI <= V
	{ LN(e) | e in E } <= V
is extended to R as follows:

  I(R) = true if there is some mapping A' from U to IR such that <I,A'>(R) = true,
  I(R) = false otherwise



An interpretation I with vocabulary V is called a model of a 
basic untidy RDF graph R = <N,E,LN,LE> iff  
1. 	{ LN(n) | n in N } ^ URI <= V
	{ LN(e) | e in E } <= V, and
2. I(R) = true



Taking care of rdf:type:

A core RDF interpretation, i.e., RDF without reification or containers, is
an interpretation over a vocabularly that includes rdf:type with the
following extra conditions 

     1. IS(rdf:type) is in IP
     2. IEXT(IS(rdf:type)) <= IR x IR


Now for the claims:

All these sorts of claims have to be backed up with theorems and lemmas
like the ones that Pat has.  I would go even further than Pat has,
extending to RDF query languages, at least whenever such become
well-defined.

Claim 1:

A core RDF interpretation that is a model of a basic untidy RDF graph R
contains everything (and more) that is in the intended core RDF (i.e., RDF
without reification or containers) meaning of R.  In other words, a model
contains all the information implicitly (or explicity) in the graph, and
maybe more, and contradicts nothing that is implicitly (or explicitly) in
the graph.

Claim 2:

A core RDF interpretation that is not a model of a basic untidy RDF graph R
has something that is not compatible the intended core RDF meaning of R.
In other words, a non-model is missing or incorrect on something that is
implicitly (or explicitly) in the graph.

Claim 3:

For every basic untidy RDF graph R there is a core RDF interpretation that
captures exactly the closure of the intended core RDF meaning of R and that
is a model for R.  That is, roughly, that there is a model that makes
everything implicitly (or explicitly) in the graph true, and everything
else false.

  [NB:  This claim is not true for more-powerful representation systems.]
Received on Thursday, 27 September 2001 10:35:53 UTC