Re: new model theory for DAML+OIL from Peter F. Patel-Schneider on 2001-10-11 (w3c-rdfcore-wg@w3.org from October 2001)

From: Peter F. Patel-Schneider <pfps@research.bell-labs.com>
Date: Wed, 10 Oct 2001 22:31:29 -0500
To: w3c-rdfcore-wg@w3.org
Message-Id: <p05101054b7eabfd381d4@[205.160.76.193]>
I am forwarding this to the WG for general interest and as a record 
of the current discussions. This follows his posting to RDF-comments 
at
http://lists.w3.org/Archives/Public/www-rdf-comments/2001OctDec/0057.html

<<I have interspersed some comments of my own.>>

This gets interesting in sections 6 and 7.  Peter's treatment of 
datatyping for literals is elegant and powerful, and I think will be 
a useful input to our discussions. However, see my recent 
cross-posting to the DAML JC and the RDFcoreWG on this thread for a 
slight revision that I think would fit somewhat easier with the 
current MT.

-Pat
-----------------------------------------------------------
I didn't account for blank nodes in the model theory in a way that is easy
to define a decent notion of entailment, so I put together a revision of
the model theory.   The development of interpretations and models is now
similar to yours.

I believe that entailment has

	G |= G'

for G' isomorphic to G, which would have been somewhat difficult to achieve
in the previous model theory.

peter


	Alternative Model Theory for RDF and RDF Schema plus Datatypes

Here is an alternative model theory for RDF and RDF Schema.   The attempt
here is to have a model theory for RDF and RDF Schema that can be extended
to datatypes, as I have proposed them.  This is a VERY terse document,
missing most of what would correspond with Pat Hayes's excellent commentary
on his model theory for RDF and RDF Schema.

Note that this model theory has not gone through any screening.  I have
tried to make it as error-free as possible, but there are undoubtably
errors remaining.  In particular, there may be (little) pieces missing
concerning some of the RDF and RDF Schema vocabulary.

Note also that this is an unofficial and draft model theory.

Peter F. Patel-Schneider



1/ RDF Graph Syntax

   I'm sticking with graphs, even though they have some problems for RDF.
   Triples have their own problems with respect to RDF.

URI is a collection of URI names.

   URI may be the collection of all URI names, but this is not required.
   This model theory ignores all aspects of the structure of URIs.

L is the collection of literals, disjoint from URI.

   Literals form the lexical space, in XML Schema datatype terms, not the
   value space!

An untidy RDF graph, R, is a three-tuple (that can be considered to be a
partially node labeled, directed triple-graph)
		< N, E, LN >
where N is the set of nodes in the graph
       LN :(partial) N -> URI u L gives labels for nodes
       E <= N' x N'' x N is the set of edges in the graph
	where N' = { n : LN(n) is undefined or LN(n) in URI }
	where N'' = { n : LN(n) is defined and LN(n) in URI }

<<I don't like having arcs be nodes. Peter agrees this is only a 
stylistic difference..>>

   This accounts for literals not being allowed as ``labels'' of edges, nor
   as the labels of nodes that are heads of edges, but does not account for
   edge ``labels'' being properties.

An untidy RDF graph is ground if LN is a total function on N.

A tidy RDF graph (also called an RDF graph) is an untidy RDF graph where LN
is injective on URI (but not necessarily total).

   Tidy graphs here do not have to be tidy on literals, which is change from
   Pat Hayes's model theory.

<<Peter is right, and I think that this is a better way to treat 
tidyness, ie to allow the same literal to occur on different nodes. 
It would be essential when there is data-typing, in any case.>>

   I have tried to keep as much of the terminology from Pat Hayes's model
   theory as possible.

The union of two untidy RDF graphs,
	R = < N , E , LN > and R' = < N', E', LN' >
is the untidy RDF graph
	R u R' = < N u N', E u E', LN u LN' >
provided that LN and LN' agree where they are both defined.

   Note that N and N' need not be disjoint.

<<true, and they should not be required to be..>>

The tidy version of an untidy RDF graph R = < N, E, LN >
is the graph t(R) = < N/eLN, E/eLN, LN/eLN >
where eLN is the equivalence relation induced by LN range restricted to URI
i.e., n = n'  iff  LN(n) and LN(n') both defined, LN(n) in URI,
		   and LN(n) = LN(n')
and / is point congruence.

	[Is this the right terminology?]

<<God alone knows>>

2/ Literal Values

LV is some collection of literal values.

   Literal values form the value space, in XML Schema datatype terms, not
   the lexical space!

XLS : L -> powerset ( LV ), maps literals into the set of literal values
			    that they might have.

   Here is the first substantive difference from Pat Hayes's model theory.
   The XLS mapping does not provide a definitive answer for the meaning of a
   literal.  The reason for not pinning down the mapping for literals is to
   allow different mappings for different datatypes.  For example, a node
   with literal label 05 might be mapped into the integer 5 or the string "05".

<<OK, I agree that we have to allow this. See my cross-posted message.>>

3/ Models and Entailment

An interpretation I is a four-tuple
	< IR, IP, IEXT, IS >
where IR is a non-empty set, called resources
       IP <= IR, called properties
       IEXT : IP -> powerset ( IR x (IR u LV) )
       IS : URI -> IR

An interpretation I = < IR, IP, IEXT, IS>
is a model for an untidy RDF graph R = < N, E, LN >
if there is a mapping
       M : N -> IR u LV
such that for n, s, p, o in N
     1. if LN(n) defined and LN(n) in URI then M(n) = IS(LN(n))
     2. if LN(n) defined and LN(n) in L then M(n) in XLS(LN(n))
     3. if <s,p,o> is in E then M(p) in IP and <M(s),M(o)> in IEXT(M(p))

   This works for both ground and non-ground untidy RDF graphs.

<<I prefer to have interpretations apply to vocabularies than to 
graphs, for technical reasons.>>

   One minor difference between this model theory and Pat Hayes's is that
   unnamed nodes can denote resources or literal values, unless they appear
   in the subject position of an edge.  This could easily be changed to
   require that unnamed nodes only denote resources.

<<not a real difference as LV can intersect IR>>

An untidy RDF graph R entails another untidy RDF graph R' (R |= R')  iff
R u R' is defined and every model for R is also a model for R u R'.

   This definition of entailment admits the possiblity that the two graphs
   share nodes.  Will this have interesting consequences?

<<Right, it should allow this. The wording in the current MT document 
is misleading in this respect.l.>>

4/ Core RDF

By core RDF I mean RDF without reification or containers.

A core RDF graph is an untidy RDF graph that contains nodes with the
following labels:

	rdf:type
	rdf:Property

and an edge <t,t,P> where t is a node with label rdf:type
		    and P is a node with label rdf:Property

   Throughout this development untidy graphs are allowed.  Theorems about
   entailment will probably have to use tidy graphs, however.
<<yup..>>

A core RDF model for a core RDF graph R is a model I for R with the
following extra conditions

	1. x in IP  iff  <x,IS(rdf:Property)> in IEXT(IS(rdf:type))
	2. IEXT(IS(rdf:type)) <= IR x IR

A core RDF graph R RDF-entails another core RDF graph R'  iff
R u R' is defined and
every core RDF model for R is also a core RDF model for R u R'.



5/ RDFS

A core RDFS graph is a core RDF graph that contains nodes with the
following labels:

   rdf:type			[redundant from RDF]
   rdf:Property			[redundant from RDF]
   rdfs:Resource
   rdfs:Class
   rdfs:subClassOf
   rdfs:subPropertyOf
   rdfs:seeAlso
   rdfs:isDefinedBy
   rdfs:ConstraintResource
   rdfs:ConstraintProperty
   rdfs:range
   rdfs:domain
   rdfs:label
   rdfs:comment
   rdfs:Literal

and the following edges (actually edges between nodes with the following
labels):

   <rdfs:Resource,      rdf:type, rdfs:Class>
   <rdf:Property,       rdf:type, rdfs:Class>
   <rdfs:Class,	       rdf:type, rdfs:Class>		[redundant]
   <rdfs:Literal,       rdf:type, rdfs:Class>

   <rdf:type,           rdf:type, rdf:Property>		[redundant from RDF]
   <rdfs:subClassOf,    rdf:type, rdf:Property>
   <rdfs:subPropertyOf, rdf:type, rdf:Property>
   <rdfs:seeAlso,       rdf:type, rdf:Property>
   <rdfs:isDefinedBy,   rdf:type, rdf:Property>		[redundant]

   <rdfs:range,         rdf:type, rdfs:ConstraintProperty>
   <rdfs:domain,        rdf:type, rdfs:ConstraintProperty>

   <rdfs:Class,              rdfs:subClassOf, rdfs:Resource>
   <rdfs:ConstraintResource, rdfs:subClassOf, rdfs:Resource>
   <rdfs:ConstraintProperty, rdfs:subClassOf, rdfs:Resource>	[redundant]
   <rdfs:ConstraintProperty, rdfs:subClassOf, rdfs:ConstraintResource>

   <rdfs:isDefinedBy,   rdfs:subPropertyOf,   rdfs:seeAlso>

   <rdf:type,           rdfs:range,  rdfs:Class>
   <rdfs:subClassOf,    rdfs:domain, rdfs:Class>
   <rdfs:subClassOf,    rdfs:range,  rdfs:Class>
   <rdfs:subPropertyOf, rdfs:domain, rdf:Property>
   <rdfs:subPropertyOf, rdfs:range,  rdf:Property>
   <rdfs:seeAlso,       rdfs:range,  rdfs:Resource>
   <rdfs:isDefinedBy,   rdfs:range,  rdfs:Resource>	[redundant]
   <rdfs:range,	       rdfs:domain, rdf:Property>
   <rdfs:range,	       rdfs:range,  rdfs:Class>
   <rdfs:domain,	       rdfs:domain, rdf:Property>
   <rdfs:domain,	       rdfs:range,  rdfs:Class>
   <rdfs:label,	       rdfs:domain, rdfs:Resource>	[redundant]
   <rdfs:label,	       rdfs:range,  rdfs:Literal>
   <rdfs:comment,       rdfs:domain, rdfs:Resource>	[redundant]
   <rdfs:comment,       rdfs:range,  rdfs:Literal>


A core RDFS model for a core RDFS graph R is a core RDF model I for R with
the following extra conditions:

   x in IR  iff  <x,IS(rdfs:Resource)> in IEXT(IS(rdf:type))
   x in IP  iff  <x,IS(rdf:Property)> in IEXT(IS(rdf:type)) [redundant from RDF]

   if <x,y> in IEXT(IS(rdf:type)) and <y,z> in IEXT(IS(rdfs:subClassOf))
     then <x,z> in IEXT(IS(rdf:type))			[2.3.2]

   if <x,y> in IEXT(IS(rdfs:subClassOf)) and <y,z> in IEXT(IS(rdfs:subClassOf))
     then <x,z> in IEXT(IS(rdfs:subClassOf))		[2.3.2]

   if <x,y> in IEXT(r) and <r,s> in IEXT(IS(rdfs:subPropertyOf))
     then <x,y> in IEXT(s)				[2.3.3]

   if <x,y> in IEXT(IS(rdfs:subPropertyOf))
   and <y,z> in IEXT(IS(rdfs:subPropertyOf))
     then <x,z> in IEXT(IS(rdfs:subPropertyOf)		[2.3.3?]

   x in IP and <x,IS(rdfs:ConstraintResource)> in IEXT(IS(rdf:type))
     iff  <x,IS(rdfs:ConstraintProperty)> in IEXT(IS(rdf:type))	[3.1.2]

   for y in IR, if <x,y> in IEXT(p) and <p,c> in IEXT(IS(rdfs:range))
     then <y,c> in IEXT(IS(rdf:type))			[3.1.3]
   if <x,y> in IEXT(p) and <p,IS(rdfs:Literal)> in IEXT(IS(rdfs:range))
     then y in LV

     Yes, this last is a special case for rdfs:Literal, but so what!

   if <x,y> in IEXT(p) and <p,c> in IEXT(IS(rdfs:domain))
     then <x,c> in IEXT(IS(rdf:type))			[3.1.4]


A core RDFS graph R RDFS-entails another core RDFS graph R'  iff
R u R' is defined and
every core RDFS model for R is also a core RDFS model for R u R'.



6/ Datatypes (general version)

Datatypes add extra structure to literals and literal values.

A datatype theory is a four-tuple <LV,DT,DTC,DTS>
where LV is a collection of literal values
       DT is a collection URIs that are also datatypes
       DTC : DT -> powerset ( LV )
       DTS : DT -> ( L -> LV ), with DTS(d) potentially partial
			       and DTS(d)(L) <= DTC(d) for all d

DTC maps a datatype to its extension (or value space).
DTS maps a datatype to a partial map from literals (or lexical space) to
literal values (or value space).

    Each datatype provides at most one literal value for each literal via
    the DTS mapping.

Given a datatype theory <LV,DT,DTC,DTS>
define XLS(l) = { lv in LV : for some d in DT with DT(d) defined on l
			     lv = DT(d)(l) }

Given a datatype theory <LV,DT,DTC,DTS>
a datatype RDFS model for a core RDFS graph R is a core RDFS model I for R,
with the following extra conditions:

   if <s,p,l> is in E with LN(l) in L
   and <M(p),c> in IEXT(IS(rdfs:range))
     then for any node nc with LN(nc) in DT and M(nc) = c
	M(l) = DTS(LN(nc))(l)

   for v in LV, if <x,v> in IEXT(p) and <p,c> in IEXT(IS(rdfs:range))
     then for any node nc with LN(nc) in DT and M(nc) = c
	v in DTC(LN(nc))


   These conditions are rather complicated for semantic conditions, so some
   explanation is in order.  The first condition says that literals (n) that
   are objects of statements must denote according to any datatype range for
   the predicate (M(p)) of the statement.  The second condition says that
   literals values (y) that are in relationships must belong to the value
   space (DTC(LN(c))) of any range of the relationship.


A core RDFS graph R datatype-entails another core RDFS graph R'  iff
R u R' is defined and
every datatype RDFS model for R is also a datatype RDFS model for R u R'.



7/ Datatypes (for XML Schema datatypes)

A XML Schema datatype theory is a datatype theory <LV,DT,DTC,DTS>
where LV contains the value spaces of the primitive XML Schema datatypes
       DT is the subset of URIs consisting of (built-in or all) XML Schema
	 datatypes, distinguished either by their special names (e.g.,
	 xsd:integer) or by ``following'' them and finding an XML Schema
	 datatype expression
       DTC maps each d in DT to its value space
       DTS maps each d in DT to its map from lexical space to value space

Given an XML Schema datatype theory T
an XML Schema datatype RDFS model for a core RDFS graph R is
a datatype RDFS model for R over T.


A core RDFS graph R xsd-entails another core RDFS graph R'  iff
R u R' is defined and
every XML Schema datatype RDFS model for R is also an XML Schema datatype
RDFS model for R u R'.
Received on Wednesday, 10 October 2001 23:31:47 UTC