Datatypes for RDF Schema from Peter F. Patel-Schneider on 2001-10-09 (www-rdf-interest@w3.org from October 2001)

From: Peter F. Patel-Schneider <pfps@research.bell-labs.com>
Date: Tue, 09 Oct 2001 18:01:58 -0400
To: www-rdf-interest@w3.org
Message-Id: <20011009180158C.pfps@research.bell-labs.com>
	Alternative Model Theory for RDF and RDF Schema plus Datatypes

Here is an alternative model theory for RDF and RDF Schema.   The attempt
here is to have a model theory for RDF and RDF Schema that can be extended
to datatypes, as I have proposed them.  This is a VERY terse document,
missing most of what would correspond with Pat Hayes's excellent commentary
on his model theory for RDF and RDF Schema.

Note that this model theory has not gone through any screening.  I have
tried to make it as error-free as possible, but there are undoubtedly
errors remaining.  In particular, there may be (little) pieces missing
concerning some of the RDF and RDF Schema vocabulary.

Note also that this is an unofficial and draft model theory.  

Peter F. Patel-Schneider



1/ RDF Graph Syntax

  I'm sticking with graphs, even though they have some problems for RDF.
  Triples have their own problems with respect to RDF.

URI is a collection of URI names.

  URI may be the collection of all URI names, but this is not required.
  This model theory ignores all aspects of the structure of URIs.

L is the collection of literals, disjoint from URI.

  Literals form the lexical space, in XML Schema datatype terms, not the
  value space!

An untidy RDF graph, R, is a three-tuple (that can be considered to be a
partially node labeled, directed triple-graph) 
		< N, E, LN >
where N is the set of nodes in the graph
      LN :(partial) N -> URI u L gives labels for nodes
      E <= N' x N'' x N is the set of edges in the graph
	where N' = { n : LN(n) is undefined or LN(n) in URI }
	where N'' = { n : LN(n) is defined and LN(n) in URI }

  This accounts for literals not being allowed as ``labels'' of edges, nor
  as the labels of nodes that are subjects of edges, but does not account
  for edge ``labels'' being properties.

An untidy RDF graph is ground if LN is a total function on N.

A tidy RDF graph (also called an RDF graph) is an untidy RDF graph where LN
is injective on URI (but not necessarily total).

  Tidy graphs here do not have to be tidy on literals, which is a change from
  Pat Hayes's model theory, but this change only has consequences below.

  I have tried to keep as much of the terminology from Pat Hayes's model
  theory as possible.  
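
  As an illustration only (not part of the model theory), the graph
  definitions above can be sketched in Python.  The names Graph,
  is_well_formed, is_ground and is_tidy are hypothetical, and labels are
  modelled as tagged tuples purely so that URI and L stay disjoint.

```python
from dataclasses import dataclass

# Assumption: a label is ("uri", name) or ("lit", text), keeping URI and L disjoint.
def uri(name):
    return ("uri", name)

def lit(text):
    return ("lit", text)

@dataclass
class Graph:
    nodes: set    # N
    edges: set    # E, a set of (s, p, o) triples of nodes
    labels: dict  # LN, a partial map from nodes to labels

def is_uri_node(g, n):
    return n in g.labels and g.labels[n][0] == "uri"

def is_well_formed(g):
    """Check E <= N' x N'' x N."""
    for s, p, o in g.edges:
        if not {s, p, o} <= g.nodes:
            return False
        if s in g.labels and not is_uri_node(g, s):  # s must be in N'
            return False
        if not is_uri_node(g, p):                    # p must be in N''
            return False
    return True

def is_ground(g):
    """LN is total on N."""
    return all(n in g.labels for n in g.nodes)

def is_tidy(g):
    """LN is injective on URI: no two nodes share a URI label."""
    uri_labels = [l for l in g.labels.values() if l[0] == "uri"]
    return len(uri_labels) == len(set(uri_labels))

# Node 4 is unlabelled, so this graph is well formed and tidy but not ground.
example = Graph(
    nodes={1, 2, 3, 4},
    edges={(1, 2, 3), (4, 2, 3)},
    labels={1: uri("ex:a"), 2: uri("ex:age"), 3: lit("05")},
)
```

  Two literal-labelled nodes may carry the same literal without breaking
  tidiness in this sketch, matching the remark above about literals.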


2/ Literal Values

LV is some collection of literal values.

  Literal values form the value space, in XML Schema datatype terms, not
  the lexical space!

XLS : L -> powerset ( LV ), maps literals into the set of literal values
			    that they might have. 

  Here is the first substantive difference from Pat Hayes's model theory.
  The XLS mapping does not provide a definitive answer for the meaning of a
  literal.  The reason for not pinning down the mapping for literals is to
  allow different mappings for different datatypes.  For example, a node
  with literal label 05 might be mapped into the integer 5 or the string "05".


3/ Models

Let R = < N, E, LN > be an untidy RDF graph
  
A model I for R is a four-tuple 
	< IR, IP, IEXT, IS>
where IR is a non-empty set, called resources
      IP <= IR, called properties
      IEXT : IP -> powerset ( IR x (IR u LV) )
      IS : N -> IR u LV
such that for n, n', s, p, o in N
    1. if LN(n) in URI then IS(n) in IR 
    2. if LN(n) in URI and LN(n) = LN(n') then IS(n) = IS(n') 
    3. if LN(n) in L then IS(n) in XLS(LN(n)) 
    4. if <s,p,o> is in E then IS(p) in IP and <IS(s),IS(o)> in IEXT(IS(p))

  This works for both ground and non-ground untidy RDF graphs.

  One minor difference between this model theory and Pat Hayes's is that
  unnamed nodes can denote resources or literal values, unless they appear
  in the subject position of an edge.  This could easily be changed to
  require that unnamed nodes only denote resources.

  One reason for moving to this method of defining models instead of Pat
  Hayes's is that his version does not work when the denotation of
  literal-named nodes is not fixed by XLS.

We exploit the uniqueness of URI mappings under IS and say that
   IS(LN(n)) = IS(n) for n in N with LN(n) in URI
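
  A minimal sketch of the four conditions as a checker over finite
  structures, for illustration only.  The function name is_model and the
  dictionary encodings are assumptions, and condition 4 is read here as a
  statement about the denotations IS(s) and IS(o).

```python
def is_model(nodes, edges, labels, xls, IR, IP, IEXT, IS):
    """Check conditions 1-4 for a candidate interpretation.

    labels: partial LN as a dict, node -> ("uri", name) or ("lit", text)
    xls:    function from literal text to its set of candidate values
    IEXT:   dict, property -> set of (subject, object) pairs
    IS:     dict, node -> denotation
    """
    if not IR or not IP <= IR:
        return False
    by_uri = {}
    for n in nodes:
        kind, text = labels.get(n, (None, None))
        if kind == "uri":
            if IS[n] not in IR:                          # condition 1
                return False
            if by_uri.setdefault(text, IS[n]) != IS[n]:  # condition 2
                return False
        elif kind == "lit":
            if IS[n] not in xls(text):                   # condition 3
                return False
    for s, p, o in edges:                                # condition 4
        if IS[p] not in IP:
            return False
        if (IS[s], IS[o]) not in IEXT.get(IS[p], set()):
            return False
    return True
```

  With XLS("05") = {5, "05"}, a node labelled 05 may denote either value,
  and the rest of the interpretation decides which choice gives a model.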


4/ Core RDF Models

By core RDF I mean RDF without reification or containers.

A core RDF graph is a tidy RDF graph that contains nodes with the following
labels:

	rdf:type
	rdf:Property

and an edge  <rdf:type, rdf:type, rdf:Property>  (using labels to identify
nodes)

  The reason to limit to tidy RDF graphs starting here is so that there
  are single nodes for rdf:type and rdf:Property, but this restriction is
  not absolutely necessary.

A core RDF model for a core RDF graph R is a model I for R with the
following extra conditions 

	1. x in IP  iff  <x,IS(rdf:Property)> in IEXT(IS(rdf:type))
	2. IEXT(IS(rdf:type)) <= IR x IR
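
  For illustration, the two extra conditions can be checked over a finite
  interpretation as follows.  The function name and the arguments rdf_type
  and rdf_Property (standing for IS(rdf:type) and IS(rdf:Property)) are
  hypothetical, and the interpretation is assumed to be a model already.

```python
def is_core_rdf_model(IR, IP, IEXT, rdf_type, rdf_Property):
    """Check the two core RDF conditions on a finite interpretation.

    IEXT is a dict from properties to sets of (subject, object) pairs.
    """
    type_ext = IEXT.get(rdf_type, set())
    # Condition 1: x in IP iff <x, IS(rdf:Property)> in IEXT(IS(rdf:type))
    declared = {x for (x, c) in type_ext if c == rdf_Property}
    if declared != set(IP):
        return False
    # Condition 2: IEXT(IS(rdf:type)) <= IR x IR
    return all(x in IR and c in IR for (x, c) in type_ext)
```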


5/ RDFS Models

A core RDFS graph is a core RDF graph that contains nodes with the
following labels:

  rdf:type			[redundant from RDF]
  rdf:Property			[redundant from RDF]
  rdfs:Resource
  rdfs:Class
  rdfs:subClassOf
  rdfs:subPropertyOf
  rdfs:seeAlso
  rdfs:isDefinedBy
  rdfs:ConstraintResource
  rdfs:ConstraintProperty
  rdfs:range
  rdfs:domain
  rdfs:label
  rdfs:comment
  rdfs:Literal

and the following edges (being a little bit lazy in using labels to
identify nodes)

  <rdfs:Resource,      rdf:type, rdfs:Class>
  <rdf:Property,       rdf:type, rdfs:Class>
  <rdfs:Class,	       rdf:type, rdfs:Class>		[redundant]
  <rdfs:Literal,       rdf:type, rdfs:Class>

  <rdf:type,           rdf:type, rdf:Property>		[redundant from RDF]
  <rdfs:subClassOf,    rdf:type, rdf:Property>
  <rdfs:subPropertyOf, rdf:type, rdf:Property>
  <rdfs:seeAlso,       rdf:type, rdf:Property>
  <rdfs:isDefinedBy,   rdf:type, rdf:Property>		[redundant]

  <rdfs:range,         rdf:type, rdfs:ConstraintProperty>
  <rdfs:domain,        rdf:type, rdfs:ConstraintProperty>

  <rdfs:Class,              rdfs:subClassOf, rdfs:Resource>
  <rdfs:ConstraintResource, rdfs:subClassOf, rdfs:Resource>
  <rdfs:ConstraintProperty, rdfs:subClassOf, rdfs:Resource>	[redundant]
  <rdfs:ConstraintProperty, rdfs:subClassOf, rdfs:ConstraintResource>

  <rdfs:isDefinedBy,   rdfs:subPropertyOf,   rdfs:seeAlso>

  <rdf:type,           rdfs:range,  rdfs:Class>
  <rdfs:subClassOf,    rdfs:domain, rdfs:Class>
  <rdfs:subClassOf,    rdfs:range,  rdfs:Class>
  <rdfs:subPropertyOf, rdfs:domain, rdf:Property>
  <rdfs:subPropertyOf, rdfs:range,  rdf:Property>
  <rdfs:seeAlso,       rdfs:range,  rdfs:Resource>
  <rdfs:isDefinedBy,   rdfs:range,  rdfs:Resource>	[redundant]
  <rdfs:range,	       rdfs:domain, rdf:Property>
  <rdfs:range,	       rdfs:range,  rdfs:Class>
  <rdfs:domain,	       rdfs:domain, rdf:Property>
  <rdfs:domain,	       rdfs:range,  rdfs:Class>
  <rdfs:label,	       rdfs:domain, rdfs:Resource>	[redundant]
  <rdfs:label,	       rdfs:range,  rdfs:Literal>
  <rdfs:comment,       rdfs:domain, rdfs:Resource>	[redundant]
  <rdfs:comment,       rdfs:range,  rdfs:Literal>


A core RDFS model for a core RDFS graph R is a core RDF model I for R with
the following extra conditions:

  x in IR  iff  <x,IS(rdfs:Resource)> in IEXT(IS(rdf:type))
  x in IP  iff  <x,IS(rdf:Property)> in IEXT(IS(rdf:type)) [redundant from RDF]

  if <x,y> in IEXT(IS(rdf:type)) and <y,z> in IEXT(IS(rdfs:subClassOf))
    then <x,z> in IEXT(IS(rdf:type))			[2.3.2]

  if <x,y> in IEXT(IS(rdfs:subClassOf)) and <y,z> in IEXT(IS(rdfs:subClassOf))
    then <x,z> in IEXT(IS(rdfs:subClassOf))		[2.3.2]

  if <x,y> in IEXT(r) and <r,s> in IEXT(IS(rdfs:subPropertyOf))
    then <x,y> in IEXT(s)				[2.3.3]

  if <x,y> in IEXT(IS(rdfs:subPropertyOf)) 
  and <y,z> in IEXT(IS(rdfs:subPropertyOf))
    then <x,z> in IEXT(IS(rdfs:subPropertyOf))		[2.3.3?]

  x in IP and <x,IS(rdfs:ConstraintResource)> in IEXT(IS(rdf:type))
    iff  <x,IS(rdfs:ConstraintProperty)> in IEXT(IS(rdf:type))	[3.1.2]

  for y in IR, if <x,y> in IEXT(p) and <p,c> in IEXT(IS(rdfs:range))
    then <y,c> in IEXT(IS(rdf:type))			[3.1.3]
  if <x,y> in IEXT(p) and <p,IS(rdfs:Literal)> in IEXT(IS(rdfs:range)) 
    then y in LV

    Yes, this last is a special case for rdfs:Literal, but so what!

  if <x,y> in IEXT(p) and <p,c> in IEXT(IS(rdfs:domain))
    then <x,c> in IEXT(IS(rdf:type))			[3.1.4]



6/ Datatypes (general version)

Datatypes add extra structure to literals and literal values. 

A datatype theory is a four-tuple <LV,DT,DTC,DTS>
where LV is a collection of literal values
      DT is a collection of URIs that are also datatypes
      DTC : DT -> powerset ( LV )
      DTS : DT -> ( L -> LV ), with DTS(d) potentially partial
			       and DTS(d)(L) <= DTC(d) for all d

DTC maps a datatype to its extension (or value space).
DTS maps a datatype to a partial map from literals (or lexical space) to
literal values (or value space).

   Each datatype provides at most one literal value for each literal via
   the DTS mapping.

Given a datatype theory <LV,DT,DTC,DTS>
define XLS(l) = { lv in LV : for some d in DT with DTS(d) defined on l
			     lv = DTS(d)(l) }
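
  As a sketch, XLS can be derived from the per-datatype lexical-to-value
  maps, reading the definition as ranging over the maps DTS(d).  The two
  datatypes and their Python parsers below are assumptions chosen for
  illustration, and None is used to mean "undefined on this literal".

```python
def parse_integer(text):
    """A partial lexical-to-value map: None where it is undefined."""
    try:
        return int(text)
    except ValueError:
        return None

# Hypothetical DTS for two datatypes; the dictionary keys play the role of DT.
DTS = {
    "xsd:integer": parse_integer,
    "xsd:string": lambda text: text,
}

def xls(literal, dts=DTS):
    """XLS(l): every value that some datatype in DT assigns to l."""
    values = set()
    for to_value in dts.values():
        value = to_value(literal)
        if value is not None:
            values.add(value)
    return values
```

  Here xls("05") is the set containing the integer 5 and the string "05",
  while xls("abc") contains only the string "abc".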

Given a datatype theory <LV,DT,DTC,DTS> 
a datatype RDFS model for a core RDFS graph R is a core RDFS model I for R,
with the following extra conditions:

  if <s,p,n> is in E with LN(n) in L
  and <IS(p),c> in IEXT(IS(rdfs:range))
    then for any node nc with LN(nc) in DT and IS(nc) = c 
	IS(n) = DTS(LN(nc))(LN(n))

  for y in LV, if <x,y> in IEXT(p) and <p,c> in IEXT(IS(rdfs:range))
    then for any node nc with LN(nc) in DT and IS(nc) = c 
	y in DTC(LN(nc))


  These conditions are rather complicated for semantic conditions, so some
  explanation is in order.  The first condition says that literals (n) that
  are objects of statements must denote according to any datatype range for
  the predicate (IS(p)) of the statement.  The second condition says that
  literal values (y) that are in relationships must belong to the value
  space (DTC(LN(nc))) of any range of the relationship.
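
  The two conditions can be sketched as a checker over finite structures.
  This is an illustration under stated assumptions: the first condition is
  read as IS(n) = DTS(d)(LN(n)) and the second as y in DTC(d), where
  d = LN(nc).  The function name, the in_LV predicate and the encodings of
  DTS and DTC are hypothetical.

```python
def satisfies_datatype_conditions(edges, labels, IS, IEXT, DTS, DTC, in_LV,
                                  RANGE="rdfs:range"):
    """Check both datatype conditions.

    labels: partial LN as a dict, node -> ("uri", name) or ("lit", text)
    DTS:    dict, datatype name -> function from literal text to a value
            (None where undefined); its keys play the role of DT
    DTC:    dict, datatype name -> predicate for value-space membership
    in_LV:  predicate telling literal values apart from resources
    RANGE:  the denotation IS(rdfs:range)
    """
    # Datatype names of each resource, found via URI-labelled nodes nc.
    datatypes_of = {}
    for n, (kind, text) in labels.items():
        if kind == "uri" and text in DTS:
            datatypes_of.setdefault(IS[n], set()).add(text)
    # Datatype ranges of each property.
    ranges = {}
    for p, c in IEXT.get(RANGE, set()):
        ranges.setdefault(p, set()).update(datatypes_of.get(c, set()))
    # First condition: a literal object denotes what each datatype range says.
    for s, p, o in edges:
        kind, text = labels.get(o, (None, None))
        if kind != "lit":
            continue
        for d in ranges.get(IS[p], set()):
            if IS[o] != DTS[d](text):
                return False
    # Second condition: related literal values lie in each range's value space.
    for p, pairs in IEXT.items():
        for d in ranges.get(p, set()):
            for x, y in pairs:
                if in_LV(y) and not DTC[d](y):
                    return False
    return True
```

  With an integer range on a property, a node labelled 05 in object
  position must denote the integer 5 rather than the string "05".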


7/ Datatypes (for XML Schema datatypes)

An XML Schema datatype theory is a datatype theory <LV,DT,DTC,DTS>
where LV contains the value spaces of the primitive XML Schema datatypes
      DT is the subset of URIs consisting of (built-in or all) XML Schema
	 datatypes, distinguished either by their special names (e.g.,
	 xsd:integer) or by ``following'' them and finding an XML Schema
	 datatype expression 
      DTC maps each d in DT to its value space
      DTS maps each d in DT to its map from lexical space to value space

Given an XML Schema datatype theory T
an XML Schema datatype RDFS model for a core RDFS graph R is
a datatype RDFS model for R over T.
Received on Tuesday, 9 October 2001 18:04:00 UTC