The Semantic Web Ontology Language (SWOL)

		Peter F. Patel-Schneider
		Bell Labs Research

		(18 December 2001)


1. Introduction

This is a concise definition of a revision to DAML+OIL, tentatively named
the Semantic Web Ontology Language (SWOL).

There are three basic changes from DAML+OIL to SWOL.
1/ The semantics of SWOL are consistent with the new model theory for RDF.
2/ The syntax of SWOL is expressed in terms of the XQuery 1.0 and XPath 2.0
   Data Model (henceforth Data Model).
3/ The treatment of datatypes has been simplified.

The basic idea behind the semantic changes is to stay with the RDF approach
of having the syntax of semantically-meaningful constructs like subclass
also show up as relationships in interpretations.  This cannot, however, be
extended to the entire language, as that would produce semantic paradoxes.
I have limited the number of constructs that show up as relationships in
interpretations to the point that I am reasonably certain that no semantic
paradoxes are present.

The basic idea behind the syntactic changes is to take advantage of XQuery
processing.


2. Datatypes

A datatyping scheme is a collection of datatypes, DT.
For each datatype d in DT there are four components:
	U(d), URI for the datatype;
	L(d), the lexical space for the datatype;
	V(d), the value space for the datatype; and
	LV(d) : L(d) -> V(d), the lexical-to-value mapping for the datatype.
Given a datatyping scheme, let L = union over d in DT of L(d), lexical values
			       V = union over d in DT of V(d), data values
			       LV = union over d in DT of LV(d)
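
The definition above can be pictured with a small sketch.  This is an
illustration only, not part of SWOL: the class name Datatype, the function
lv, and the URIs ex:integer and ex:string are assumptions made for the
example, and V(d) is left implicit as the image of LV(d).

```python
import re
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional, Set


@dataclass(frozen=True)
class Datatype:
    """One datatype d, reduced to the parts needed to compute LV."""
    uri: str                                  # U(d)
    in_lexical_space: Callable[[str], bool]   # membership test for L(d)
    lexical_to_value: Callable[[str], Any]    # LV(d) : L(d) -> V(d)


def lv(scheme: Dict[str, Datatype], text: str,
       type_uri: Optional[str] = None) -> Set[Any]:
    """Values that LV relates to `text`.

    With a type, only LV(type) is consulted; without one, the result is the
    union over every d in DT whose lexical space contains `text`.
    """
    if type_uri is not None:
        candidates = [scheme[type_uri]]
    else:
        candidates = list(scheme.values())
    return {d.lexical_to_value(text) for d in candidates
            if d.in_lexical_space(text)}


# Hypothetical two-datatype scheme; the URIs are placeholders.
INTEGER = Datatype("ex:integer",
                   lambda s: re.fullmatch(r"[+-]?[0-9]+", s) is not None,
                   int)
STRING = Datatype("ex:string", lambda s: True, str)
SCHEME = {d.uri: d for d in (INTEGER, STRING)}
```

Under these assumptions lv(SCHEME, "42") contains both the integer 42 and
the string "42", while lv(SCHEME, "42", "ex:integer") contains only the
integer.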

This datatyping method works best if there is a collection of primitive
datatypes, like integer, and the range-restriction of LV to the value
spaces of each of these datatypes is functional.  The presence of datatypes
where this is not true, like XML Schema union datatypes, does not cause
severe problems, as long as one realizes that the SWOL type theory only
restricts the result of the lexical-to-value map, not the actual map.  Thus
stating that the range of a property is integer union string does not turn
sequences of digit characters into integers.  However, the presence of
datatypes with different lexical-to-value maps for the primitive datatypes,
e.g., octal integers without any syntactic tag, causes severe problems.
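
The octal problem can be made concrete with a small sketch.  The three
datatypes below are hypothetical stand-ins, each given as a pair of a
lexical-space test and a lexical-to-value map, and the functionality check
only looks at the sample strings it is handed.

```python
import re

# Hypothetical datatypes: (lexical-space test, lexical-to-value map).
DECIMAL = (lambda s: re.fullmatch(r"[0-9]+", s) is not None,
           lambda s: int(s, 10))
OCTAL = (lambda s: re.fullmatch(r"[0-7]+", s) is not None,
         lambda s: int(s, 8))
STRING = (lambda s: True, lambda s: s)


def lv(scheme, text):
    """Union of the lexical-to-value maps of all datatypes in the scheme."""
    return {to_value(text)
            for in_lexical_space, to_value in scheme.values()
            if in_lexical_space(text)}


def integer_restriction_is_functional(scheme, samples):
    """True if, on the samples, no lexical form maps to two integers."""
    return all(len({v for v in lv(scheme, text) if isinstance(v, int)}) <= 1
               for text in samples)
```

With only DECIMAL and STRING the check passes on samples such as "10" and
"17".  Adding OCTAL makes it fail, because "17" then denotes both 17 and
15 inside the same integer value space.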

As the syntax is given as XQuery Data Model nodes, a standard implementation
of XQuery will provide almost all of the support needed for full XML Schema
datatypes.  If a simpler datatyping scheme is desired, the XML Schema built-in
datatypes are an easy-to-implement candidate.  It is also possible to use
any other datatyping scheme that satisfies the above definition.

Aside from changing the actual datatyping scheme it would also be possible
to require that all text nodes be typed.  This would not appreciably
simplify the syntax and semantics below, but might simplify
implementations.  It would also be possible to modify the treatment of
untyped text nodes by requiring the denotation of an untyped text node
to be in the value space of a special text datatype, different and
disjoint from all other datatypes.


3. Syntax

The process of creating the Data Model nodes that are used as input here is
outside the scope of this document.  However, it is expected that the
normal method of creation will start with one or more XML documents and
proceed through XML parsing and XML Schema validation to produce one or
more Data Model documents.  These documents would then be analyzed to
determine which other XML documents are needed, potentially requiring one
or more extra rounds of parsing, validation, and analysis.  Finally all
non-relevant information, such as document nodes, would be removed.  The
final result of this pre-processing is a SWOL knowledge base (KB), in the
form of an unordered collection of Data Model fragments, each of which has
an element node as its root.  

The set of nodes in a SWOL KB is given as K.

Note: Throughout this document Data Model nodes will be given as if they
      were concrete data types with positional arguments.  This is not how
      they are really defined, but makes the syntax much easier to present.

The syntax of SWOL is then defined on these nodes.  The only interesting
and relevant nodes are ELEMENT nodes, which have a name, attributes, and
children; ATTRIBUTE nodes, which have a name, a text value, and an optional
type; and TEXT nodes, which have a text value and an optional type.  Each
non-root node has an implicit parent, which is shown as |parent|.  The node
itself is shown as |self|.  Each node also has an expansion mapping for
qualified names, Q, that turns unexpanded qualified names, such as rdf:ID,
into expanded qualified names, such as
{http://www.w3.org/1999/02/22-rdf-syntax-ns,ID}.  The set of expanded
qualified names is N.
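
As a reading aid, the three node kinds and the mapping Q might be modelled
as follows.  This is a sketch only: the class names, the field names, and
the use of a single prefix table (rather than one expansion mapping per
node) are simplifying assumptions, and the real Data Model nodes carry
more information.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple, Union

# An expanded qualified name, i.e. an element of N: (namespace URI, local part).
ExpandedName = Tuple[str, str]


@dataclass
class Text:
    text: str
    type: Optional[ExpandedName] = None   # optional datatype


@dataclass
class Attribute:
    name: ExpandedName
    text: str
    type: Optional[ExpandedName] = None   # optional datatype


@dataclass
class Element:
    name: ExpandedName
    attributes: List[Attribute] = field(default_factory=list)
    children: List[Union["Element", Text]] = field(default_factory=list)


def expand(qname: str, prefixes: Dict[str, str]) -> ExpandedName:
    """Q: turn an unexpanded name 'prefix:local' into (URI, local)."""
    prefix, sep, local = qname.partition(":")
    if not sep:
        raise ValueError(f"not a prefixed name: {qname!r}")
    return (prefixes[prefix], local)
```

For example, with the prefix rdf bound to the RDF namespace URI used in this
document, expand("rdf:ID", ...) yields the pair of that URI and "ID".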

Unfortunately, XML is lacking in syntactic constructs, and thus the
non-RDF parts of the syntax have to be distinguished by the presence of
reserved words.  As the syntactical constructions that use these reserved
words often look like RDF constructs, this makes the grammar ambiguous.
The intent, although it is not formally specified, is that the productions
that involve descriptions have precedence over those that don't.  Also,
productions for RDF syntactical attributes (rdf:ID, rdf:about, and
rdf:resource) have precedence over all other productions. 

The actual syntax constructions in SWOL are defined together with their
semantic conditions in the section on satisfaction.


4. Interpretations

A SWOL interpretation, I, over a datatyping scheme DT is a
generalized simple RDFS interpretation, 
consisting of R, nonempty 		the domain of resources, disjoint from V
	      P <= R, nonempty		properties
	      C <= R, nonempty		classes	      
	      EXT : P -> 2^(Rx(RuV))	property extensions
	      CEXT : C -> 2^(RuV)	class extensions
	      S : N -> R		mapping from names to denotation

The following conditions must be satisfied by an interpretation.

4.1 Conditions from RDF

These conditions are taken (almost) directly from the RDF model theory.

R1  CEXT(S(rdf:Property))  =   P
R2  S(rdf:type)            in  P
R3  for x in R  x in CEXT(y)  iff  <x,y> in EXT(S(rdf:type))

Note that rdf:type only lines up with CEXT on resources, not data values,
so that data values do not have to be in the domain of rdf:type.
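
For a finite structure, conditions R1 to R3 can be checked mechanically.
The sketch below is an illustration under stated assumptions: resources are
arbitrary hashable Python values, S is a dictionary keyed by unexpanded
names such as "rdf:type", and in R3 the variable y is taken to range over
C, since CEXT is only defined there.

```python
from typing import Dict


def check_rdf_conditions(R, P, C, EXT, CEXT, S) -> Dict[str, bool]:
    """Report whether R1, R2 and R3 hold in a finite candidate structure."""
    rdf_type = S["rdf:type"]
    type_pairs = EXT.get(rdf_type, set())
    return {
        # R1: the class extension of rdf:Property is exactly P.
        "R1": CEXT.get(S["rdf:Property"], set()) == P,
        # R2: rdf:type denotes a property.
        "R2": rdf_type in P,
        # R3: on resources, class membership and rdf:type pairs agree.
        "R3": all((x in CEXT.get(y, set())) == ((x, y) in type_pairs)
                  for x in R for y in C),
    }


# A toy structure (not a full SWOL interpretation, which needs more classes).
R = {"type", "Property", "a"}
P = {"type"}
C = {"Property"}
S = {"rdf:type": "type", "rdf:Property": "Property"}
CEXT = {"Property": {"type"}}
EXT = {"type": {("type", "Property")}}
```

Removing the pair from EXT["type"] makes R3 fail, because "type" would
still be in the class extension of "Property" without a matching rdf:type
pair.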

4.2 Conditions from RDFS  

These conditions are taken (almost) directly from the RDF model theory.

S1  CEXT(S(rdfs:Resource))  =  R
S2  CEXT(S(rdfs:Class))	    =  C
S3  CEXT(S(rdfs:Literal))   =  V

S4  C\DT contains S(rdfs:Resource), S(rdf:Property), 
		  S(rdfs:Class), S(rdfs:Literal)
S5  P contains S(rdfs:subClassOf), S(rdfs:subPropertyOf)
	       S(rdfs:domain), S(rdfs:range)
S6  EXT(S(rdfs:subClassOf)) contains < S(rdfs:Class), S(rdfs:Resource) >,
				     < S(rdf:Property), S(rdfs:Resource) >
S7  EXT(S(rdfs:domain)) contains  < S(rdf:type), S(rdfs:Resource) >
				  < S(rdfs:subClassOf), S(rdfs:Class) >
				  < S(rdfs:subPropertyOf), S(rdf:Property) >
				  < S(rdfs:domain), S(rdf:Property) >
				  < S(rdfs:range), S(rdf:Property) >
S8  EXT(S(rdfs:range))  contains  < S(rdf:type), S(rdfs:Class) >
				  < S(rdfs:subClassOf), S(rdfs:Class) >
				  < S(rdfs:subPropertyOf), S(rdf:Property) >
				  < S(rdfs:domain), S(rdfs:Class) >
				  < S(rdfs:range), S(rdfs:Class) >

S9  <x,y> in EXT(S(rdfs:subClassOf))       implies   CEXT(x) <= CEXT(y)
S10 <r,s> in EXT(S(rdfs:subPropertyOf))    implies   EXT(r)  <= EXT(s)

S11 if <x,y> in EXT(p) and <p,c> in EXT(S(rdfs:domain)) then x in CEXT(c)
S12 if <x,y> in EXT(p) and <p,c> in EXT(S(rdfs:range))  then y in CEXT(c)
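
Conditions S9 to S12 are simple closure checks on a finite structure.  The
following sketch lists violations; the dictionary-based encoding of EXT,
CEXT and S and the example names are assumptions made for illustration.

```python
def rdfs_violations(EXT, CEXT, S):
    """Return a list of (condition, ...) tuples for violated S9-S12 instances."""
    def ext(p):
        return EXT.get(p, set())

    def cext(c):
        return CEXT.get(c, set())

    found = []
    for x, y in ext(S["rdfs:subClassOf"]):          # S9
        if not cext(x) <= cext(y):
            found.append(("S9", x, y))
    for r, s in ext(S["rdfs:subPropertyOf"]):       # S10
        if not ext(r) <= ext(s):
            found.append(("S10", r, s))
    for p, c in ext(S["rdfs:domain"]):              # S11
        if any(x not in cext(c) for x, _ in ext(p)):
            found.append(("S11", p, c))
    for p, c in ext(S["rdfs:range"]):               # S12
        if any(y not in cext(c) for _, y in ext(p)):
            found.append(("S12", p, c))
    return found


# Hypothetical example data.
S = {"rdfs:subClassOf": "subClassOf", "rdfs:subPropertyOf": "subPropertyOf",
     "rdfs:domain": "domain", "rdfs:range": "range"}
EXT = {"subClassOf": {("Dog", "Animal")},
       "domain": {("owns", "Person")},
       "range": {("owns", "Animal")},
       "owns": {("alice", "rex")}}
CEXT = {"Dog": {"rex"}, "Animal": {"rex"}, "Person": {"alice"}}
```

With the data shown no violations are reported.  Emptying the class
extension of "Animal" produces an S9 violation for the subclass pair and
an S12 violation for the range of "owns".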

4.3 Conditions for datatypes

D1  DT <= C
D2  S(U(d)) = d   for d in DT
D3  S(swol:Datatype)    in   C\DT
D4  CEXT(S(swol:Datatype)) = DT
D5  for d in DT    CEXT(d) = V(d)

4.4 Conditions for SWOL

W1  C\DT contains S(swol:Class)
		  S(swol:ObjectProperty), S(swol:DatatypeProperty),
		  S(swol:UniqueProperty), S(swol:UnambiguousProperty)
		  S(swol:TransitiveProperty)
W2  P contains S(swol:sameClassAs), S(swol:disjointWith),
	       S(swol:samePropertyAs),
	       S(swol:sameIndividualAs), S(swol:differentIndividualFrom)
W3  EXT(S(rdfs:subClassOf)) contains 
		<S(swol:Class), S(rdfs:Class)>
		<S(swol:ObjectProperty), S(rdf:Property)>
		<S(swol:DatatypeProperty),   S(rdf:Property)>
		<S(swol:UniqueProperty),     S(rdf:Property)>
		<S(swol:UnambiguousProperty),S(swol:ObjectProperty)>
		<S(swol:TransitiveProperty), S(swol:ObjectProperty)>
W4  EXT(S(rdfs:subPropertyOf)) contains
		<S(swol:sameClassAs), S(rdfs:subClassOf)>
		<S(swol:samePropertyAs), S(rdfs:subPropertyOf)>
W5  EXT(S(rdfs:domain)) contains  
		<S(swol:sameClassAs),S(rdfs:Class)>,
		<S(swol:disjointWith),S(rdfs:Class)>
		<S(swol:samePropertyAs),S(rdf:Property)>
		<S(swol:sameIndividualAs),S(rdfs:Resource)>
		<S(swol:differentIndividualFrom),S(rdfs:Resource)>
W6  EXT(S(rdfs:range))  contains  
		<S(swol:sameClassAs),S(rdfs:Class)>,
		<S(swol:disjointWith),S(rdfs:Class)>
		<S(swol:samePropertyAs),S(rdf:Property)>
		<S(swol:sameIndividualAs),S(rdfs:Resource)>
		<S(swol:differentIndividualFrom),S(rdfs:Resource)>

W7  x in CEXT(S(swol:Class)) 		    =>  x in C and CEXT(x) <= R 
W8  x in CEXT(S(swol:ObjectProperty))	    =>  x in P and EXT(x) <= R x R
W9  x in CEXT(S(swol:DatatypeProperty))     =>  x in P and EXT(x) <= R x V
W10 x in CEXT(S(swol:UniqueProperty))	    =>  x in P and EXT(x) is functional
W11 x in CEXT(S(swol:UnambiguousProperty))  =>  x in CEXT(S(swol:ObjectProperty))
					        and converse EXT(x) is functional
W12 x in CEXT(S(swol:TransitiveProperty))   =>  x in CEXT(S(swol:ObjectProperty))
					        and EXT(x) o EXT(x) <= EXT(x)

W13 <x,y> in EXT(S(rdfs:subClassOf))     iff x,y in C\DT and CEXT(x) <= CEXT(y)
W14 <x,y> in EXT(S(swol:sameClassAs))    iff x,y in C\DT and CEXT(x) = CEXT(y)
W15 <x,y> in EXT(S(swol:disjointWith))   iff x,y in C\DT and CEXT(x)^CEXT(y) = {}
W16 <x,y> in EXT(S(rdfs:subPropertyOf))  => x,y in P and EXT(x) <= EXT(y)
W17 <x,y> in EXT(S(swol:samePropertyAs)) => x,y in P and EXT(x) = EXT(y)
W18 <x,y> in EXT(S(swol:sameIndividualAs))         iff  x,y in R and x=y
W19 <x,y> in EXT(S(swol:differentIndividualFrom))  iff  x,y in R and x/=y

Note that rdfs:subClassOf only lines up with CEXT on non-datatypes.
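
The requirements that W10 to W12 place on property extensions (functional,
functional converse, and closed under composition) can be tested directly
on finite sets of pairs.  The helper names below are chosen for this
sketch and are not part of SWOL.

```python
def is_functional(ext):
    """True if no first component is paired with two different seconds."""
    seen = {}
    for x, y in ext:
        if seen.setdefault(x, y) != y:
            return False
    return True


def converse(ext):
    """The converse relation: every pair swapped."""
    return {(y, x) for x, y in ext}


def compose(a, b):
    """Relational composition: x to z whenever x a y and y b z for some y."""
    return {(x, z) for x, y1 in a for y2, z in b if y1 == y2}


def is_transitive(ext):
    """W12's requirement: EXT(x) o EXT(x) <= EXT(x)."""
    return compose(ext, ext) <= ext
```

A unique property needs is_functional(ext), an unambiguous property needs
is_functional(converse(ext)), and a transitive property needs
is_transitive(ext).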

4.5 Discussion

The definition of interpretations here is more complex than that in many
logical formalisms.  This is due to two reasons:
1/ the presence, from RDF and RDFS, of the meta-theory in the theory, and
2/ the large built-in vocabulary of RDFS and SWOL.

As interpretations become more complex the possibility that the semantics
is ill-formed increases drastically.
To reduce this possibility, several choices have been made:
1/ The description-forming constructs of description logics do not show up
   in interpretations.
2/ Several description-relating constructs of DAML+OIL have been given
   weaker meanings than might be expected.  In particular,
   rdfs:subPropertyOf and swol:samePropertyAs, as well as the various
   categories of properties, are only given one-way definitions.


5. Satisfaction

Given a SWOL KB an extended interpretation, I', for KB is a SWOL
interpretation, I,  with the following extra component
	A : K -> R u V		mapping from nodes to denotation
with the condition that non-text nodes map into R and text nodes map into V.
I' is said to be an extension of the interpretation I.

An extended interpretation SWOL-satisfies a KB as follows:

Note:  The construction rdf:name refers to the QName with local part name
       and URI http://www.w3.org/1999/02/22-rdf-syntax-ns
       The construction rdfs:name refers to the QName with local part name
       and URI http://www.w3.org/[rdfs URI]
       The construction swol:name refers to the QName with local part name
       and URI http://www.w3.org/[swol URI]


5.1 Satisfaction for non-descriptions

Syntax		Semantic Conditions

kb ::= resourceElement*

resourceElement ::= 
     ELEMENT(name,{propertyAttribute*},{propertyElement*})
		A(|self|) in CEXT(S(name))

valueNode ::=
     TEXT(text,type)
		A(|self|) = LV(type)(text)
   | TEXT(text)
		A(|self|) in LV(text)

propertyAttribute ::=
     ATTRIBUTE(rdf:ID,id)
		A(|parent|) = S(Q(id))
   | ATTRIBUTE(rdf:about,id)
		A(|parent|) = S(Q(id))
   | ATTRIBUTE(name,text)
		< A(|parent|) , A(|self|) > in EXT(S(name))
		A(|self|) in LV(text)
   | ATTRIBUTE(name,text,type)
		< A(|parent|) , A(|self|) > in EXT(S(name))
		A(|self|) = LV(type)(text)

propertyElement ::=
     ELEMENT(rdf:type,{},{desc})
		A(|parent|) in ID(desc)
   | ELEMENT(rdfs:subClassOf,{},{desc})
		CEXT(A(|parent|)) <= ID(desc)
   | ELEMENT(swol:sameClassAs,{},{desc})
		CEXT(A(|parent|)) = ID(desc)
   | ELEMENT(swol:disjointWith,{},{desc})
		CEXT(A(|parent|)) ^ ID(desc) = {}
   | ELEMENT(rdfs:domain,{},{desc})
		EXT(A(|parent|))  <= ID(desc) x (RuV)
   | ELEMENT(rdfs:range,{},{desc})
		EXT(A(|parent|))  <= R x IC(desc)
   | ELEMENT(name,{ATTRIBUTE(rdf:resource,id)},{})
		< A(|parent|),S(Q(id)) > in EXT(S(name))
   | ELEMENT(name,{},{resourceElement})
		< A(|parent|),A(resourceElement) > in EXT(S(name))
   | ELEMENT(name,{},{valueNode})
		< A(|parent|),A(valueNode) > in EXT(S(name))

prop ::= resourceElement

obj ::= resourceElement
      | valueNode


5.2 Extensions for descriptions

Description	Extension(ID)

desc
 ::= ELEMENT(swol:Class,{ATTRIBUTE(rdf:about,id)})
		CEXT(S(Q(id))) ^ R	provided that S(Q(id)) not in DT
   | ELEMENT(swol:Thing)
		R
   | ELEMENT(swol:Nothing)
		{ }
   | ELEMENT(swol:unionOf,{desc+})
		ID(desc1) v ... v ID(descn)
   | ELEMENT(swol:intersectionOf,{desc+})
		ID(desc1) ^ ... ^ ID(descn)
   | ELEMENT(swol:complementOf,{desc})
		R \ ID(desc)
   | ELEMENT(swol:oneOf,{resourceElement*})
		{ A(resourceElement1), ..., A(resourceElementn) }
   | ela(swol:toClass,swol:property=prop,swol:class=class)
		{ x : <x,y> in EXT(prop) implies y in IC(class) }
   | ela(swol:hasValue,swol:property=prop,swol:value=obj)
		{ x : <x,obj> in EXT(prop) }
   | ela(swol:hasClass,swol:property=prop,swol:class=class)
		{ x : exists y <x,y> in EXT(prop) and y in IC(class) }
   | ela(swol:minCardinality,swol:property=prop,swol:count=int)
		{ x : >=int y  <x,y> in EXT(prop) }
   | ela(swol:maxCardinality,swol:property=prop,swol:count=int)
		{ x : <=int y  <x,y> in EXT(prop) }
   | ela(swol:cardinality,swol:property=prop,swol:count=int)
		{ x : =int y  <x,y> in EXT(prop) }
   | ela(swol:minCardinality,swol:property=prop,swol:count=int,swol:class=class)
		{ x : >=int y  <x,y> in EXT(prop) and y in IC(class) }
   | ela(swol:maxCardinality,swol:property=prop,swol:count=int,swol:class=class)
		{ x : <=int y  <x,y> in EXT(prop) and y in IC(class) }
   | ela(swol:cardinality,swol:property=prop,swol:count=int,swol:class=class)
		{ x : =int y  <x,y> in EXT(prop) and y in IC(class) }
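
Over a finite interpretation the description extensions can be computed
directly.  The sketch below covers a subset of the constructs and makes
several assumptions for illustration: descriptions are encoded as nested
Python tuples, properties and classes are identified by the resources they
denote, the class argument is always itself a description (datatypes are
left out), and x ranges over R.

```python
def successors(EXT, prop, x):
    """All y with <x,y> in EXT(prop)."""
    return {y for x2, y in EXT.get(prop, set()) if x2 == x}


def extension(desc, R, EXT, CEXT):
    """Compute the extension of a description over a finite domain R."""
    kind, *args = desc
    if kind == "Class":
        return CEXT.get(args[0], set()) & R
    if kind == "Thing":
        return set(R)
    if kind == "Nothing":
        return set()
    if kind == "unionOf":
        return set().union(*(extension(d, R, EXT, CEXT) for d in args[0]))
    if kind == "intersectionOf":
        result = set(R)
        for d in args[0]:
            result &= extension(d, R, EXT, CEXT)
        return result
    if kind == "complementOf":
        return set(R) - extension(args[0], R, EXT, CEXT)
    if kind in ("toClass", "hasClass"):
        prop, cls = args
        target = extension(cls, R, EXT, CEXT)
        if kind == "toClass":      # every successor is in the class
            return {x for x in R if successors(EXT, prop, x) <= target}
        return {x for x in R if successors(EXT, prop, x) & target}
    if kind in ("minCardinality", "maxCardinality"):
        prop, n = args
        if kind == "minCardinality":
            return {x for x in R if len(successors(EXT, prop, x)) >= n}
        return {x for x in R if len(successors(EXT, prop, x)) <= n}
    raise ValueError(f"unsupported description: {kind!r}")


# Hypothetical example data.
R = {"alice", "bob", "rex", "tom"}
CEXT = {"Dog": {"rex"}, "Cat": {"tom"}}
EXT = {"owns": {("alice", "rex"), ("alice", "tom"), ("bob", "rex")}}
```

Note that toClass is satisfied vacuously by resources with no successors,
so with the data above extension(("toClass", "owns", ("Class", "Dog")), ...)
includes "rex" and "tom" as well as "bob".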


Class		Extension(IC)

class
 ::= desc
		ID(desc)
   | ELEMENT(swol:Datatype,{ATTRIBUTE(rdf:about,id)})
		V(S(Q(id)))		provided that S(Q(id)) in DT

The construction ela(name,arg=category,...) is a shorthand for
	ELEMENT(name,{ATTRIBUTE(argi,id),
		      ATTRIBUTE(argi,text),
		      ATTRIBUTE(argi,text,type),
		      ...},
                {ELEMENT(argi,{ATTRIBUTE(rdf:resource,id)}),
		 ELEMENT(argi,{categoryi}),
		 ...})
where each arg=category shows up in exactly one of the five ways above
and no other attributes or children show up.  Also the id versions are only
for class and prop, and the text versions are only for obj and int.

The meaning and conditions for the categories and forms are:

Category Syntactic Form		   Meaning 	Semantic Conditions

prop	ATTRIBUTE(argi,id)	   S(Q(id))
	ATTRIBUTE(rdf:resource,id) S(Q(id))
	category		   A(category)
class	ATTRIBUTE(argi,id)	   id
	ATTRIBUTE(rdf:resource,id) id
	category		   category
obj	ATTRIBUTE(argi,text)	   A(|self|)	A(|self|) in LV(text)
	ATTRIBUTE(argi,text,type)  A(|self|)	A(|self|) = LV(type)(text)
	category		   A(category)
int	ATTRIBUTE(argi,text)	   A(|self|)	A(|self|) = LV(int)(text)
	ATTRIBUTE(argi,text,type)  A(|self|)	A(|self|) = LV(type)(text)
	category		   A(category)

An extended interpretation SWOL-satisfies a knowledge base if it
SWOL-satisfies every statement in the knowledge base.


6. Models and entailment:

An interpretation is a model for a SWOL knowledge base if there is some
extension of the interpretation that satisfies the knowledge base.

A SWOL knowledge base, KB1, entails another, KB2, if all models of KB1
are also models of KB2.

Theorem (to be proved):
Let KB1 and KB2 be SWOL knowledge bases.  Let KB1- and KB2- be the RDF 
triples in them.  If KB1- RDFS entails KB2- then KB1 entails KB2.


A1. References:

XQuery 1.0 and XPath 2.0 Data Model (W3C Working Draft 7 June 2001)
http://www.w3.org/TR/2001/WD-query-datamodel-2001-6-7/


A2. Status of all RDF, RDFS, and ``old'' DAML+OIL constructs not handled above:

  Surface syntax - does not show up at this level
	xmlns:* rdf:aboutEach rdf:aboutEachPrefix rdf:li rdf:parseType
	rdf:RDF rdf:Description rdf:ID rdf:about rdf:resource
	Ontology versionInfo imports

  Obsolete surface syntax - not needed
	rdf:parseType of daml:collection
	daml:List daml:nil daml:first daml:rest daml:item

  Constructs with no special treatment needed (more or less)
	rdfs:label rdfs:comment rdf:value rdfs:seeAlso rdfs:isDefinedBy

  Unneeded description syntax
	daml:Restriction daml:onProperty daml:hasClassQ

  Not handled (yet)
	daml:disjointUnionOf

  Problematic Constructs
	RDF reification - rdf:subject, rdf:predicate, rdf:object, rdf:Statement
			- rdf:bagID
			- what does it mean?
	RDF containers - rdfs:Container, rdf:Seq, rdf:Bag, rdf:Alt, rdf:_n
		       - what do they mean?
	daml:equivalentTo - what does it mean?