Bringing the new RDF Model Theory to DAML+OIL Peter F. Patel-Schneider Bell Labs Research (13 December 2001) [This is *mostly* a document concerning semantics. It does contain some syntax issues and some datatyping issues, of course.] This is a new way of looking at DAML+OIL that is more compatible with the new RDF model theory. I have tried to identify the tricky points below. I will identify this new view as SWOL (Semantic Web Ontology Language). The basic idea is to stay with the RDF approach of having the syntax of semantically-meaningful constructs like subclass also show up as relationships in interpretations. This is extended to what I think is a reasonable point that does not exhibit any paradoxes. However, the class-forming operators of SWOL do not get this treatment, as doing so for all of them, in particular complementOf, would result in paradoxes and an ill-formed model theory. The RDF constructs like subclass are also given a more-powerful meaning. This meaning is upward compatible with the RDF meaning but is more in keeping with DAML+OIL. Datatypes: A datatyping scheme is a collection of datatypes, DT. A datatype d in DT is a triple where Ld is its lexical space Vd is its value space LVd : Ld -> Vd is its lexical-to-value mapping Given a datatyping scheme, let L = union over d in DT of Ld, lexical values V = union over d in DT of Vd, data values LV = union over d in DT of LVd This datatyping method works best if there is a collection of primitive datatypes, like integer, and the range-restriction of LV to the value spaces of each of these datatypes is functional. The presence of datatypes where this is not true, like XML Schema union datatypes, does not cause severe problems, as long as one realizes that the SWOL type theory only restricts the result of the lexical-to-value map, not the actual map. Thus stating that the range of a property is integer union string does not turn sequences of digit characters into integers. However, the presence of datatypes with different lexical-to-value maps for the primitive datatypes, e.g., octal integers without any syntactic tag, causes severe problems. Lexical Issues: N is a collection of names (URIs). E is a collection of identifiers (blank node ids), disjoint from N. K is a collection of literal occurences, of the form where l is in L, d is in DT l:n where l is in L, n is an integer (or some other tag) The surface syntax for the second kind of literal occurence would be just the lexical part, but a tag is needed here for distinguishing purposes. Syntax: A SWOL knowledge base is a collection of statements. Some of these statements are RDF triples, some are not. It is possible to make all statements be in the form of triples, just not RDF triples, and I will take this approach here. The non-RDF parts the syntax mostly have to do with description constructors. I recommend that an XML syntax be developed for these constructors, something along the lines of ... However, I leave the syntax of these non-RDF constructs vague for now. Interpretations: A SWOL interpretation, I, over a datatyping scheme DT is a generalized simple RDFS interpretation, consisting of R, nonempty the domain of resources, disjoint from V P <= R, nonempty properties C <= R, nonempty classes EXT : P -> 2^(Rx(RuV)) property extensions CEXT : C -> 2^(RuV) class extensions I : N -> R mapping from names to denotation I : K -> V mapping from literal occurences to denotation An extended SWOL interpretation, I', is an interpretation with the following additional component A : E -> R mapping from blank nodes to denotation I' is said to extend I if they agree on R, P, C, EXT, CEXT, and I. (Blank nodes are only mapped onto resources for now. It might be possible to relax this restriction. Resources and values are disjoint.) Both have the following conditions. From RDF: CEXT(I(rdf:Property)) = P I(rdf:type) in P for x in R x in CEXT(y) iff in EXT(I(rdf:type)) [rdf:type only lines up with CEXT on resources, not data values From RDFS: CEXT(I(rdfs:Resource)) = R CEXT(I(rdfs:Class)) = C CEXT(I(rdfs:Literal)) = V I(rdfs:Resource), I(rdf:Property) in C\DT I(rdfs:Class), I(rdfs:Literal) in C\DT < I(rdfs:Class), I(rdfs:Resource) > in EXT(I(rdfs:subClassOf)) < I(rdfs:Property), I(rdfs:Resource) > in EXT(I(rdfs:subClassOf)) I(rdfs:subClassOf), I(rdfs:subPropertyOf) in P I(rdfs:domain), I(rdfs:range) in P in EXT(I(rdfs:subClassOf)) implies CEXT(x) <= CEXT(y) in EXT(IS(rdfs:subPropertyOf)) implies EXT(r) <= EXT(s) if in EXT(p) and in EXT(I(rdfs:domain)) then x in CEXT(c) if in EXT(p) and in EXT(I(rdfs:range)) then y in CEXT(c) < I(rdf:type), I(rdfs:Resource) > in EXT(I(rdfs:domain)) < I(rdf:type), I(rdfs:Class) > in EXT(I(rdfs:range)) < I(rdfs:subClassOf), I(rdfs:Class) > in EXT(I(rdfs:domain)) < I(rdfs:subClassOf), I(rdfs:Class) > in EXT(I(rdfs:range)) < I(rdfs:subPropertyOf), I(rdfs:Property) > in EXT(I(rdfs:domain)) < I(rdfs:subPropertyOf), I(rdfs:Property) > in EXT(I(rdfs:range)) < I(rdfs:domain), I(rdfs:Property) > in EXT(I(rdfs:domain)) < I(rdfs:domain), I(rdfs:Class) > in EXT(I(rdfs:range)) < I(rdfs:range), I(rdfs:Property) > in EXT(I(rdfs:domain)) < I(rdfs:range), I(rdfs:Class) > in EXT(I(rdfs:range)) For datatypes: DT <= C I(swol:Datatype) in C\DT CEXT(I(swol:Datatype)) = DT for d in DT CEXT(d) = Vd I() = LVd(l) I() in LV(l) For SWOL: for x,y in C\DT CEXT(x) <= CEXT(y) implies in EXT(I(rdfs:subClassOf)) [rdf:subClassOf only lines up with CEXT on non-datatypes] EXT(r) <= EXT(s) implies in EXT(IS(rdfs:subPropertyOf)) I(swol:Class) in C\DT I(swol:ObjectProperty), I(swol:DatatypeProperty) in C\DT I(swol:UniqueProperty), I(swol:UnambiguousProperty) in C\DT I(swol:TransitiveProperty) in C\DT in EXT(I(rdfs:subClassOf) in EXT(I(rdfs:subClassOf) in EXT(I(rdfs:subClassOf) in EXT(I(rdfs:subClassOf) in EXT(I(rdfs:subClassOf) in EXT(I(rdfs:subClassOf) x in CEXT(I(swol:Class)) iff CEXT(x) <= R x in CEXT(I(swol:ObjectProperty)) iff x in P and EXT(x) <= R x R x in CEXT(I(swol:DatatypeProperty)) iff x in P and EXT(x) <= R x V x in CEXT(I(swol:UniqueProperty)) iff x in P and EXT(x) is functional x in CEXT(I(swol:UnambiguousProperty)) iff x in CEXT(I(swol:ObjectProperty)) and converse EXT(x) is functional x in CEXT(I(swol:TransitiveProperty)) iff x in CEXT(I(swol:ObjectProperty)) and EXT(x) o EXT(x) <= EXT(x) I(swol:sameClassAs), I(swol:disjointWith) in P I(swol:samePropertyAs) in P I(swol:sameIndividualAs), I(swol:differentIndividualFrom) in P in EXT(I(rdfs:subPropertyOf) in EXT(I(rdfs:subPropertyOf) in EXT(I(rdfs:domain)) in EXT(I(rdfs:range)) in EXT(I(rdfs:domain)) in EXT(I(rdfs:range)) in EXT(I(rdfs:domain)) in EXT(I(rdfs:range)) in EXT(I(rdfs:domain)) in EXT(I(rdfs:range)) in EXT(I(rdfs:domain)) in EXT(I(rdfs:range)) in EXT(I(rdfs:subClassOf)) iff x,y in C\DT and CEXT(x) <= CEXT(y) in EXT(I(swol:sameClassAs)) iff x,y in C\DT and CEXT(x) = CEXT(y) in EXT(I(swol:disjointWith)) iff x,y in C\DT and CEXT(x)^CEXT(y) = {} in EXT(I(rdfs:subPropertyOf)) iff x,y in P and EXT(x) <= EXT(y) in EXT(I(swol:samePropertyAs)) iff x,y in P and EXT(x) = EXT(y) in EXT(I(swol:sameIndividualAs)) iff x,y in R and x=y in EXT(I(swol:differentIndividualFrom)) iff x,y in R and x/=y Satisfaction: An extended interpretation SWOL-satisfies statements as follows. Note that some statements produce multiple conditions, e.g., S rdfs:subClassOf O. Statement Condition S P O [an RDF triple] < IO(S), IO(O) > in EXT(I(P)) [as in RDF MT] S rdf:type D IO(S) in ID(D) D1 rdfs:subClassOf D2 ID(D1) <= ID(D2) D1 swol:sameClassAs D2 ID(D1) = ID(D2) D1 swol:disjointWith D2 ID(D1) disjoint from ID(D2) P1 rdfs:subPropertyOf P2 EXT(I(P1)) <= EXT(I(P2)) P1 swol:samePropertyAs P2 EXT(I(P1)) = EXT(I(P2)) P rdfs:domain D EXT(I(P)) <= ID(D) x (RuV) P rdfs:range D EXT(I(P)) <= R x ID(D) where S P O is an RDF triple P, P1, P2 are names S, S1, S2 are names or blank node identifiers D, D1, D2 are descriptions (see below) Q, Q1, Q2 are roles (see below) O is a name or blank node identifier or literal occurence and IO : O -> R v V as follows Construction Mapping S a name I(S) l a literal occurence I(l) b a blank node A(b) and ID : D -> 2^R as follows [ID does not allow non-resources in extensions] Construction Extension S CEXT(I(S)) ^ R Thing R Nothing { } unionOf D1...Dn ID(D1) v ... v ID(Dn) intersectionOf D1...Dn ID(D1) ^ ... ^ ID(Dn) complementOf D R \ ID(D) oneOf S1 ... Sn { I(S1), ..., I(Sn) } toClass Q D { x : in IR(Q) implies y in IC(D) } hasValue Q O { x : in IR(Q) } hasClass Q D { x : exists y in IR(Q) and y in IC(D) } minCardinality n Q { x : >=n y in IR(Q) } maxCardinality n Q { x : <=n y in IR(Q) } cardinality n Q { x : exactly n in IR(Q) } minCardinalityQ n Q D { x : >=n y in IR(Q) and y in IC(D) } maxCardinalityQ n Q C { x : <=n y in IR(Q) and y in IC(D) } cardinalityQ n Q C { x : exactly n in IR(Q) and y in IC(D) } and IC : D -> 2^(RuV) as follows Construction Extension S CEXT(I(S)) D (except S) ID(D) and IR : Q -> 2^(Rx(RuV)) as follows Construction Extension P EXT(I(P)) if I(P) in CEXT(swol:ObjectProperty) EXT(I(P)) if I(P) in CEXT(swol:DatatypeProperty) [otherwise the entire statement is not satisfied] inverseOf P converse of I(P) if I(P) in CEXT(swol:ObjectProperty) [otherwise the entire statement is not satisfied] Models and entailment: An extended interpretation SWOL-satisfies a knowledge base if it SWOL-satisfies every statement in the knowledge base. An interpretation is a model for a SWOL knowledge base if there is some extension of the interpretation that satisfies the knowledge base. A SWOL knowledge base, KB1, entails another, KB2, if all models of KB1 are also models of KB2. Theorem (to be proved): Let KB1 and KB2 be SWOL knowledge bases. Let KB1- and KB2- be the RDF triples in them. If KB1- RDFS entails KB2- then KB1 entails KB2. Status of all RDF, RDFS, and ``old'' DAML-OIL constructs not handled above: Surface syntax - does not show up at this level xmlns:* rdf:aboutEach rdf:aboutEachPrefix rdf:li rdf:parseType rdf:RDF rdf:Description rdf:ID rdf:about rdf:resource Ontology versionInfo imports Obsolete surface syntax - not needed rdf:parseType of daml:collection daml:List daml:nil daml:first daml:rest daml:item Constructs with no special treatment needed (more or less) rdfs:label rdfs:comment rdf:value rdfs:seeAlso rdfs:isDefinedBy Unneeded description syntax daml:Restriction daml:onProperty daml:hasClassQ Not handled (yet) daml:disjointUnionOf Problematic Constructs RDF reification - rdf:subject, rdf:predicate, rdf:object, rdf:Statement - rdf:bagID - what does it mean? RDF containers - rdfs:Container, rdf:Seq, rdf:Bag, rdf:Alt, rdf:_n - what do they mean? daml:equivalentTo - what does it mean?