RDF Model Theory draft Proposal
Draft
8/28/01
Pat Hayes, IHMC
0. N-triples and RDF
graphs.
In what follows, ‘ the
M&S’ refers to the document “Resource Description
Framework (RDF) Model and Syntax
Specification” ( http://www.w3.org/TR/REC-rd
f-syntax
), ‘RDFS spec’ refers to the document “Resource Description
Framework (RDF) Schema Specification 1.0” (http://www.w3.
org/TR/2000/CR-rdf-schema-20000327
), and 'N-triples' refers to Dave
Beckett's version 5, relevant parts of which are:
ntripleDoc ::=line*
line ::=ws* (comment |
blankline | triple) eoln
triple ::=subject ws+
predicate ws+ object ws* '.' ws*
subject ::=uriref |
anonNode
predicate ::=uriref
object ::=uriref |
anonNode | qLiteral
We understand N-triples and
RDF-XML as lexical notations for describing an RDF graph as defined in the
M&S. An RDF graph is
a partially
labeled directed graph in which nodes are either anonymous (ie unlabelled), or
else labeled with URIs or qLiterals; arcs are labeled with URIs; and
distinct labelled
nodes have different labels. The model theory assigns interpretations directly to the
graph, which is taken as being the primary RDF syntax, in the sense that two
RDF documents, in whatever lexical form, are syntactically equivalent if and
only if they map to the same RDF graph. We will refer to this as the 'graph
syntax' to avoid ambiguity, since the bare term 'syntax' is often assumed to
refer to a lexicalization.
A graph which is like an RDF graph except that it has two or more
nodes with the same label, will be called an untidy graph. Untidy graphs may arise as a result of merging operations
or be intermediate states in various operations on graphs, but we will assume
that all RDF graphs are tidy unless otherwise noted. In graph syntax, an untidy
graph is considered to be syntactically
ill-formed.
An RDF graph will be
said to be ground if it contains no anonymous
nodes. The vocabulary of a graph is the set of URIs
that it contains.
A graph can be viewed as a set of triples corresponding to its
node-arc-node edges; the correspondence between an ntripleDoc and a tidy RDF
graph is that the graph has one node for each uriref, anonNode or qLiteral in
the document, and one edge for each Ntriple triple, directed from the node
labeled with the subject to the node labeled with the object and with the arc
labeled with the predicate. Notice that this requires that all
occurrences of a particular uriref
or anonNode label in an N-triples document be mapped to a
single node in the corresponding graph.
An <anonNode> expression in an ntripleDoc serves to identify
a particular anonymous node, but it is not considered to be a label of that
node. (These terms play a role in an ntripleDoc analogous to that played by
bound variables in a logical expression. They are local to the document and
serve only to indicate a ‘connection’ between other
expressions.) <comment>
Should probably say something about XML-RDF at this point.
</comment>
Other RDF lexicalisations may use other means of indicating the
graph structure; for our purposes, the important syntactic properties of RDF
graphs is that distinct nodes in an RDF graph are treated as distinct referring
entities in the graph syntax.
1. Interpretations
We assume that there is no restriction on what a
resource can be and
that there is no restriction on the domains and ranges of properties; in
particular, a property may be applied to itself. (When classes are introduced
in RDFS, we will assume that they can contain themselves. [Technical note. This
assumption will not violate set-theoretic well-foundedness.])
The fact that two sets are given distinct labels does not imply
that they are disjoint; we will explicitly state any disjointness or
containment conditions as they arise. In the same spirit, the fact that one set
is stated to be a subset of another should not be interpreted as
saying that these
sets cannot be identical, unless this is stated
explicitly.
First, we assume a global set LV of literal values and a fixed
mapping XL from the set of qLiterals to LV. All interpretations will be
required to conform to XL on Literal nodes.
This leaves open the question of the exact nature of literals,
their language-sensitivity and so on, while acknowledging their special status
as expressions with a 'fixed' interpretation.
All interpretations will be relative to a set of URIs,
called the vocabulary of the interpretation, so that one has to speak, strictly, of an
interpretation of an RDF 'nodespace' or of an RDF graph, rather than of RDF
itself. For a lexicalized version
of the language, we can think of the vocabulary of an interpretation, more
traditionally, to be a subset of the URI-indicating expressions used by that
lexicalization; eg for N-triples,
we can consider the vocabulary of I to be a set of <urirefs>.
We do not take any position here on the way that node labels may
be composed from other expressions, eg from relative URIs or Qnames; the model
theory simply assumes that such lexical issues have been resolved in some way
that is globally coherent, so that a single URI can be taken to have the same
meaning wherever it occurs. Similarly, the model theory given here has no
special provision for tracking temporal changes; it assumes, implicitly, that
URIs have the same meaning whenever
they occur. (If
one wishes to apply RDF in a context where the referents of URIs (or names more
generally) may change with time, then the current theory could be regarded as
defining a ‘snapshot’ of the meaning of a changing
representation.)
An interpretation I (of
vocab(I)) is defined by:
1. A nonempty set IR of resources, called the domain or universe
of I.
2. A subset IP of IR
corresponding to properties
3. A mapping IEXT from
IP into the powerset
of IRx(IR union LV) (ie the set of
sets of pairs <x,y> with x in IR and y in IR or
LV)
4. A mapping IS:
vocab(I) -> IR
<comment> IEXT(x) is a set of pairs, ie a binary relational
extension, called the extension of x. This trick of distinguishing a relation as an object from its
relational extension allows a property to occur in its own extension, a neat
trick I learned from Chris Menzel.
</comment>
The denotation I(E) of a ground RDF graph E
in I is given by the following rules:
>> if E is a <qLiteral> then I(E) =
XL(E)
>> if E is a <uriref> then I(E) = IS(E)
>> if E is an asserted triple with the form <s p o
>
then I(E) = true iff <Is),I(o)> is in IEXT(I(p)), otherwise I(E)=
false.
>> if E is a ground RDF graph then I(E) = false just in case
I(E') = false for some asserted triple E' in E, otherwise I(E) =
true.
The use of the phrase
"asserted triple" is a deliberate weasel-worded artifact, to allow an
RDF graph/document to contain triples which are being used for some
non-assertional purpose (such as expressing a query, implementing part of some
expression of another language, or forming part of a pattern intended to be
matched against some asserted RDF in another document, or just being breezy:-).
Strict conformity to the RDF 1.0 M&S assumes that all triples in a document
are asserted triples, but making the distinction allows RDF parsers and
inference engines to conform to the RDF syntax and to respect the RDF model
theory without necessarily being fully committed to it. This also allows the
model theory to be extended in future directions.
For example, for the vocabulary {a b c} the following is a small
interpretation, where we use integers to indicate the ‘things’ in
the universe:
IR= {1, 2}; IP = {1}
IEXT: 1->{<1,2>,<2,1>}
IS: a->1,b->1,c->2
which makes these triples true:
a b c
c a a
c b a
and these triples false:
a c b
a b b
c a c
For example, I(<a b c>) = true if <I(a),I(c)> is in
IEXT(I(b)), ie if <1,2> is in IEXT(1), which is {<1,2>,<2,1>}
and so does contain <1,2> and so I(<a bc>) is true.
I(<a c b>) = true if <I(a),I(c)>, ie <1,2> is in
IEXT(I(c)); but I(c)=2 and IEXT is not defined on 2, so the condition fails and
I(<a c b>) = false.
2. Anonymous nodes as
existential assertions
We could treat anonymous nodes exactly like URIs, semantically
speaking, by extending the IS mapping to include them as well as URIs. That
would amount to adopting the view that an anonymous node is equivalent to a
node with an unknown label.
However, it seems to be more in conformance with the M&S to treat
anonymous nodes as existentially bound variables. This will require some
definitions, as the theory so far provides no meaning for anonymous
nodes.
Suppose I is an interpretation and A is a mapping from some set of
anonymous nodes to the domain of I, and define I+A to be an extended
interpretation which is like I except that it uses A to give the interpretation
of anonymous nodes. Define anon(E)
to be the set of anonymous nodes in E.
Then we can extend the above rules to include the two new cases that are
introduced when anonymous nodes occur in the
graph:
>> If E is an anonymous node than [I+A](E) =
A(E)
>> If E is an RDF graph then I(E) = true if [I+B](E) = true
for some B defined on anon(E), otherwise I(E)= false.
This effectively treats all anonymous nodes as existentially
quantified in the RDF graph in which they occur. Notice that since two nodes
cannot have the same label , there is no need to specify the 'scope' of the
quantifier within a graph. (However, it
is local to the graph.) If we were to apply the semantics directly
to N-triples syntax, we would need to indicate the quantifier scope, since in
this lexicalisation syntax the same anonNode identifier may occur several times. The above rule amounts to the
N-triple convention that would place the quantifiers just outside, ie at the outer edge of, the
<ntripleDoc>ument corresponding to the graph.
Notice that we have not changed the definition of an
interpretation.
The same interpretation that provides a truth-value for ground graphs also
assigns truth-values to graphs with anonymous nodes, even though it provides no
interpretation for the anonymous nodes themselves. Notice also that the anonymous nodes themselves are, qua
nodes, perfectly well-defined entities with a robust notion of identity; they
differ from other nodes only in not being assigned a direct model-theoretic
interpretation, which means that they have no ‘global’ meaning (ie
outside the graph in which they occur).
2a. Comparison with formal logic
( Readers unfamiliar with formal logic can skip this
section. )
With this semantics, an RDF graph can be translated directly into
a logical expression with the same meaning.
Each node labeled with a URI translates to a logical constant
which is its label ; each anonymous node to a distinct variable. Each arc
(triple) labeled with <s p o> maps to an atomic assertion of the
form p(s, o) (or (p s o) in KIF
syntax); a graph translates to the the existential closure of the conjunction
of all the arcs in the graph.
For example, the graph defined by the
triples
_:x a b
c b _:x
_:x a c
translates to the logical expression (written in KIF
syntax)
(exists (?y)(and (a ?y b)(b c ?y)(a c
?y)))
Notice that the resulting expression may contain the same symbol
in both relation and object positions (eg ‘b’ in this example).
This is considered syntactically illegal in many versions of logic, but is
acceptable in ISO-KIF (Hayes & Menzel 2001). To map to a more conventional
logical syntax, translate the triple <s p o> into the
form
(holds p s o)
where ‘holds’ is a three-place relation used to assert
that a binary relation holds between two arguments. The above example would
then map to the expression:
(exists (?y)(and (holds a ?y b)(holds b c ?y)(holds a c ?y)))
3. RDFS
interpretations
RDF Schema (RDFS) introduces a
special RDF vocabulary, with some
fixed constraints on its interpretation; in particular, the notion of a
'class'. We can define all of RDFS
in terms of a single RDF property rdf:type
which relates entities to the classes which contain them. Following
conventional usage, we understand a classification into a ‘type’ to
mean membership in a set; the sets will be considered to be the extensions of
resources called ‘classes’, using the same technique used earlier
to relate properties to their extensions. RDFS provides names for several of
the classes whose extensions we have already used in defining the model theory
(IR, LV, IP) and a general naming scheme to allow user-defined classes. The one
new class is the class of all classes, whose extension – the set of all
classes – will be called IC.
It will be convenient to define a mapping ICEXT (for the Class Extension
in I) from classes to their extensions, in terms of the relational extension of
rdf:type, as follows:
ICEXT(x) = {y | <y,x> is
in IEXT(I(rdf:type)) }
Ie the class extension of x is
the set of things that are related to x by the property rdf:type, ie which have the class x as
their ‘type’. This definition is re-stated below as the semantic
rule for ‘rdf:type’.
An rdfs interpretation is then
an RDF interpretation plus one extra subset of the universe, forming a fifth
condition on an interpretation:
5. A subset IC of IR,
containing classes
[Technical comment. This use of
a class extension mapping allows classes to contain themselves. For example, it
is quite OK for a 'universal' class to contain itself as a member, a convention
that is often adopted at the top
of a classification hierarchy.
This convention is often thought unacceptable since it seems to violate
one of the axioms of standard (Zermelo-Fraenkel) set theory, the axiom of
foundation, which forbids infinitely descending chains of membership. However, this impression is mistaken.
The semantic model given here distinguishes between the class as an object and
its class extension - the set of things that are ‘in’ the class
- thereby allowing the extension
of a class to contain the class itself without violating the axiom of
foundation. (If an extension contained itself then the axiom would be
violated, but that case never arises.) In particular, the class of all classes
– here, I(rdfs:Class) – may be in the set of all classes –
ICEXT(I(rdfs:Class)) – which would make the triple <rdfs:Class
rdf:type rdfs:Class> true in the interpretation I.
Notice that the question of
whether or not a class contains itself as a member is quite different from the
question of whether or not it is a subclass of itself. ]
An rdfs interpretation I satisfies the following conditions, in
addition to those for an rdf interpretation given earlier. (The complete model
theory is summarized in a table in an appendix.)
>>
ICEXT(I(rdfs:Resource)) =
IR
>> ICEXT(I(rdf:Property))
= IP
>> ICEXT(I(rdfs:Class))
= IC
>>
ICEXT(I(rdfs:Literal)) =
LV
>> <x,y> is in
IEXT(I(rdf:type)) iff x is in ICEXT(y)
>> <x,y> is in
IEXT(I(rdfs:subClassOf)) iff ICEXT(x) is a proper subset of ICEXT(y)
>> <x,y> is in
IEXT(rdfs:subPropertyOf)) iff IEXT(x) is a proper subset of
IEXT(y)
<comment> The subsets
referred to in subPropertyOf are sets of pairs.
</comment>
<comment> The M&S
seems to imply that subClass and subProperty extensions are
proper subsets, ie the identity case
is disallowed, and hence 'subclass loops' are illegal. If we decide to change
this to allow 'subclass loops', as
assumed by DAML+OIL, then "proper" should be deleted from these
conditions. </comment>
>>
I(rdfs:ConstraintResource) is in IC
>>
ICEXT(I(rdfs:ConstraintProperty)) is the intersection of
ICEXT(I(rdfs:ConstraintResource)) and IP  
; &nb
sp; &
nbsp;  
; &nb
sp;
>> I(rdf:range) and
I(rdf:domain) are in
ICEXT(I(rdfs:ConstraintProperty))
>> If <x,y> is in
IEXT(I(rdf:range)) and <u,v>
is in IEXT(x) then v is in ICEXT(y)
>> If <x,y> is in
IEXT(I(rdf:domain)) and
<u,v> is in IEXT(x) then u is in ICEXT(y)
<comment> This reflects
our current understanding of multiple domain and range restrictions rather than
the wording in the M&S. </comment>
(The model theory imposes no
conditions on the interpretations of rdfs:seeAlso, rdfs:isDefinedBy,
rdfs:comment or rdfs:label.)
4. Reification
RDF uses rdfs to define two
important extensions to the basic language: reification (which allows RDF to
make assertions about RDF statements) and the introduction of classes of
containers. These impose extra
conditions on interpretations, requiring their domains to include particular
classes of structures.
We will say that an
interpretation I reifies a set of nodes V if it is capable of describing all RDF triples
that can be constructed from those nodes. (The RDF M&S does not provide any
vocabulary for describing graphs.) When considering lexical forms of the
language, it would be natural to think of V as a set of lexical labels which
identify nodes, eg a subset of( <uriref> union <qLiteral> union
<anonNode>) in N-triples. For lexicalisations such as XML-RDF which do
not provide lexical identifiers corresponding to all anonymous nodes, V must be
allowed to contain arbitrary numbers of ‘blank’ node tokens as well
as URIs and literals.)
In order to be reified, the
interpretation must contain in its universe entities corresponding to all the
nodes (or names) in V and all the
triples that can be constructed from them. The reification conditions can then
be stated as follows:
>> (V union VxVxV ) is a
subset of IR
>> <x,y> is in
IEXT(I(rdf:subject)) iff for some a, b, and c in V, x=<a b c>and y= a
>> <x,y> is in
IEXT(I(rdf:predicate)) iff for some a, b and c in V, x=<a b
c>and y= b
>> <x,y> is in
IEXT(I(rdf:object)) iff for some a, b and c in V, x=<a b c>and
y= c
>> x is in ICEXT(I(rdf:Statement)) iff for
some a, b and c in V, x=<a b c>
Reification is fairly simple,
in this account. However, notice that having x (or REIF(x) ) in the domain does
not in itself provide any access to the meaning of x in the interpretation; in
particular, it provides no way to assert x. This is illustrated by fact that
none of the above equations mention I(x) when x is in IR. The language provides
no way to *use* (as opposed to mention) a reified triple. Adding such an ability (sometimes
called ‘downward reflection’ or ‘dequoting’) would
considerably increase the expressive power of RDF; it would in effect provide
the language with a form of truth predicate. On the other hand, such devices
need to be treated with considerable care in order to avoid self-referential
paradoxes.
5.
Containers
RDFS provides a general class
of ‘containers’ and three particular kinds of container: bags,
sequences and ‘alternatives’. Alternatives require special
treatment, and will be described in the next section. (We note that the ‘distributive reference’
mechanisms which use rdf:aboutEach, described in M&S section 3.3, are part
of the XML-RDF serialization rather than the RDF graph syntax, so are not
considered here. It would be relatively easy to include them,
however.)
Bags and sequences are very
similar except that bags are considered to be unordered. The natural way to
describe this mathematically is to think of bags as sequences with an
equivalence relation defined on them, and identify a bag with an equivalence
class of sequences. This amounts to saying that bag identity can be checked by
allowing permutations, but otherwise a bag is much like a sequence: in
particular, the members of a particular bag do in fact occur in a particular
order in the bag, even though that order is not considered germane to questions
of bag identity. RDFS provides essentially the same machinery for accessing the
elements of bags and of sequences, for example. For all these reasons, we will
take an abstract notion of sequence, called here a series, to be the ‘basic’
container type for RDF/S. A series
(over S) is defined to be a partial function from the integers to S. If s is a
series then s(n) is the n-th element of s; if s(n) is undefined then we will
write s(n) = ??. <comment> ( ‘??’ can be taken to be a value,
but it must be a special value which is not in the universe IR of any
interpretation.)
</comment>
We will assume that there is a
distinguished class of things called containers with subclasses called bags and
sequences (and alternatives), and that containers are series on IR. Again, this allows such apparent
oddities as containers which contain themselves, or two containers each of
which contains the other. The set of containers-series will be called
IS.
There is a special class of
container membership properties denoted by ‘rdf:_1’,
‘rdf:_2’, etc., each of which selects a member from a position in a
container. In a familiar abuse of notation, we will write ‘rdf:_n’
to refer to these, where ‘n’ is a parameter ranging over the
natural numbers/numerals. In the
graph syntax, membership of something in a container is asserted by a triple
which uses a container membership property.
>>
ICEXT(I(rdfs:Container)) = &nbs
p; IS, a subset of IR
>>
ICEXT(I(rdf:Bag)) = &nbs
p; &n
bsp; IBAG,
a subset of IS
>>
ICEXT(I(rdf:Seq)) = &nbs
p; &n
bsp; ISEQ,
a subset of IS disjoint from IBAG
>>
ICEXT(I(rdfs:ContainerMembershipProperty))
= CMIP, a
subset of IP
>> I(rdf:_n) is in CMIP
>> <x,y> is in
IEXT(I(rdf:_n))
iff x is in IS and
x(n)=y
The last condition –
really a countable infinity of conditions - is the heart of the
‘container’ notion, which can be regarded as to do with the
relational extension of container membership properties in much the same way
that rdfs:Class has to do with the relational extension of rdf:type.
This model theory makes no
further assumptions about bags and sequences. Although the model
theory does not
need to assume this, it seems natural to assume that the series corresponding
to RDF sequences are finite, ie that there is some integer length(s) such that
for any n > length(s), s(n) =
??. What is less clear is whether they should be allowed to have
‘gaps’,
ie whether s(n) = ?? for any n <= length(s). The M&S is not entirely
clear on this question, so it is left open in the model theory. (The issue surfaces when an RDF graph
gives only partial information about a container, e.g. consider whether or not
the RDF graph
xxx rdf:type
rdf:Bag
xxx rdf:_1
aaa
xxx rdf:_3
ccc
should entail
xxx rdf:_2
_:something
This issue arises very sharply
when considering alternative containers, as we will see.)
The model theory as written
also provides no formal account of what makes a bag different from a sequence,
other than insisting that they are different. One way to add such an account
would be to impose a restriction on the subset ICEXT(I(rdf:Bag)), so that it is
not permitted to contain two series which are permutations of each other. This
restriction would not alter the truth-values of any piece of RDF syntax,
however, so it is omitted from the formal model theory.
The RDF M&S provides a
number of examples of ‘intended uses’ for containers, particularly
bags, which seem to involve extra assumptions about the meanings of expressions
involving bags; for example, the observation that a triple whose first element
is a bag could be considered to be as assertion about the bag, or about the
elements of the bag, depending on the property being applied. As stated here,
the model theory only supports interpretations in which containers are
‘opaque’ objects, so that assertions involving containers are about
those containers, rather than being understood ‘transparently’ to
be asserting anything about the members of the containers. The model theory does not
prohibit transparent interpretations of
bags and sequences, but such interpretations would require extra semantic
conditions which are not stated here. Treating containers as
‘transparent’ has
quite extensive consequences, requiring the notions of satisfiability and
entailment to be re-considered. Both these points are illustrated in the next
section.
6. Alternatives
Alternatives are
transparent containers
which have a special disjunctive intepretation. When a property is asserted of
an alternative, it means that the property holds of one or more of the elements
of the alternative. In effect, such an assertion is a disjunction of the
assertions which would be got by asserting that the property holds of each of
the elements seperately.
This is the only way of
encoding a disjunction in RDF, and complicates the theory, because a
disjunction becomes weaker as extra conditions are added to it, so the
‘existential/conjunctive’
flavor of the rest of RDF , where adding more triples to a graph always
increases the information content of the graph, is now changed.
We will treat alternatives as a
class of container, following the M&S, and will add the disjunctive
interpretation as an extra semantic condition.
>>
ICEXT(I(rdf:Alt)) = a subset
IALT of IS disjoint
from
ICEXT(I(rdf:Bag))
union ICEXT(I(rdf:Seq))
>> if x is in IALT then
for any z in IP, <x y> is in IEXT(z) iff for some n, <x(n) y> is in
IEXT(z)
This last condition is
different in kind from the rest of the model theory, and illustrates the kind
of machinery which is required to give a ‘transparent’ reading to a
container.
6.1 Alterative problems:
discussion
This semantics refers the
meaning of a property ascription to an alternative container to the members of
the container as prescribed by the M&S. However, it does not address the
issue of how exactly the members of the container are specified.
If we assume that an RDF graph
specifies membership by the presence of triples of the form <[alt], rdf:_n, [thing]>, then
unless there is some way to assert that a certain set of such triples exhausts
the membership of an alternative, it is impossible to really know what an RDF
graph containing an alternative actually asserts, since the container may
contain a member which is not mentioned in the graph, and this
‘unknown’ member may make any property true of the alternative even
if the property is false of all the members which are mentioned in
the graph.
For example, consider a graph
defined by the following triples:
xxx type
rdf:Alt
xxx rdf:_1
aaa
xxx rdf:_2
bbb
xxx foo
baz
With the current model theory,
this says that the value of foo is baz for some thing in the Alt container,
which might be aaa, bbb or something else not mentioned in the graph.
It does not,
therefore, follow that one of <aaa foo baz> or <bbb foo baz> is
true; they might both be false, and still <xxx foo baz> could be true; so
asserting this property of the alternative container basically says nothing at
all about the value of the property applied to the members of the container
mentioned in the graph itself.
There are several possible ways
around this problem, but they all require some change to RDF.
We could impose a
‘closed-world’
condition on membership in an alternative, but that would make the
entire language non-monotonic.If we impose a closed-world assumption on
alternatives, by assuming that their members are precisely the ones mentioned
in the graph and no others, then this graph implicitly assumes that there are
no other members, so the example given above asserts that either aaa foo baz or
bbb foo baz, as one might expect it to; however, if one were to then add the
triple
xxx rdf:_3
ccc
to the graph, then the result
would not be entailed by the original graph, since it would then
assert something
weaker, viz. that either <aaa foo baz>, <bbb foo baz>, or <ccc
foo baz>. Since adding a triple
to a graph is thought of as preserving the meaning of the triples already
present, this makes the language non-monotonic in the sense that adding
information to a graph (in this case, information about the membership of a
container) renders some of the existing information invalid, so that any
conclusion drawn from the graph cannot safely be considered to be entailed by
the new, expanded, graph.
Another possible way around the
problem would be to insist that alternatives are not containers in the usual
sense, but are in fact disjunctions, and think of them as syntactic extensions
to the RDF language itself.
On this
view, the addition of the fifth triple to the above graph would be a change to
the logical syntax. However, this does not seem to be in the spirit of the original RDF design,
and would require the entire model theory to be re-stated.
A rather less drastic change
would be to introduce some way to explicitly assert that a container has a
certain number of members, so that the appropriate way to express the intended
meaning of the example above would be something like
this:
xxx type
rdf:Alt
xxx rdf:_1
aaa
xxx rdf:_2
bbb
xxx rdf:size
2
xxx foo
baz
where the
‘rdf:size’ property indicates that the container xxx has precisely
two elements, so that any mention of a third element of xxx would produce an
inconsistency.
Since none of these approaches
is currently use in RDF, and since the only plausible interpretation of rdf:Alt
makes any assertion about an alternative essentially vacuous, we offer this
semantics for rdf:Alt only to illustrate the problem, not as a
recommendation.
Appendix 1: Summary of model
theory
RDF:
An
interpretation I (of vocab(I)) is
defined by:
1. A nonempty set IR of resources, called the domain or
universe of I.
2. A subset IP of IR corresponding to
properties
3. A mapping IEXT from IP into the powerset of
IRx(IR union LV)
4. A mapping IS: vocab(I) -> IR
Suppose A is a mapping from
some set of anonNodes to the domain of I. anon(E) is the set of anonymous nodes
in the graph E.
>> if E is a
<qLiteral> then I(E) = XL(E)
>> if E is a
<uriref> then I(E)
= IS(E)
>> if E is an asserted
triple <s p o >
then I(E) = true iff <I(s),I(o)> is
in IEXT(I(p)), otherwise I(E)=
false.
>> if E is a ground RDF
graph then I(E) = false just in case I(E') = false for some asserted triple E'
in E, otherwise I(E) = true.
>> if E is an anonymous
node than [I+A](E) = A(E)
>> if E is an RDF graph
then I(E) = true if [I+B](E) = true for some B defined on anon(E), otherwise
I(E)= false.
RDFS:
5. A subset IC of IR,
containing classes
6. A mapping ICEXT
from IC to the powerset of
(IR union LV)
>>
ICEXT(I(rdfs:Resource)) =
IR
>> ICEXT(I(rdf:Property))
= IP
>> ICEXT(I(rdfs:Class))
= IC
>>
ICEXT(I(rdfs:Literal)) = LV
>> <x,y> is in
IEXT(I(rdf:type)) iff x is in ICEXT(y)
>> <x,y> is in
IEXT(I(rdfs:subClassOf)) iff ICEXT(x) is a proper subset of ICEXT(y)
>> <x,y> is in
IEXT(rdfs:subPropertyOf)) iff IEXT(x) is a proper subset of
IEXT(y)
>>
I(rdfs:ConstraintResource)
is in IC
>>
ICEXT(I(rdfs:ConstraintProperty)) is the intersection of IP and  
; &nb
sp; &
nbsp;  
; &nb
sp;
ICEXT(I(rdfs:ConstraintResource))
>> I(rdf:range) and
I(rdf:domain) are in
ICEXT(I(rdfs:ConstraintProperty))
>> If <x,y> is in
IEXT(I(rdf:range)) and <u,v>
is in IEXT(x) then v is in ICEXT(y)
>> If <x,y> is in
IEXT(I(rdf:domain)) and
<u,v> is in IEXT(x) then u is in ICEXT(y)
REIFICATION
For some vocabulary V, (V union
VxVxV ) is a subset of IR
>> <x,y> is in
IEXT(I(rdf:subject)) iff for some a, b, and c in V, x=<a b c> and y= a
>> <x,y> is in
IEXT(I(rdf:predicate)) iff for some a, b and c in V, x=<a b c>
and y= b
>> <x,y> is in
IEXT(I(rdf:object)) iff for some a, b and c in V, x=<a b c> and
y= c
>> x is in ICEXT(I(rdf:Statement)) iff for
some a, b and c in V, x=<a b c>
BAGS AND
SEQUENCES
IS is a set of series over IR.
IS is a subset of IR.
>>
ICEXT(I(rdfs:Container)) =
IS
>> ICEXT(I(rdf:Bag))
= a subset of IS disjoint from
ICEXT(I(rdf:Seq))
>>
ICEXT(I(rdf:Seq)) = a subset of IS disjoint from
ICEXT(I(rdf:Bag))
>>
ICEXT(I(rdfs:ContainerMembershipProperty)) = CMIP, a subset of IP
>> I(rdf:_n) is in CMIP
>> <x,y> is in
IEXT(I(rdf:_n)) iff x is in IS and
x(n)=y
ALTERNATIVES
>>
ICEXT(I(rdf:Alt)) = a subset
IALT of IS disjoint
from
ICEXT(I(rdf:Bag))
union ICEXT(I(rdf:Seq))
>> if x is in IALT then
for any z in IP and y in (IR union LV), <x, y> is in IEXT(z) iff for some n, <x(n), y> is in
IEXT(z)
RDF/RDFS model theory summary |
|
0. Domains and mappings of interpretation
I |
|
vocab(I) set
of URIs ; LV set of literal values (global) ; IR set of resources (universe);
IP subset of
IR (properties) ; IC:
subset of IR (classes). |
|
XL: literals
-> LV; IS: vocab(I) ->
IR; IEXT: IP -> subsets of
(IR x (IR union LV)); ICEXT: IC
-> subsets ofIR |
|
1. Basic equations |
|
E is |
I(E) is |
A
literal
(qLiteral) |
XL(E) |
A uri (uriref) |
IS(E) |
An asserted
triple <s p o> |
true if
<I(s), I(o)> is in IEXT(I(p)), otherwise false |
Any other
triple |
not
defined |
A ground RDF
graph |
false if
I(E’) =false for any asserted triple E’ in E,
otherwise
true |
An anonymous
node (anonNode) |
not
defined; but [I+A](E) =
A(E) |
An
RDF graph |
true if
[I+A](E) = true for some A: anon(E) -> IR, otherwise
false. |
2. Class extensions |
|
E is |
I(E) is in IC; ICEXT(I(E)) is |
rdfs:Resource |
IR (The universe of
the interpretation) |
rdf:Property |
IP (Properties; subset
of IR, domain of IEXT) |
rdfs:Class |
IC (Classes; subset of
IR, domain of ICEXT) |
rdfs:Literal |
LV (Literal
values) |
rdfs:ConstraintResource |
CR (subset of
IR) |
rdfs:ConstraintProperty |
CP = IP
intersect CR |
3. Property extensions |
|
E is |
I(E) is in IP; <x,y> is in IEXT(I(E)) iff |
rdf:type |
x is in
ICEXT(y) |
rdfs:subClassOf |
ICEXT(x) is a
proper subset of ICEXT(y) |
rdfs:subPropertyOf |
IEXT(x) is a
proper subset of IEXT(y) |
E is |
I(E) is in CP; <x,y> is in IEXT(I(E))
iff |
rdfs:range |
if
<u,v>
is in IEXT(x) then v is in
ICEXT(y) |
rdfs:domain |
if
<u,v> is in IEXT(x) then u
is in ICEXT(y) |
4. Reification |
|
V is a set of
nodes (URIs and/or anonymous nodes and/or literals) (V union
(VxVxV) ) is a subset
of IR |
|
E is |
I(E) is in IC; x is
in ICEXT(I(E)) iff |
rdf:Statement |
x= <a b c> where a, b, c are in
V |
E is |
I(E) is in IP;
<x,y> is in IEXT(I(E))
iff |
rdf:subject |
x= <a b
c> and y=a where a, b, c are in V |
rdf:predicate |
x= <a b
c> and y=b where a, b, c are in V |
rdf:object |
x= <a b
c> and y=c where a, b, c are in V |
5. Containers (Bags and sequences) |
|
IS is a set
of series over IR IS is a
subset of IR |
|
E is |
I(E) is in IC;
ICEXT(I(E)) is |
rdfs:Container |
IS |
rdf:Bag |
a subset IBAG
of IS |
rdf:Seq |
a subset ISEQ
of IS disjoint from IBAG |
rdfs:ContainerMembershipProperty |
CMIP, a subset of IP |
E is |
I(E) is in CMIP; <x,y> is in IEXT(I(E))
iff |
rdf:_n |
x is in IS
and x(n) =y |
5a. Containers (Alternatives) |
|
E is |
I(E) is in IC;
ICEXT(I(E)) is |
rdf:Alt |
a subset IALT
of IS disjoint from ISEQ and IBAG |
If x is in
IALT then for any z in IP and y in (IR union LV), <x, y> is in IEXT(z)
iff for some n, <x(n), y> is in IEXT(z) |