RDF "core": abstract syntax and semantics
-----------------------------------------
Graham Klyne, 21-Jun-2001
1. Categories
This section lists the categories of term that appear in the abstract syntax. Each category is associated with three kinds of symbol:
- a syntactical instance of that category
- a syntactical category, being the set of all possible syntactical instances
- a domain of interpretation, being a set containing interpretations of the syntactical instances.
(This style of presentation is adapted from that used by Strachey & Scott in "Toward a Mathematical Semantics for Computer Languages".)
1.1 Terminal symbols
These are categories that appear in the abstract syntax as terminal symbols.
n,n1,n2: N - Nodes (may be represented by Qnames or URIs)
l,l1,l2: L - Literals (may be represented by strings, data: URIs or arbitrary XML elements)
p,p1,p2: P - Predicates (may be represented by Qnames or URIs)
rdf:type : distinguished member of P
rdf:subject : distinguished member of P
rdf:object : distinguished member of P
rdf:predicate : distinguished member of P
rdf:Statement : distinguished member of N
1.2 Nonterminal symbols
These are categories that appear on the left hand side of productions of the abstract syntax.
s,s1,s2: S - Simple-expressions (currently: "triple" or "reification")
r,r1,r2: R - Reifications (description of a Node that denotes a statement)
t,t1,t2: T - Triples (generic ground fact, expressed as triple)
v,v1,v2: V - Values (nodes or literals)
1.3 Distinguished symbol
This symbol represents a valid RDF expression (a well-formed formula, or wff, of RDF).
g,g1,g2: G - Graphs
2. Abstract syntax
::= denotes a production in the syntax metalanguage,
| denotes alternative productions in the syntax metalanguage,
an empty alternative (nothing following '|') stands for an empty sequence of symbols,
( ) are used as "punctuation" literals surrounding a triple.
g ::= s1 | g1 g2 |
s ::= r1 | t1
r ::= ( n1 rdf:type rdf:Statement )
      ( n1 rdf:predicate p1 )
      ( n1 rdf:subject n2 )
      ( n1 rdf:object v1 )
t ::= ( n1 p1 v1 )
v ::= n1 | l1
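The productions above can be mirrored as data types. What follows is a minimal Python sketch, not part of the original note. The class names, the use of strings for names, and the representation of a graph as a tuple of simple expressions are all assumptions made for illustration. It takes the subject of the reified statement to be a second node n2, as in the IR case of section 3.2.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Node:            # n : N, named here by a QName or URI string
    name: str


@dataclass(frozen=True)
class Literal:         # l : L, represented here by a plain string
    text: str


@dataclass(frozen=True)
class Predicate:       # p : P
    name: str


Value = Union[Node, Literal]            # v ::= n1 | l1


@dataclass(frozen=True)
class Triple:          # t ::= ( n1 p1 v1 )
    subject: Node
    predicate: Predicate
    value: Value


@dataclass(frozen=True)
class Reification:     # r ::= the four-triple pattern describing a statement
    statement: Node        # n1, the node that denotes the statement
    predicate: Predicate   # p1
    subject: Node          # n2
    value: Value           # v1

    def triples(self) -> Tuple[Triple, ...]:
        """Expand the reification into the four triples of the production."""
        n = self.statement
        return (
            Triple(n, Predicate("rdf:type"), Node("rdf:Statement")),
            # Assumption: p1 in value position is wrapped as a Node.
            Triple(n, Predicate("rdf:predicate"), Node(self.predicate.name)),
            Triple(n, Predicate("rdf:subject"), self.subject),
            Triple(n, Predicate("rdf:object"), self.value),
        )


SimpleExpr = Union[Reification, Triple]  # s ::= r1 | t1
Graph = Tuple[SimpleExpr, ...]           # g ::= s1 | g1 g2 | (empty)

# Hypothetical example names (ex:doc, ex:author, ex:s1).
stmt = Reification(Node("ex:s1"), Predicate("ex:author"),
                   Node("ex:doc"), Literal("Klyne"))
graph: Graph = (
    Triple(Node("ex:doc"), Predicate("ex:author"), Literal("Klyne")),
    stmt,
)
```

Representing a graph as a flat tuple means that the production "g1 g2" corresponds to tuple concatenation and the empty graph to the empty tuple.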
3. Semantics
What is the domain of interpretation of an RDF graph?
At first sight, it may appear to be simply 'true' or 'false'. In the absence of reification, this is enough.
But if a graph contains statement reifications, we need to somehow note their presence so that other statements that reference them can be given appropriate interpretations. Thus, a graph is interpreted here as a truth value AND a set of statement Reifications.
3.1 Interpretation functions
IG : G -> < Boolean, Reifications* >
The domain of interpretation of an RDF graph is 'true' or 'false', together with
a set of statement reifications.
IS : S -> < Boolean, Reifications* >
The domain of interpretation of a simple expression is 'true' or 'false',
together with a set of statement reifications.
IR : R -> Reifications, where
Reifications = < Nodes, Predicates, Nodes, Values >
The domain of interpretation of a reification is a 4-tuple consisting of
members of Nodes, Predicates, Nodes and Values (see below).
IT : T -> Boolean
The domain of interpretation of a triple is 'true' or 'false'.
IP : P -> Predicates, where
Predicates = Nodes x Values -> Boolean
The domain of interpretation of a property is a binary predicate.
IV : V -> Values, where
Values = Nodes + Literals
The domain of interpretation of a value is a node or a literal.
IN : N -> Nodes
The domain of interpretation of a node is a member of Nodes.
IL : L -> Literals
The domain of interpretation of a literal is a member of Literals.
Thus, the domain of interpretation is defined in terms of tuples, unions and functions of the following sets of values:
Nodes, Literals, Boolean
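As an illustration of how these domains fit together, here is a small Python sketch using type aliases. It is not part of the original note. The names NodeValue, LiteralValue and GraphMeaning, and the choice of a frozenset for "Reifications*", are assumptions. The 4-tuple follows the description of Reifications given in section 3.1.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Tuple, Union


@dataclass(frozen=True)
class NodeValue:         # a member of Nodes
    ident: str


@dataclass(frozen=True)
class LiteralValue:      # a member of Literals
    text: str


# Values = Nodes + Literals (a union of the two base sets)
Values = Union[NodeValue, LiteralValue]

# Predicates = Nodes x Values -> Boolean
Predicates = Callable[[NodeValue, Values], bool]

# A reification: members of Nodes, Predicates, Nodes and Values
Reifications = Tuple[NodeValue, Predicates, NodeValue, Values]

# < Boolean, Reifications* >: a truth value plus a set of reifications
GraphMeaning = Tuple[bool, FrozenSet[Reifications]]


# Hypothetical binary predicate, true for exactly one pair.
def is_author_of(subject: NodeValue, value: Values) -> bool:
    return (subject, value) == (NodeValue("doc"), LiteralValue("Klyne"))


reified: Reifications = (
    NodeValue("stmt"), is_author_of, NodeValue("doc"), LiteralValue("Klyne"),
)
meaning: GraphMeaning = (True, frozenset({reified}))
```

Python functions are hashable, so a tuple that contains a predicate can be a member of a set, which is what lets a graph meaning carry its reifications around.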
3.2 Definition of an interpretation
An interpretation is defined in terms of interpretation functions over each of the syntactic categories.
Each function is defined by case enumeration over the corresponding syntactic productions. To aid the distinction between syntactic and semantic category values, syntactic values are enclosed in square brackets [...].
'&' is used here as a metalanguage operator in the Boolean subset of the domain of interpretation, having the meaning usually associated with Boolean AND.
'{...}' is used here as a metalanguage expression meaning some set of values.
'|' is used here as a metalanguage operator meaning set union.
'.n' is used here as a metalanguage operator for selecting the n'th element of a tuple.
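The metalanguage operators just listed map directly onto ordinary programming constructs. The following Python sketch is an illustration only. The helper names sel and combine are hypothetical, and reifications are stood in for by plain strings. It shows the pattern < a.1 & b.1, a.2 | b.2 > that the graph-combination case relies on.

```python
from typing import FrozenSet, Tuple

# A pair < Boolean, set >, with strings standing in for reifications.
Meaning = Tuple[bool, FrozenSet[str]]


def sel(t: tuple, n: int):
    """'.n': select the n'th element of a tuple, counting from 1."""
    return t[n - 1]


def combine(a: Meaning, b: Meaning) -> Meaning:
    """< a.1 & b.1, a.2 | b.2 >: Boolean AND of truth values, union of sets."""
    return (sel(a, 1) and sel(b, 1), sel(a, 2) | sel(b, 2))


EMPTY: Meaning = (True, frozenset())     # < true, {} >

a: Meaning = (True, frozenset({"r1"}))
b: Meaning = (False, frozenset({"r2"}))
both = combine(a, b)
```

Because < true, {} > is the identity for this combination, combining any meaning with EMPTY leaves it unchanged.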
IG[s1] = IS[s1]
IG[g1 g2] = < IG[g1].1 & IG[g2].1, IG[g1].2 | IG[g2].2 >
IG[] = < true, {} >
IS[r1] = < true, { IR[r1] } >
IS[t1] = < IT[t1], {} >
IR[( n1 rdf:type rdf:Statement )
   ( n1 rdf:predicate p1 )
   ( n1 rdf:subject n2 )
   ( n1 rdf:object v1 ) ]
   = < IN[n1], IP[p1], IN[n2], IV[v1] >
IT[(n1 p1 v1)] = IP[p1](IN[n1], IV[v1])
IV[n1] = IN[n1]
IV[l1] = IL[l1]
IP[p] = some function of type (Nodes x Values -> Boolean)
assigned to 'p' by the interpretation.
IN[n] = some member of Nodes assigned to 'n' by the interpretation.
IL[l] = some member of Literals assigned to 'l' by the interpretation.
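The case enumeration above can be sketched as a small evaluator. The Python below is illustrative only and is not the author's implementation. It makes several assumptions. All class and method names are hypothetical. Members of Nodes and Literals are represented by strings. Values are tagged ("N", x) or ("L", x) to keep the union disjoint. The subject of a triple is interpreted by IN. The reification tuple is ordered as statement node, predicate, subject node, value, which is one reading of the 4-tuple described in section 3.1.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Union


@dataclass(frozen=True)
class Node:
    name: str


@dataclass(frozen=True)
class Literal:
    text: str


@dataclass(frozen=True)
class Triple:              # ( n1 p1 v1 )
    n: Node
    p: str
    v: Union[Node, Literal]


@dataclass(frozen=True)
class Reification:         # n1 describes the statement ( n2 p1 v1 )
    n1: Node
    p1: str
    n2: Node
    v1: Union[Node, Literal]


@dataclass
class Interpretation:
    nodes: Dict[str, str]                    # assignment behind IN
    literals: Dict[str, str]                 # assignment behind IL
    preds: Dict[str, Callable[..., bool]]    # assignment behind IP

    def IN(self, n): return self.nodes[n.name]
    def IL(self, l): return self.literals[l.text]
    def IP(self, p): return self.preds[p]

    def IV(self, v):       # tagged so that Nodes + Literals stays disjoint
        return ("N", self.IN(v)) if isinstance(v, Node) else ("L", self.IL(v))

    def IT(self, t):       # apply the predicate to subject and value
        return self.IP(t.p)(self.IN(t.n), self.IV(t.v))

    def IR(self, r):
        return (self.IN(r.n1), self.IP(r.p1), self.IN(r.n2), self.IV(r.v1))

    def IS(self, s):
        if isinstance(s, Reification):
            return (True, frozenset({self.IR(s)}))   # IS[r1]
        return (self.IT(s), frozenset())             # IS[t1]

    def IG(self, g):       # fold of IG[g1 g2], starting from IG[] = < true, {} >
        truth, reifs = True, frozenset()
        for s in g:
            t, r = self.IS(s)
            truth, reifs = truth and t, reifs | r
        return (truth, reifs)


# Hypothetical predicate: true for exactly one (subject, value) pair.
def author(subject, value):
    return (subject, value) == ("the-document", ("L", "Graham Klyne"))


interp = Interpretation(
    nodes={"ex:doc": "the-document", "ex:s1": "the-statement"},
    literals={"Klyne": "Graham Klyne", "Nobody": "Nobody"},
    preds={"ex:author": author},
)
graph = (
    Triple(Node("ex:doc"), "ex:author", Literal("Klyne")),
    Reification(Node("ex:s1"), "ex:author", Node("ex:doc"), Literal("Nobody")),
)
truth, reifs = interp.IG(graph)
```

In this example the reified statement would be false if asserted as a triple, yet the graph as a whole is still true: a reification contributes 'true' to the truth value and is only recorded in the set of reifications.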
....