Patrick,
I am using netscape composer to create some HTML, but I will leave it to
you to merge with your document. (I am not sure what editor you are using,
I believe Composer creates fairly simple HTML that should be straightforward
to merge).
- Add my section "Introduction to the Model-Theoretic Interpretation"
at section 2.2
- Add new section comparing TDL with the requirements (this replaces
the deleted text from the intro).
- Add my appendix "The Model Theoretic Interpretation" at end.
- Add my appendix "XML Schema Union Datatypes in RDF" at end.
- I wonder whether in 3.3 we should have more text explaining why TDL
has full compatibility and S does not. (e.g. "In TDL the local and global
typing mechanisms are the same: in the model theory the representation is
identical, the lexical-value pair. This can be contrasted with S where the
global idiom (idiom B) operates entirely within the lexical space and cannot
freely interoperate with the local idiom (idiom A) which operates principally
in the value space. In S, idiom A allows different lexicalizations
of the same value space (e.g. octal and decimal integers) to interoperate,
whereas in TDL such interoperability is not possible. S, idiom B, does not
prohibit such interoperability, but it is highly problematic.")
- Can we change the Model Theory reference to a more recent version? E.g.
http://lists.w3.org/Archives/Public/www-archive/2002Jan/att-0007/01-RDF_Model_Theory.htm
An Introduction to the Model Theory for TDL
TDL is formalized as changes to the existing RDF Model Theory.
This section gives a light-weight overview; the interested reader should
read appendix A for the full detail. XML Schema Union datatypes are omitted
from this section; see appendix B for how they are addressed.
Datatypes are viewed as in Patel-Schneider's work [OWL: URL:???]. That is,
each datatype has four components: a URI, a lexical space, a value space,
and a mapping.
An RDF interpretation is with respect to some set of datatypes, which corresponds
to the supported datatypes in an RDF implementation. xsd:string is the only
obligatory datatype, and acts as the default type.
Terminology
We modify the terminology of the Model Theory to differentiate between
literals before datatyping and literals after datatyping. The modification
is:
- We use the term "Unicode node" to refer to a node in the graph labelled
with a Unicode string.
- We use the term "literal-value pair" to refer to a pair consisting of
a Unicode string and a 'typed value'. The interesting literal-value pairs
are ones that belong to the mapping of some datatype.
- We do not use terminology such as "literal node" or "literal value".
- We refer to the set of datatypes used in an RDF interpretation as the
"supported datatypes".
The Interpretation of Unicode Nodes
An interpretation maps each Unicode node to some literal-value pair. The
Unicode string component is given by the label on the node. The type information
is checked by requiring this pair to be a member of each class associated
with this node (e.g. by a range constraint) and by understanding class membership
of datatype classes to refer to the mapping of the datatype. Note that for
technical reasons the 'typed value' of the interpretation of untyped Unicode
nodes is unrestricted, i.e. there is no default type.
The Interpretation of rdf:value
Following Graham Klyne's suggestion, rdf:value is simply equality.
The Interpretation of Asserted Triples
The biggest changes to the model theory are in the interpretation of triples.
Those with predicate rdf:value or rdf:type are both treated specially: rdf:value
is treated as equality, and rdf:type is aware of the supported datatypes and
treats each of them essentially as the map of the datatype (i.e. <s, rdf:type, d>
holds iff I(s) is a literal-value pair in the map of d).
For other triples the model theory is unchanged, although in the Universe
of interpretation the old literal values are now represented as literal-value
pairs, and hence the representation of triples with literal objects is slightly
different.
Multiple types
A literal-value pair may belong to multiple types, in which case a legal
RDF graph may show multiple type information for that literal-value pair,
using the local idiom, the global idiom, or both. Sometimes the intersection of
multiple types may be surprisingly small but not empty. For example, a binary
integer type and a positive decimal integer type may have intersection {
("0",0), ("1",1) }; either of these two literal-value pairs would be legal, but
a Unicode string "10" cannot be interpreted in the presence of such conflicting
type information, despite being in both lexical spaces and despite the two
value spaces being the same. (Contrast with S-B, which permits "10" in such
a case).
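The binary/decimal example can be checked mechanically. The following is a minimal sketch, assuming finite fragments of the two datatypes (values 0 to 15 only) and assuming, as the example in the text does, that the decimal type includes 0; the names binary_map, decimal_map and candidate_pairs are illustrative and not part of the model theory.

```python
# Finite fragments of two hypothetical datatypes, limited to the values 0..15.
BOUND = 16
binary_map = {(format(n, "b"), n) for n in range(BOUND)}   # e.g. ("10", 2)
decimal_map = {(str(n), n) for n in range(BOUND)}          # e.g. ("10", 10)


def candidate_pairs(label, *mappings):
    """Return the literal-value pairs for `label` that lie in every mapping."""
    common = set.intersection(*mappings)
    return {pair for pair in common if pair[0] == label}


# Pairs that belong to both types at once.
common_pairs = binary_map & decimal_map

# "10" is in both lexical spaces, but ("10", 2) and ("10", 10) differ,
# so no pair for "10" survives when both types are asserted.
conflict = candidate_pairs("10", binary_map, decimal_map)

# With only one type asserted, "10" has exactly one interpretation.
binary_only = candidate_pairs("10", binary_map)
```

Here common_pairs holds only the pairs for "0" and "1", and conflict is empty, which is the sense in which "10" cannot be interpreted under both types.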
Unsupported Datatypes
An RDF implementation only knows some datatypes, and in particular may
not be aware of a datatype used in a particular RDF document. The Model Theory
reflects this by having an interpretation with respect to some set of datatypes
(the supported datatypes). In practice, documents with an unsupported
datatype constrain the datatype (in that the lexical occurrences in the document
must be in the lexical space of the datatype), whereas supported datatypes
constrain the document (in that the document may be ill-formed in that the
Unicode nodes are labelled with strings that are not in the domain of the
relevant datatypes). The model theory is monotone with respect to the set
of supported datatypes, meaning that implementations supporting fewer datatypes
will make correct inferences but not all inferences (e.g. they will not infer
a contradiction when datatyping is invalid).
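A rough illustration of this monotonicity, assuming a deliberately simplified checker that only looks at lexical spaces: the function check_document, the ex:colour datatype and the sample document are all hypothetical. Adding a supported datatype can only add detected problems, never remove them.

```python
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"
EX_COLOUR = "http://example.org/dt#colour"  # hypothetical, unsupported datatype


def check_document(typed_nodes, supported):
    """Return the (label, datatype URI) pairs rejected by a supported datatype.

    typed_nodes: iterable of (label, datatype_uri) pairs found in a document.
    supported:   dict from datatype URI to its lexical space (a set of strings).
    Nodes typed with an unsupported datatype are never rejected.
    """
    return {
        (label, uri)
        for label, uri in typed_nodes
        if uri in supported and label not in supported[uri]
    }


document = [("10", XSD_INTEGER), ("abc", XSD_INTEGER), ("x1", EX_COLOUR)]

# An implementation supporting no datatypes finds no contradiction.
errors_none = check_document(document, {})

# Supporting a finite fragment of xsd:integer exposes the invalid "abc".
integer_fragment = {str(n) for n in range(100)}
errors_integer = check_document(document, {XSD_INTEGER: integer_fragment})
```

The smaller implementation's findings are a subset of the larger one's, matching the claim that fewer supported datatypes give correct but incomplete inferences.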
Appendix A: The Model Theory for TDL
Datatypes are viewed as in Patel-Schneider's work [OWL: URL:???]. That is,
each datatype d has four components:
- u(d): the URI reference
- L(d): the lexical space (a subset of the set of Unicode strings)
- V(d): the value space
- M(d): a subset of L(d) x V(d), such that there is at least one
pair in M(d) for each string of L(d), and at least one pair in M(d) for each
value in V(d).
Unlike previous work, the mapping is a relationship rather than a function.
This is specifically to accommodate XML Schema Union datatypes. A full discussion
of these is found in the next appendix. For all other datatypes the mapping
is a function. Each datatype is a resource and is found in the Universe of
interpretation.
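The four components and the two coverage conditions on M(d) can be written down directly. This is a sketch under the assumption that the spaces are small finite sets; the class name Datatype, its field names and the example URIs are illustrative only.

```python
from dataclasses import dataclass
from typing import Any, FrozenSet, Tuple

Pair = Tuple[str, Any]  # a literal-value pair: (Unicode string, value)


@dataclass(frozen=True)
class Datatype:
    """Hypothetical container for the four components of a datatype d."""
    uri: str                        # u(d)
    lexical_space: FrozenSet[str]   # L(d)
    value_space: FrozenSet[Any]     # V(d)
    mapping: FrozenSet[Pair]        # M(d): a relation, not necessarily a function

    def is_well_formed(self) -> bool:
        """M(d) lies within L(d) x V(d) and covers every string and every value."""
        strings = {s for s, _ in self.mapping}
        values = {v for _, v in self.mapping}
        return strings == self.lexical_space and values == self.value_space

    def is_functional(self) -> bool:
        """True when each string maps to exactly one value (the non-union case)."""
        return len({s for s, _ in self.mapping}) == len(self.mapping)


# A tiny integer-like datatype: "1" and "01" both denote the value 1.
small_int = Datatype(
    uri="http://example.org/dt#smallInt",
    lexical_space=frozenset({"0", "1", "01"}),
    value_space=frozenset({0, 1}),
    mapping=frozenset({("0", 0), ("1", 1), ("01", 1)}),
)

# A union-like datatype: "1" maps both to the integer 1 and to the string "1".
union_like = Datatype(
    uri="http://example.org/dt#intOrString",
    lexical_space=frozenset({"1"}),
    value_space=frozenset({1, "1"}),
    mapping=frozenset({("1", 1), ("1", "1")}),
)
```

Both examples are well formed, but only small_int is functional; union_like shows why the mapping has to be allowed to be a relation.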
An RDF interpretation is with respect to some possibly empty set, DT, of
datatypes. DT is a subset of IR, the set of resources.
We use a set IR of resources, the set U of Unicode strings and a set VL
of values. V(d) is a subset of VL for every d in DT. The Universe is IR
union ( U x VL ).
Terminology
- Unicode node: a node in the graph labelled with a Unicode string.
- literal-value pair: a pair in U x VL.
The Interpretation of Unicode Nodes
Each Unicode node is interpreted as a literal-value pair.
If E is labelled with u, then I(E) = (u,v) for some v in VL.
The Interpretation of Datatype URIs
If E is a uriref and the label of E=u(d) for some d in DT, then I(E) = d.
The Interpretation of Blank Nodes
The mapping A on blank nodes is unrestricted and a blank node can be interpreted
as any object in the Universe (including literal-value pairs).
The Interpretation of Asserted Triples
The function IEXT is modified as follows:
IEXT maps the set of properties IP into the powerset of ( Universe x Universe
).
IEXT(rdf:value) is the identity on the Universe.
For each d in DT, IEXT(rdf:type) contains the pair ( (unicode-string, value), d )
if and only if (unicode-string, value) is in the map associated with d.
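The two special cases of IEXT can be sketched as a membership test. This assumes finite datatype maps and represents all other extensions as an explicit set of triples; the function in_iext, the identifier ex:decimalInt and the parameter names are assumptions made for illustration.

```python
RDF_TYPE = "rdf:type"
RDF_VALUE = "rdf:value"


def in_iext(prop, subj, obj, datatype_maps, other_ext=frozenset()):
    """Decide whether the pair (subj, obj) is in IEXT(prop).

    datatype_maps: dict from each supported datatype d in DT to its map M(d).
    other_ext:     set of (prop, subj, obj) triples covering all other cases.
    """
    if prop == RDF_VALUE:
        # IEXT(rdf:value) is the identity on the Universe.
        return subj == obj
    if prop == RDF_TYPE and obj in datatype_maps:
        # ((string, value), d) is in IEXT(rdf:type) iff (string, value) is in M(d).
        return subj in datatype_maps[obj]
    return (prop, subj, obj) in other_ext


# Hypothetical finite fragment of a decimal integer datatype.
DECIMAL = "ex:decimalInt"
maps = {DECIMAL: {(str(n), n) for n in range(16)}}

typed_ok = in_iext(RDF_TYPE, ("10", 10), DECIMAL, maps)     # pair is in M(d)
typed_bad = in_iext(RDF_TYPE, ("10", 2), DECIMAL, maps)     # pair is not in M(d)
same = in_iext(RDF_VALUE, ("10", 10), ("10", 10), maps)     # identical objects
different = in_iext(RDF_VALUE, ("10", 10), ("10", 2), maps) # distinct objects
```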
Idiom P
A range constraint on a property p to the URI of a datatype d in DT implies
that:
- d is a class
- the objects of p "belong" to that class (using
ICEXT, which is defined in terms of IEXT(rdf:type), which is defined above
for d)
Hence the object of p must be interpreted as a literal-value pair
in the map of the datatype.
Idiom D
The interpretation of the blank node that is the subject of the rdf:value
triple is constrained to be the same as the interpretation of the Unicode node,
by the constraint on IEXT(rdf:value).
Moreover, this literal-value pair is required to be in the mapping of the datatype
by the interpretation of the rdf:type edge.
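Read operationally, idiom D leaves only those interpretations in which the blank node and the Unicode node denote the same pair and that pair is in M(d). The sketch below enumerates them for a finite VL; the function name idiom_d_interpretations and the sample datatype are hypothetical.

```python
def idiom_d_interpretations(label, mapping, values):
    """Enumerate the literal-value pairs that can interpret the blank node.

    label:   the Unicode string on the object of the rdf:value triple.
    mapping: M(d) for the datatype d named by the rdf:type triple.
    values:  the candidate values, a finite stand-in for VL.
    """
    results = set()
    for v in values:
        node = (label, v)      # I(Unicode node) = (label, v) for some v in VL
        blank = node           # IEXT(rdf:value) is the identity
        if blank in mapping:   # rdf:type requires membership in M(d)
            results.add(blank)
    return results


# Hypothetical finite fragment of a decimal integer datatype.
decimal_map = {(str(n), n) for n in range(16)}

valid = idiom_d_interpretations("10", decimal_map, range(16))
invalid = idiom_d_interpretations("abc", decimal_map, range(16))
```

An empty result, as for "abc", means no interpretation satisfies the graph. Idiom P reduces to the same membership test on the object of p, which is why the two idioms agree in TDL.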
Appendix B: Union Datatypes
In this approach the ordered preference of an XML Schema Union datatype is
not respected. When a string is in the domain of more than one of the types
in the union, that string is ambiguous unless further type information
disambiguates it. (We note that in XML Schema it is possible to overrule
the default type using an xsi:type attribute.) The RDF Model Theory is monotone
and hence does not accommodate the default mechanism inherent in XML Schema
Union datatypes. Moreover, a strong preference is given for using rdf:type
rather than xsi:type to disambiguate the union.
There is no requirement to disambiguate the union, and the value can
be left ambiguous.
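As an illustration of the ambiguity, consider a union of an integer type and a string type, modelled as the set union of their maps with the member order ignored. This is a sketch over small finite fragments; int_map, str_map and values_for are illustrative names.

```python
# Hypothetical finite fragments of the member datatypes.
int_map = {(str(n), n) for n in range(16)}     # ("10", 10)
str_map = {(s, s) for s in ["10", "abc"]}      # ("10", "10")

# The union's map is a relation: "10" is related to two different values.
union_map = int_map | str_map


def values_for(label, *mappings):
    """Values that `label` may denote when every given type is asserted."""
    common = set.intersection(*mappings)
    return {v for s, v in common if s == label}


ambiguous = values_for("10", union_map)             # both readings remain
resolved = values_for("10", union_map, int_map)     # extra rdf:type narrows it
unambiguous = values_for("abc", union_map)          # only one member accepts it
```

Asserting an additional rdf:type intersects the maps, which narrows "10" to the integer reading without relying on any default ordering.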