TDL Model Theory

Patrick,
I am using Netscape Composer to create some HTML, but I will leave it to you to merge with your document. (I am not sure what editor you are using; I believe Composer creates fairly simple HTML that should be straightforward to merge.)

Add my section "Introduction to the Model-Theoretic Interpretation" at section 2.2
Add new section comparing TDL with the requirements (this replaces the deleted text from the intro).
Add my appendix "The Model Theoretic Interpretation" at end.
Add my appendix "XML Schema Union Datatypes in RDF" at end.
I wonder whether in 3.3 we should have more text explaining why TDL has full compatibility and S does not. (e.g. "In TDL the local and global typing mechanisms are the same: in the model theory the representation is identical, the lexical-value pair. This can be contrasted with S, where the global idiom (idiom B) operates entirely within the lexical space and cannot freely interoperate with the local idiom (idiom A), which operates principally in the value space. In S, idiom A, working in the value space allows different lexicalizations of the same value space (e.g. octal and decimal integers) to interoperate, whereas in TDL such interoperability is not possible. S, idiom B, does not prohibit such interoperability, but it is highly problematic.")
Can we change the Model Theory reference to a more recent version? For example: http://lists.w3.org/Archives/Public/www-archive/2002Jan/att-0007/01-RDF_Model_Theory.htm

An Introduction to the Model Theory for TDL

TDL is formalized as changes to the existing RDF Model Theory.
This section gives a light-weight overview; the interested reader should read appendix A for the full detail. XML Schema Union datatypes are omitted from this section; see appendix B for how they are addressed.
Datatypes are viewed as in Patel-Schneider's work [OWL: URL:???]. That is, each datatype has four components: a URI, a lexical space, a value space, and a mapping.
An RDF interpretation is with respect to some set of datatypes, which corresponds to the supported datatypes in an RDF implementation. xsd:string is the only obligatory datatype, and acts as the default type.

Terminology

We modify the terminology of the Model Theory to differentiate between literals before datatyping and literals after datatyping. The modifications are:

We use the term "Unicode node" to refer to a node in the graph labelled with a Unicode string.
We use the term "literal-value pair" to refer to a pair consisting of a Unicode string and a 'typed value'. The interesting literal-value pairs are ones that belong to the mapping of some datatype.
We do not use terminology such as "literal node" or "literal value".
We refer to the set of datatypes used in an RDF interpretation as the "supported datatypes".

The Interpretation of Unicode Nodes

An interpretation maps each Unicode node to some literal-value pair. The Unicode string component is given by the label on the node. The type information is checked by requiring this pair to be a member of each class associated with this node (e.g. by a range constraint) and by understanding class membership of datatype classes to refer to the mapping of the datatype. Note that for technical reasons the 'typed value' of the interpretation of untyped Unicode nodes is unrestricted, i.e. there is no default type.

The Interpretation of rdf:value

Following Graham Klyne's suggestion, rdf:value is simply equality.

The Interpretation of Asserted Triples

The biggest changes to the model theory are in the interpretation of triples.
Triples with predicate rdf:value or rdf:type are treated specially: rdf:value is interpreted as equality, and rdf:type is aware of the supported datatypes and treats each one essentially as its mapping (i.e. <s, rdf:type, d> holds iff I(s) is a literal-value pair in the map of d).
For other triples the model theory is unchanged, although in the Universe of interpretation the old literal values are now represented as literal-value pairs, and hence the representation of triples with literal objects is slightly different.

Multiple types

A literal-value pair may belong to multiple types, in which case a legal RDF graph may give multiple pieces of type information for that literal-value pair, using the local idiom, the global idiom, or both. Sometimes the intersection of multiple types may be surprisingly small but not empty. For example, a binary integer type and a non-negative decimal integer type may have intersection { ("0",0), ("1",1) }; either of these two literal-value pairs would be legal, but a Unicode string "10" cannot be interpreted in the presence of such conflicting type information, despite being in both lexical spaces and despite the two value spaces being the same. (Contrast with S-B, which permits "10" in such a case.)
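
The binary/decimal example can be worked through mechanically. The following is only a sketch: the two mappings are toy stand-ins cut off at the values 0..15, and the names binary_map, decimal_map and candidate_pairs are made up for illustration.

```python
# Toy mappings restricted to the values 0..15. Each mapping is a set of
# (lexical form, value) pairs. The bound and all names are illustrative.
binary_map = {(format(n, "b"), n) for n in range(16)}   # ("10", 2), ("1010", 10), ...
decimal_map = {(str(n), n) for n in range(16)}          # ("10", 10), ("2", 2), ...

# A literal-value pair typed by both datatypes must lie in both mappings.
both = binary_map & decimal_map

def candidate_pairs(label, mappings):
    """Literal-value pairs with the given label that belong to every mapping."""
    common = set.intersection(*mappings)
    return {(s, v) for (s, v) in common if s == label}

print(sorted(both))                                       # [('0', 0), ('1', 1)]
print(candidate_pairs("10", [binary_map, decimal_map]))   # set()
```

The string "10" is in both lexical spaces, paired with 2 in one mapping and with 10 in the other, so no single literal-value pair satisfies both types.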

Unsupported Datatypes

An RDF implementation knows only some datatypes, and in particular may not be aware of a datatype used in a particular RDF document. The Model Theory reflects this by defining an interpretation with respect to some set of datatypes (the supported datatypes). In practice, a document that uses an unsupported datatype constrains the datatype (the lexical occurrences in the document must be in the lexical space of the datatype), whereas a supported datatype constrains the document (the document is ill-formed if its Unicode nodes are labelled with strings that are not in the domain of the relevant datatype). The model theory is monotone with respect to the set of supported datatypes, meaning that implementations supporting fewer datatypes will make correct inferences but not all inferences (e.g. they will not infer a contradiction when datatyping is invalid).
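
The effect of supporting fewer datatypes can be illustrated with a small checker. This is a sketch under assumptions: the URIs ex:decimalInteger and ex:binaryInteger, the lexical-space tests, and the function violations are all hypothetical, and a real implementation would work on a graph rather than a list of pairs.

```python
# Hypothetical datatype URIs mapped to toy lexical-space membership tests.
ALL_DATATYPES = {
    "ex:decimalInteger": lambda s: s.isascii() and s.isdigit(),
    "ex:binaryInteger": lambda s: s != "" and set(s) <= {"0", "1"},
}

def violations(typed_literals, supported):
    """Return the (label, datatype URI) pairs that a checker supporting only
    `supported` can show to be ill-typed. Unsupported URIs are not checked."""
    return [
        (label, uri)
        for label, uri in typed_literals
        if uri in supported and not supported[uri](label)
    ]

doc = [("12", "ex:binaryInteger"), ("7", "ex:decimalInteger")]

# An implementation that supports only the decimal type.
few = {"ex:decimalInteger": ALL_DATATYPES["ex:decimalInteger"]}

print(violations(doc, few))             # []
print(violations(doc, ALL_DATATYPES))   # [('12', 'ex:binaryInteger')]
```

The smaller implementation reports nothing wrong, which is incomplete but not incorrect: everything it does report is also reported by the larger one.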

Appendix A: The Model Theory for TDL

Datatypes are viewed as in Patel-Schneider's work [OWL: URL:???]. That is, each datatype d has four components:

u(d): the URI reference
L(d): the lexical space (a subset of the set of Unicode strings)
V(d): the value space
M(d): a subset of L(d) x V(d), such that there is at least one pair in M(d) for each string of L(d), and at least one pair in M(d) for each value in V(d).
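
The four components and the conditions on M(d) can be sketched as a small data structure. This is an illustration only: real lexical and value spaces are usually infinite, so finite sets are used as stand-ins, and the class Datatype, the function is_well_formed and the URI ex:toyBoolean are hypothetical names.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Datatype:
    uri: str              # u(d)
    lexical: frozenset    # L(d), a finite toy stand-in
    values: frozenset     # V(d), a finite toy stand-in
    mapping: frozenset    # M(d), a set of (string, value) pairs

def is_well_formed(d):
    """Check the three conditions placed on M(d)."""
    # M(d) is a subset of L(d) x V(d).
    in_product = all(s in d.lexical and v in d.values for s, v in d.mapping)
    # At least one pair for each string of L(d).
    covers_strings = d.lexical <= {s for s, _ in d.mapping}
    # At least one pair for each value in V(d).
    covers_values = d.values <= {v for _, v in d.mapping}
    return in_product and covers_strings and covers_values

# Hypothetical boolean-like datatype with two lexical forms per value.
toy_bool = Datatype(
    uri="ex:toyBoolean",
    lexical=frozenset({"true", "1", "false", "0"}),
    values=frozenset({True, False}),
    mapping=frozenset({("true", True), ("1", True), ("false", False), ("0", False)}),
)

print(is_well_formed(toy_bool))   # True
```

Dropping the pair for "0" from the mapping would leave a string of L(d) with no pair, and the check would then fail.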

Unlike previous work, the mapping is a relation rather than a function. This is specifically to accommodate XML Schema Union datatypes; a full discussion of these is found in the next appendix. For all other datatypes the mapping is a function. Each datatype is a resource and is found in the Universe of interpretation.
An RDF interpretation is with respect to some possibly empty set, DT, of datatypes. DT is a subset of IR, the set of resources.
We use a set IR of resources, the set U of Unicode strings and a set VL of values. V(d) is a subset of VL for every d in DT. The Universe is IR union ( U x VL ).

Terminology

Unicode node: a node in the graph labelled with a Unicode string.
literal-value pair: a pair in U x VL.

The Interpretation of Unicode Nodes

Each Unicode node is interpreted as a literal-value pair.
If E is labelled with u, then I(E) = (u,v) for some v in VL.

The Interpretation of Datatype URIs

If E is a uriref and the label of E is u(d) for some d in DT, then I(E) = d.

The Interpretation of Blank Nodes

The mapping A on blank nodes is unrestricted and a blank node can be interpreted as any object in the Universe (including literal-value pairs).

The Interpretation of Asserted Triples

The function IEXT is modified as follows:
IEXT maps the set of properties IP into the powerset of ( Universe x Universe ).
IEXT(rdf:value) is the identity relation on the Universe.

For each d in DT
IEXT(rdf:type) contains the pair ( (unicode-string, value), d )
if and only if (unicode-string, value) is in the map associated with d.
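
The two special cases of IEXT can be sketched as a truth test on triples whose terms have already been interpreted. The sketch makes simplifying assumptions: a datatype is represented by its URI string rather than as a resource in IR, the mapping is cut off at a finite range, and the names DECIMAL, MAPS and holds are hypothetical.

```python
# Toy setting: one supported datatype with a finite mapping.
DECIMAL = "ex:decimalInteger"
MAPS = {DECIMAL: {(str(n), n) for n in range(100)}}

def holds(subject, predicate, obj):
    """Truth of an already-interpreted triple, for the two properties that
    TDL treats specially."""
    if predicate == "rdf:value":
        # IEXT(rdf:value) is the identity on the Universe.
        return subject == obj
    if predicate == "rdf:type" and obj in MAPS:
        # ((string, value), d) is in IEXT(rdf:type) iff the pair is in M(d).
        return subject in MAPS[obj]
    raise NotImplementedError("other properties are not modelled in this sketch")

print(holds(("42", 42), "rdf:type", DECIMAL))   # True
print(holds(("42", 7), "rdf:type", DECIMAL))    # False
```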

Idiom P

A range constraint on a property p whose value is the URI of a datatype d in DT implies that:

d is a class
the objects of p "belong" to that class (using ICEXT, which is defined in terms of IEXT(rdf:type), which is defined above for d)

Hence the object of p must be interpreted as a literal-value pair in the map of the datatype.

Idiom D

The interpretation of the blank node that is the subject of the rdf:value triple is constrained to be the same as the interpretation of the Unicode node, by the constraint on IEXT(rdf:value).
Moreover, this literal-value pair is required to be in the mapping of the datatype by the interpretation of the rdf:type edge.
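
As a worked instance of Idiom D, consider a blank node with an rdf:value edge to the Unicode node "10" and an rdf:type edge to a binary integer datatype. The sketch below enumerates the interpretations that satisfy both constraints. It assumes a toy value set VL of 0..15, and the names BINARY_MAP and idiom_d_solutions are hypothetical.

```python
VL = range(16)                                    # toy value set
BINARY_MAP = {(format(n, "b"), n) for n in VL}    # hypothetical datatype mapping

def idiom_d_solutions(label, datatype_map, values):
    """Possible interpretations of the blank node _:b in
           _:b rdf:value "label" .
           _:b rdf:type d .
    """
    solutions = set()
    for v in values:
        unicode_node = (label, v)        # I(E) = (u, v) for some v in VL
        blank_node = unicode_node        # forced by IEXT(rdf:value) being identity
        if blank_node in datatype_map:   # forced by IEXT(rdf:type)
            solutions.add(blank_node)
    return solutions

print(idiom_d_solutions("10", BINARY_MAP, VL))   # {('10', 2)}
print(idiom_d_solutions("2", BINARY_MAP, VL))    # set()
```

An empty result means the graph has no interpretation with respect to this datatype, i.e. the datatyping is invalid.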

Appendix B: Union Datatypes

In this approach the ordered preference of an XML Schema Union datatype is not respected. When a string is in the domain of more than one of the types in the union, that string is ambiguous unless further type information disambiguates it. (We note that in XML Schema it is possible to overrule the default type using an xsi:type attribute.) RDF Model Theory is monotone and hence does not accommodate the default mechanism inherent in XML Schema Union datatypes. Moreover, a strong preference is given for using rdf:type rather than xsi:type to disambiguate the union.
There is no requirement to disambiguate the union, and the value can be left ambiguous.
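
The ambiguity, and its resolution by further type information, can be sketched as follows. This rests on an assumption not spelled out above: that the mapping of a union datatype is simply the union of the mappings of its member types. The member types are cut down to a few lexical forms, and the names INT_MAP, STR_MAP, UNION_MAP and readings are hypothetical.

```python
# Hypothetical member types, restricted to a few lexical forms.
INT_MAP = {("7", 7), ("12", 12)}
STR_MAP = {("7", "7"), ("12", "12"), ("abc", "abc")}

# Assumed construction: the union's mapping is the union of the member
# mappings, so it is a relation ("7" is related to both 7 and "7").
# The order of the member types plays no role.
UNION_MAP = INT_MAP | STR_MAP

def readings(label, *maps):
    """Values the label can denote when the pair must lie in every given map."""
    common = set.intersection(*maps)
    return {v for s, v in common if s == label}

ambiguous = readings("7", UNION_MAP)            # two readings: 7 and "7"
resolved = readings("7", UNION_MAP, INT_MAP)    # one reading: 7
```

Here the second call plays the role of an additional rdf:type on the same node: intersecting with the integer mapping leaves a single reading, while the first call shows the value left ambiguous.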