TDL Model Theory

Patrick,
I am using Netscape Composer to create some HTML, but I will leave it to you to merge with your document. (I am not sure what editor you are using; I believe Composer creates fairly simple HTML that should be straightforward to merge.)

I suggest the following edits to your document. My text is not yet ready, but an early version is below.

In the introduction

When type information is omitted the Model Theory for TDL captures the ambiguous typing of the Perl programming idiom (PL).

Delete text (the same content will be addressed in a new section). I find this too aggressive because it comes so early in the document.

It should be noted up front that the TDL datatyping scheme

does not require modification to the present RDF graph model
does not require modification to the present RDF/XML serialization
does not require modification to the present N3 notation
does not require modification to the present NTriples notation
reflects common RDF usage, adopting popular idioms already in use
reflects common XML Schema usage for the typing of literals
provides for both global (implicit) and local (explicit) typing concurrently

Modify text on TDL & Model Theory

In accordance with [RDF MT], the primary RDF syntax used in the TDL scheme is based on tidy graphs (a tidy graph is one in which no two nodes carry the same label). The interpretation of each literal is assumed fixed and determined by its content. (For example, the interpretation of literals could be defined as an identity mapping.)

Delete appendices A&B. This material is not needed for TDL and may act as an obstacle to other members of the group supporting TDL. I suggest that if TDL is accepted then these appendices could form a Technical Note or something like that. (Also, I have appendices to add.)
Add my section "Introduction to the Model-Theoretic Interpretation" at section 2.2
Add new section comparing TDL with the requirements (this replaces the deleted text from the intro).
Add my appendix "The Model Theoretic Interpretation" at end.
Add my appendix "XML Schema Union Datatypes in RDF" at end.
I wonder whether in 3.3 we should have more text explaining why TDL has full compatibility and S does not. (e.g. "In TDL the local and global typing mechanisms are the same: in the model theory the representation is identical, the lexical-value pair. This can be contrasted with S where the global idiom (idiom B) operates entirely within the lexical space and cannot freely interoperate with the local idiom (idiom A) which operates principally in the value space. In S, idiom A, this allows different lexicalizations of the same value space (e.g. octal and decimal integers) to interoperate, whereas in TDL such interoperability is not possible. S, idiom B, does not prohibit such interoperability, but is highly problematic.")
Can we change the Model Theory reference to a more recent version, e.g. http://lists.w3.org/Archives/Public/www-archive/2002Jan/att-0007/01-RDF_Model_Theory.htm ?

An Introduction to the Model Theory for TDL

TDL is formalized as changes to the existing RDF Model Theory.
This section gives a light-weight overview; the interested reader should read appendix A for the full detail. XML Schema Union datatypes are omitted from this section; see appendix B for how they are addressed.
Datatypes are viewed as in Patel-Schneider's work [OWL: URL:???]. That is, each datatype has four components: a URI, a lexical space, a value space, and a mapping.
An RDF interpretation is with respect to some set of datatypes, which corresponds to the supported datatypes in an RDF implementation. xsd:string is the only obligatory datatype, and acts as the default type.
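
As a rough illustration of the four components, a datatype can be sketched as a small record. This is only a sketch: the class name Datatype, the helper pair_for, and the simplified integer lexical space (optional sign plus ASCII digits) are my own assumptions, not part of the model theory.

```python
import re
from dataclasses import dataclass
from typing import Any, Callable, Optional, Tuple

XSD = "http://www.w3.org/2001/XMLSchema#"


@dataclass(frozen=True)
class Datatype:
    """Hypothetical record holding the four components of a datatype."""
    uri: str                                 # the datatype's URI
    in_lexical_space: Callable[[str], bool]  # membership test for the lexical space
    in_value_space: Callable[[Any], bool]    # membership test for the value space
    to_value: Callable[[str], Any]           # the mapping (a function in this sketch)

    def pair_for(self, lexical: str) -> Optional[Tuple[str, Any]]:
        """Return the (string, value) pair in the mapping, or None if the
        string is outside the lexical space."""
        if not self.in_lexical_space(lexical):
            return None
        return (lexical, self.to_value(lexical))


# xsd:string: every Unicode string is in the lexical space and maps to itself.
xsd_string = Datatype(XSD + "string",
                      lambda s: True,
                      lambda v: isinstance(v, str),
                      lambda s: s)

# Simplified integer datatype (assumption: optional sign, ASCII digits only).
_INT = re.compile(r"[+-]?[0-9]+")
xsd_integer = Datatype(XSD + "integer",
                       lambda s: _INT.fullmatch(s) is not None,
                       lambda v: isinstance(v, int) and not isinstance(v, bool),
                       int)

# The supported datatypes of an interpretation; xsd:string is always present.
supported = {dt.uri: dt for dt in (xsd_string, xsd_integer)}
```

With these definitions, xsd_integer.pair_for("10") gives the pair ("10", 10), while xsd_integer.pair_for("ten") gives None because "ten" is outside the lexical space.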

Terminology

We modify the terminology of the Model Theory to differentiate between literals before datatyping and literals after datatyping. The modification is:

We use the term "Unicode node" to refer to a node in the graph labelled with a unicode string.
We use the term literal-value pair to refer to a pair consisting of a unicode string and a value from the value space of some datatype. The only interesting literal-value pairs are ones that belong to the mapping of some datatype.
We do not use terminology such as "literal node" or "literal value".
We refer to the set of datatypes used in an RDF interpretation as the "supported datatypes".

The Interpretation of Unicode Nodes

An interpretation maps each Unicode node to some literal-value pair, of some datatype. We know there is always at least one such pair because xsd:string is supported. The type information is checked by requiring this pair to be a member of each class associated with this node (e.g. by a range constraint) and by understanding class membership of datatype classes to refer to the mapping of the datatype.
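
The following sketch shows one way to read this: collect every pair that some supported datatype offers for the node's label, then keep only those that lie in the mapping of every datatype class the node is required to belong to. The names SUPPORTED and candidate_pairs, and the simplified integer syntax, are assumptions made for the example.

```python
import re

# Toy supported datatypes: each maps a string to a value, or to None when the
# string is outside the lexical space. The integer syntax is simplified.
SUPPORTED = {
    "xsd:string": lambda s: s,
    "xsd:integer": lambda s: int(s) if re.fullmatch(r"[+-]?[0-9]+", s) else None,
}


def candidate_pairs(label, required_types=()):
    """Return the literal-value pairs a Unicode node labelled `label` may denote.

    required_types names the datatype classes the node must be a member of,
    for example because of a range constraint on the property pointing at it.
    """
    pairs = set()
    for mapping in SUPPORTED.values():
        value = mapping(label)
        if value is not None:
            pairs.add((label, value))
    # Class membership of a datatype class means membership of its mapping.
    for name in required_types:
        mapping = SUPPORTED[name]
        pairs = {(u, v) for (u, v) in pairs if mapping(u) == v}
    return pairs


untyped = candidate_pairs("10")                    # string and integer readings
typed = candidate_pairs("10", ["xsd:integer"])     # only the integer reading
invalid = candidate_pairs("ten", ["xsd:integer"])  # no reading at all
```

An empty result corresponds to a node that has no interpretation under the given type information.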

The Interpretation of rdf:value

Following Graham Klyne's suggestion, rdf:value is simply equality.

The Interpretation of Asserted Triples

Asserted triples are interpreted with respect to the function IEXT. However, the range of IEXT is extended to permit any pair of objects from the Universe.
IEXT is then restricted to respect rdf:value as equality and to encode the supported datatypes.

i.e. IEXT(rdf:value) is the identity on the universe.
Further, if d is a datatype, then
IEXT(rdf:type) contains the pair ( (unicode-string, value), d )
if and only if (unicode-string, value) is in the map associated with d.

IEXT is also required to be neutral with respect to the lexical space on all other properties.
i.e.
if (u1,v) and (u2,v) are two literal-value pairs in the universe, r is a resource in IR, p is a property in IP-{rdf:type,rdf:value}, and both literal-value pairs satisfy the range constraints on p, then:

( r, (u1,v) ) is in IEXT(p) iff ( r, (u2,v) ) is in IEXT(p)
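
The neutrality condition can be checked mechanically on a finite toy extension. This is a sketch only: the function name is_lexically_neutral and the sample data are hypothetical, and the extension of a single property is represented as a plain Python set of (subject, (string, value)) tuples.

```python
def is_lexically_neutral(iext_of_property, pairs_in_range):
    """Check neutrality with respect to the lexical space for one property.

    iext_of_property: set of (subject, (string, value)) tuples.
    pairs_in_range: the literal-value pairs in the universe that satisfy the
    range constraints on the property.
    """
    for subject, (u1, v1) in iext_of_property:
        if (u1, v1) not in pairs_in_range:
            continue  # the condition only speaks about pairs within the range
        for (u2, v2) in pairs_in_range:
            # Every pair with the same value must be related to the same subject.
            if v2 == v1 and (subject, (u2, v2)) not in iext_of_property:
                return False
    return True


# Two lexical forms of the value 7, and one form of the value 8.
in_range = {("7", 7), ("07", 7), ("8", 8)}

partial = {("ex:r", ("7", 7))}                       # omits the "07" form
complete = {("ex:r", ("7", 7)), ("ex:r", ("07", 7))}

partial_ok = is_lexically_neutral(partial, in_range)
complete_ok = is_lexically_neutral(complete, in_range)
```

Here partial_ok is False and complete_ok is True: once one lexical form of a value is related to a subject, all forms of that value within the range must be.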

So while this differs from previous versions of the model theory in that triples with literals as object are interpreted with a literal-value pair as object, such literal-value pairs are to be understood as typed data values.

Multiple types

A literal-value pair may belong to multiple types, in which case a legal RDF graph may show multiple type information for that literal-value pair, using the local idiom, the global idiom, or both. Sometimes the intersection of multiple types may be surprisingly small but not empty. For example, a binary integer type and a positive decimal integer type may have intersection { ("0",0), ("1",1) }; either of these two literal-value pairs would be legal, but a Unicode string "10" cannot be interpreted in the presence of such conflicting type information, despite being in both lexical spaces and despite the two value spaces being the same. (Contrast with S-B, which permits "10" in such a case.)
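
The binary/decimal example can be worked through on a finite sample. The sketch assumes that neither lexical space allows leading zeros and that the decimal type includes zero (as the intersection in the example implies); the function names are hypothetical.

```python
import re


def binary_map(s):
    """Toy binary integer type: digits 0 and 1, no leading zeros (assumption)."""
    return int(s, 2) if re.fullmatch(r"0|1[01]*", s) else None


def decimal_map(s):
    """Toy decimal integer type: ASCII digits, no leading zeros (assumption)."""
    return int(s) if re.fullmatch(r"0|[1-9][0-9]*", s) else None


def mapping_pairs(mapping, labels):
    """The literal-value pairs that `mapping` contributes for the given labels."""
    return {(s, mapping(s)) for s in labels if mapping(s) is not None}


# A finite sample of labels: decimal and binary spellings of 0..199.
labels = {str(n) for n in range(200)} | {format(n, "b") for n in range(200)}

binary_pairs = mapping_pairs(binary_map, labels)
decimal_pairs = mapping_pairs(decimal_map, labels)

# Pairs that belong to both types at once.
both = binary_pairs & decimal_pairs
```

On this sample, both contains only ("0", 0) and ("1", 1). The string "10" is in both lexical spaces, but it yields ("10", 2) under the binary type and ("10", 10) under the decimal type, so no single pair satisfies both.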

Unsupported Datatypes

An RDF implementation knows only some datatypes, and in particular may not be aware of a datatype used in a particular RDF document. The Model Theory reflects this by defining an interpretation with respect to some set of datatypes (the supported datatypes). The only obligatory datatype is xsd:string. In practice, a document that uses an unsupported datatype constrains the datatype (the lexical occurrences in the document must be in the lexical space of the datatype), whereas a supported datatype constrains the document (the document is ill-formed if a Unicode node is labelled with a string that is not in the domain of the relevant datatype). The model theory is monotone with respect to the set of supported datatypes, meaning that implementations supporting fewer datatypes will make correct inferences but not all inferences (e.g. they will not infer a contradiction when datatyping is invalid).
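
A small sketch of the monotonicity claim: a checker that supports fewer datatypes reports a subset of the errors reported by one that supports more. The checker datatype_errors, the table KNOWN, and the datatype name ex:date are all hypothetical, and the integer syntax is simplified.

```python
import re

# Lexical-space tests for the datatypes some implementation could support.
KNOWN = {
    "xsd:string": lambda s: True,
    "xsd:integer": lambda s: re.fullmatch(r"[+-]?[0-9]+", s) is not None,
}


def datatype_errors(typed_labels, supported):
    """Return the (label, datatype) pairs that can be rejected when only the
    datatypes named in `supported` are understood."""
    errors = []
    for label, datatype in typed_labels:
        # An unsupported datatype cannot be checked, so it never yields an error.
        if datatype in supported and not KNOWN[datatype](label):
            errors.append((label, datatype))
    return errors


document = [
    ("10", "xsd:integer"),
    ("ten", "xsd:integer"),    # invalid, but only detectable with xsd:integer
    ("ten", "xsd:string"),
    ("2002-01-01", "ex:date"), # ex:date is unsupported in both runs below
]

fewer = datatype_errors(document, {"xsd:string"})
more = datatype_errors(document, {"xsd:string", "xsd:integer"})
```

The run with only xsd:string finds nothing wrong, while the run that also supports xsd:integer rejects ("ten", "xsd:integer"). Nothing reported by the smaller set is contradicted by the larger one.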

An Introduction to the Model Theory for TDL

TDL is formalized as changes to the existing RDF Model Theory.
Datatypes are viewed as in Patel-Schneider's work [OWL: URL:???]. That is, each datatype has four components: a URI, a lexical space, a value space, and a mapping. Unlike previous work, the mapping is a relation rather than a function. This is specifically to accommodate XML Schema Union datatypes. For all other datatypes the mapping is a function. Each datatype is a resource and is found in the Universe of interpretation.
An RDF interpretation is with respect to some set of datatypes, minimally containing xsd:string, which can be viewed as the default datatype.
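
To illustrate why a union datatype needs a relation, the sketch below assumes that the mapping of a union is simply the union of its members' mappings. That is one reading of the text above, not a statement of how XML Schema resolves unions. All function names are hypothetical.

```python
import re


def integer_pairs(s):
    """Pairs contributed by a simplified integer datatype for the string s."""
    return {(s, int(s))} if re.fullmatch(r"[+-]?[0-9]+", s) else set()


def string_pairs(s):
    """Pairs contributed by the string datatype: every string maps to itself."""
    return {(s, s)}


def union_pairs(s, members):
    """Assumed mapping of a union datatype: the union of its members' pairs."""
    result = set()
    for member in members:
        result |= member(s)
    return result


def is_functional(pairs):
    """True when no string is paired with more than one value."""
    strings = [u for (u, _) in pairs]
    return len(strings) == len(set(strings))


ten = union_pairs("10", [integer_pairs, string_pairs])
word = union_pairs("ten", [integer_pairs, string_pairs])
```

Under this assumption the string "10" is related to both the integer 10 and the string "10", so the union's mapping is not a function, whereas each member's mapping on its own is.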

Terminology

We modify the terminology of the Model Theory to differentiate between literals before datatyping and literals after datatyping. The modification is:

We use the term "Unicode node" to refer to a node in the graph labelled with a unicode string.
We use the term literal-value pair to refer to a pair consisting of a unicode string and a value from the value space of some datatype. The only interesting literal-value pairs are ones that belong to the mapping of some datatype.
We do not use terminology such as "literal node" or "literal value".
We refer to the set of datatypes used in an RDF interpretation as the "supported datatypes".

The Interpretation of Unicode Nodes

Each Unicode node is interpreted as a literal-value pair. The literal-value pair must occur in the map of some datatype. (Hence the requirement that xsd:string is in the set of datatypes: this ensures that there is at least one possible interpretation of every Unicode node.) The Unicode string component of the literal-value pair is the label of the Unicode node. If there is no type information available for a Unicode node, it can hence be interpreted according to any of the supported datatypes, as long as the Unicode string is in the lexical space of the datatype. In this way, TDL formalises the PL proposal.

The Universe of an Interpretation

The Universe is formed by the union of:

IR, the set of resources, which is a superset of the set of datatypes.
the value space of each datatype
the mapping of each datatype.

i.e. the Universe contains resources, typed data values, and literal-value pairs.
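
A finite toy version of this construction is sketched below. It is an illustration only: resources are represented by name strings, each mapping is a small hand-written set of pairs, and each value space is approximated by the values that occur in its mapping. The function name universe and the resource ex:doc are hypothetical.

```python
# IR: the resources, which include the datatypes themselves.
resources = {"ex:doc", "xsd:string", "xsd:integer"}

# A finite fragment of each datatype's mapping, as (string, value) pairs.
mappings = {
    "xsd:string": {("10", "10"), ("ten", "ten")},
    "xsd:integer": {("10", 10)},
}


def universe(resources, mappings):
    """Union of the resources, the typed data values, and the literal-value pairs."""
    # Value spaces, approximated here by the values seen in the mappings.
    values = {v for pairs in mappings.values() for (_, v) in pairs}
    # All literal-value pairs from all mappings.
    pairs = set().union(*mappings.values())
    return set(resources) | values | pairs


U = universe(resources, mappings)
```

In this toy universe the integer 10, the pair ("10", 10), and the datatype xsd:integer are three distinct members, matching the three kinds of object listed above.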

The Interpretation of Datatype URIs

The interpretation mapping IS is restricted to mapping any datatype URI in V to the corresponding datatype in IR. That is, a datatype URI does identify a datatype.

The Interpretation of Asserted Triples

( r1, (u1,v) ) is in IEXT(r2) iff (r1, (u2, v) ) is in IEXT(r2)