W3C

RDF Datatyping

W3C Working Draft [Last Modified: 2002/04/11 12:35:11]

This version:
http://www-nrc.nokia.com/sw/rdf-datatyping.html
Latest version:
http://www-nrc.nokia.com/sw/rdf-datatyping.html
Previous version:
None.
Editors:
Pat Hayes, University of West Florida, phayes@ai.uwf.edu
Sergey Melnik, Stanford University, melnik@db.stanford.edu
Patrick Stickler, Nokia Research Center, patrick.stickler@nokia.com

Abstract

The Resource Description Framework (RDF) is a general-purpose language for representing information in the World Wide Web. RDF provides a common framework for expressing this information in such a way that it can be exchanged between applications without loss of meaning. The utility and reliability of information exchanged between applications typically requires that datatyping information be unambiguous and that the interpretation of datatyped values, which may have local representations that differ from system to system, be consistent between disparate applications. Achieving consistency in the exchange and interpretation of such datatyped information requires a well defined and standardized methodology for expressing and interpreting datatyping information. This document defines a particular methodology for expressing datatyped information in RDF and aims to provide the reader the basic fundamentals required to effectively use datatypes and datatyped values with RDF in their particular applications.

Status of this Document

This is a W3C RDF Core Working Group Working Draft produced as part of the W3C Semantic Web Activity. This document incorporates decisions made by the Working Group designed to provide the reader the basic fundamentals required to effectively use datatyping with RDF in their particular applications.

This document is being released for review by W3C members and other interested parties to encourage feedback and comments. It reflects the current state of ongoing work on the RDF datatyping specification.

This is a draft document and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use it as reference material or to cite as other than "work in progress". A list of current W3C Recommendations and other technical documents can be found at http://www.w3.org/TR/.

Comments on this document are invited and should be sent to the public mailing list www-rdf-comments@w3.org. An archive of comments is available at http://lists.w3.org/Archives/Public/www-rdf-comments/.

Table of Contents

1. Introduction
  1.1 What is Datatyping?
  1.2 Desiderata for RDF Datatyping
  1.3 Related Documents
  1.4 Comments on the Examples
2. RDF Datatypes
  2.1 XML Schema: A Foundation for RDF Datatypes
    2.1.1 rdfd:Datatype
  2.2 Datatype Mapping
  2.3 Canonical Datatype Mapping
  2.4 Datatyped Literal
3. Designation of Datatyped Literals in RDF
  3.1 The Datatype Property Idiom
  3.2 The Lexical Form Idiom
    3.2.1 rdfd:lex
  3.3 The Inline Idiom
  3.4 Datatyping Constraints and Datatyped Properties
    3.4.1 rdfd:range
    3.4.2 Datatype Clashes
  3.5 RDF Datatyping and RDF Schema
    3.5.1 Domain and Range of Datatype Properties
    3.5.2 rdfd:range versus rdfs:range
    3.5.3 Datatype Classes and rdfs:subClassOf
    3.5.4 Datatype Properties and rdfs:subPropertyOf
    3.5.5 The Inline Idiom and rdfs:range
4. RDF Datatyping Model Theory
  4.1 Closure Rules
5. Levels of Interpretation
  5.1 Literal Graph Representation
  5.2 RDF Model Theory Interpretation
  5.3 RDF Datatyping Interpretation
  5.4 Extra-RDF Application Interpretation
6. RDF Schema for Datatyping
7. Appendices
  7.1 Use Cases
    7.1.1 DAML+OIL
    7.1.2 CC/PP
    7.1.3 Dublin Core
    7.1.4 ???
  7.2 RDF Datatyping and Complex (Structured) XML Datatypes
8. References
9. Acknowledgments


1. Introduction

The Resource Description Framework (RDF) is a general-purpose language for representing information in the World Wide Web. It is particularly intended for representing metadata about Web resources, such as the title, author, and modification date of a Web page, the copyright and syndication information about a Web document, the availability schedule for some shared resource, or the description of a Web user's preferences for information delivery. However, by generalizing the concept of a "Web resource", RDF can be used to represent information about anything that can be identified on the Web, such as information about items available from online shopping facilities (e.g., information about prices, publishers, and availability of books or recordings).

RDF provides a common framework for expressing this information in such a way that it can be exchanged between applications without loss of meaning. Since it is a common framework, application designers can leverage the availability of common RDF parsers and processing tools. Exchanging information between different applications means that the information may be made available to applications other than those for which it was originally created.

The utility and reliability of information exchanged between applications typically requires that datatyping information be unambiguous and that the interpretation of datatyped values, which may have local representations that differ from system to system, be consistent between disparate applications. Achieving consistency in the exchange and interpretation of such datatyped information requires a well defined and standardized methodology for expressing and interpreting datatyping information.

This document defines a particular methodology for expressing datatyped information in RDF and aims to provide the reader the basic fundamentals required to effectively use datatypes and datatyped values with RDF in their particular applications.

1.1 What is Datatyping?

...informal definition...common scenarios...blah blah blah...

...In RDF, URI References and blank nodes are both considered to be referring expressions; they are used to denote resources. Literals however are best thought of simply as syntactic 'labels' which indicate a lexical form. These lexical forms can be used to restrict the references of other nodes by using datatype schemes, but this use is optional. If a literal is used as a referring expression, it always refers to itself - that is, to a character string....

...

1.2 Desiderata for RDF Datatyping

Verbiage about desiderata...

Outline motivations and issues shaping the final solution...

Why datatyping and what matters...

The following list summarizes the specific desiderata that were taken into account during the development of this specification:

It is believed that the methodology for datatyping described in this specification satisfies all of the above desiderata.

1.3 Related Documents

The complete specification of RDF consists of a number of documents:

This document is intended to augment the other parts of the RDF specification, to help information system designers and application developers understand how datatypes and datatyping can be used with RDF.

1.4 Comments on the Examples

For the sake of brevity and clarity, XML entities (e.g. &rdf;) are used in the examples provided in this specification where URI References occur as attribute values. In addition, local and qualified names are used as node and arc labels in graph illustrations, even though the actual graph will contain complete URI References as labels.

The following RDF/XML 'wrapper' should be assumed for all RDF examples used in this specification:

<?xml version="1.0"?>
<!DOCTYPE rdf:RDF [
  <!ENTITY rdf  "http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <!ENTITY rdfs "http://www.w3.org/2000/01/rdf-schema#">
  <!ENTITY xsd  "http://www.w3.org/2001/XMLSchema#">
  <!ENTITY rdfd "http://www.w3.org/2002/rdf-datatyping#">
  <!ENTITY ex   "http://www.w3.org/2002/rdf-datatyping/examples#">
]>

<rdf:RDF xmlns:rdf  ="&rdf;"
         xmlns:rdfs ="&rdfs;"
         xmlns:xsd  ="&xsd;"
         xmlns:rdfd ="&rdfd;"
         xmlns:ex   ="&ex;">

   <!-- example -->

</rdf:RDF>

2. RDF Datatypes

2.1 XML Schema: A Foundation for RDF Datatypes

The conceptual framework for RDF datatyping presented in this specification is based on the type system defined by XML Schema for simple datatypes. RDF Datatyping does not officially provide support for XML Schema complex (structured) datatypes, though see the appendices for some suggestions.

2.1.1 rdfd:Datatype

Adopting the core XML Schema definition of simple datatypes, RDF Datatyping defines an rdfd:Datatype as consisting of (a) a set of distinct values, called its value space, (b) a set of lexical representations or forms, called its lexical space, and (c) a set of canonical lexical representations which is a subset of its lexical space, called its canonical lexical space. We further include two additional components (assumed by XML Schema), which we call (d) a datatype mapping and (e) a canonical datatype mapping, as part of an rdfd:Datatype.

In addition to having the characteristics defined above, an rdfd:Datatype may also serve as a property which joins a literal node object which is a member of the lexical space of that datatype (a lexical form) to a non-literal node subject which denotes the single member of the value space of that datatype (a datatype value) which is represented by the lexical form.

2.2 Datatype Mapping

A datatype mapping is a set of pairs whose first element belongs to the lexical space of the datatype, and the second element belongs to the value space of the datatype.

A datatype mapping satisfies the following properties:

  1. Each member of the lexical space maps to exactly one member of the value space.
  2. Each member of the value space has at least one lexical representation.

For example, the datatype mapping for the XML Schema simple datatype 'xsd:boolean', where each member of the value space (represented here as 'T' and 'F') has two lexical representations, is as follows:

Value Space {T, F}
Lexical Space {"0", "1", "true", "false"}
Datatype Mapping {<"true", T>, <"1", T>, <"0", F>, <"false", F>}
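
As an informal illustration (not part of this specification), the xsd:boolean mapping above can be written down as a set of pairs and the two properties checked mechanically. The names used here (BOOLEAN_MAPPING, is_datatype_mapping) are hypothetical, and the Python values True and False merely stand in for T and F.

```python
# Hypothetical sketch: the xsd:boolean datatype mapping as a set of
# <lexical form, value> pairs, with True/False standing in for T/F.
VALUE_SPACE = {True, False}
LEXICAL_SPACE = {"0", "1", "true", "false"}
BOOLEAN_MAPPING = {("true", True), ("1", True), ("0", False), ("false", False)}


def is_datatype_mapping(mapping, lexical_space, value_space):
    """Check the two properties of a datatype mapping."""
    # Every pair must join a lexical form to a member of the value space.
    if any(lex not in lexical_space or val not in value_space
           for lex, val in mapping):
        return False
    # Property 1: each lexical form maps to exactly one value.
    exactly_one = all(
        len({val for lex, val in mapping if lex == form}) == 1
        for form in lexical_space
    )
    # Property 2: each value has at least one lexical representation.
    at_least_one = all(
        any(val == value for _, val in mapping) for value in value_space
    )
    return exactly_one and at_least_one
```

Adding a pair such as <"1", F> to the mapping would break property 1, since "1" would then map to two values.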

2.3 Canonical Datatype Mapping

A canonical lexical space is a subset of members from the lexical space of a datatype such that there is a one-to-one mapping between members of the canonical lexical space and members of the value space.

A canonical datatype mapping is a subset of a datatype mapping that establishes this one-to-one correspondence between members of the canonical lexical space and members of the value space.

A canonical datatype mapping satisfies the following properties:

  1. Each member of the canonical lexical space maps to exactly one member of the value space.
  2. Each member of the value space has exactly one canonical lexical representation.

For example, the canonical datatype mapping for the XML Schema simple datatype 'xsd:boolean', where each member of the value space has a single (canonical) lexical representation, is as follows:

Value Space {T, F}
Canonical Lexical Space {"true", "false"}
Canonical Datatype Mapping {<"true", T>, <"false", F>}
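
The one-to-one requirement can be sketched in the same informal way. The function name is_canonical_mapping is an assumption made for this illustration only; True and False again stand in for T and F.

```python
# Hypothetical sketch: True/False stand in for the values T and F.
BOOLEAN_MAPPING = {("true", True), ("1", True), ("0", False), ("false", False)}
CANONICAL_MAPPING = {("true", True), ("false", False)}


def is_canonical_mapping(canonical, mapping):
    """Check that `canonical` is a one-to-one subset of `mapping`."""
    # A canonical mapping is a subset of the full datatype mapping.
    if not canonical <= mapping:
        return False
    forms = [lex for lex, _ in canonical]
    values = [val for _, val in canonical]
    # One-to-one: no lexical form and no value appears twice ...
    one_to_one = (len(set(forms)) == len(forms)
                  and len(set(values)) == len(values))
    # ... and every value of the full mapping has a canonical form.
    covers_all = set(values) == {val for _, val in mapping}
    return one_to_one and covers_all
```

The full xsd:boolean mapping fails this check, because each value has two lexical forms in it.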

2.4 Datatyped Literal

A datatyped literal is a pair where the first element is a URI Reference denoting a datatype and the second element is a lexical form (literal). Following from the nature of datatypes as defined above, this pairing of datatype and lexical form unambiguously identifies a specific member of a datatype mapping or canonical datatype mapping, and hence a specific member of the value space of the datatype.

A datatyped literal can be considered a "literal-in-context" where the datatype provides the context for interpretation of the lexical form (literal) to obtain an actual value.

For example, the datatyped literals which can be defined for the XML Schema simple datatype 'xsd:boolean' are as follows:

Datatyped Literal         Member of Datatype Mapping      Member of Value Space
                          Denoted by Datatyped Literal    Denoted by Datatyped Literal
<xsd:boolean, "true">     <"true", T>                     T
<xsd:boolean, "1">        <"1", T>                        T
<xsd:boolean, "false">    <"false", F>                    F
<xsd:boolean, "0">        <"0", F>                        F

RDF datatyping is primarily concerned with the implicit or explicit designation of datatyped literal pairings. RDF datatyping only provides for the designation of datatyped literals. The internal structure and semantics of all datatypes are opaque to RDF; i.e. membership of value and lexical spaces, datatype mappings, etc. have neither representation nor interpretation in RDF. Actual interpretation of datatyped literals (determination of the actual value denoted by the datatyped literal) is performed externally to RDF by applications which have sufficient knowledge of the particular datatypes in question. RDF datatyping only provides the datatype context within which such interpretation is to take place.
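
The division of labour described above might look as follows in an application layer. This is only a sketch under stated assumptions: the registry, the function value_of, and the choice of exceptions are all hypothetical and are not defined by RDF or by this specification.

```python
# Hypothetical application-side registry: datatype URI -> lexical-to-value
# function. RDF itself treats these datatypes as opaque.
XSD = "http://www.w3.org/2001/XMLSchema#"

BOOLEAN_L2V = {"true": True, "1": True, "false": False, "0": False}

REGISTRY = {XSD + "boolean": BOOLEAN_L2V.__getitem__}


def value_of(datatyped_literal):
    """Return the value denoted by a <datatype URI, lexical form> pair."""
    datatype, lexical_form = datatyped_literal
    try:
        l2v = REGISTRY[datatype]
    except KeyError:
        # The application has no knowledge of this datatype.
        raise LookupError("unknown datatype: " + datatype)
    try:
        return l2v(lexical_form)
    except KeyError:
        # The literal is outside the lexical space of the datatype.
        raise ValueError(repr(lexical_form) + " is not in the lexical space")
```

RDF supplies only the pair passed to value_of; everything inside the function is knowledge held by the application.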

3. Designation of Datatyped Literals in RDF

A datatyped literal may be designated in several ways in RDF, according to various idioms. Three such idioms are defined by this specification: one for local (explicit) datatyping and two for global (implicit) datatyping.

3.1 The Datatype Property Idiom

The simplest way to talk about the value of a literal under a datatype mapping is to provide a node to denote the value and link that node to the literal, using the name of the datatype as the property. For example:

<rdf:Description rdf:about="#John">
   <ex:age>
      <rdf:Description>
         <xsd:integer>25</xsd:integer>
      </rdf:Description>
   </ex:age>
</rdf:Description>

or, the equivalent contracted form

<rdf:Description rdf:about="#John">
   <ex:age xsd:integer="25"/>
</rdf:Description>

says that John's age is the value paired with (represented by) the lexical form (literal) in the datatype mapping defined for the datatype xsd:integer; i.e. that John's age is the number twenty-five.

The datatype property idiom also asserts that the literal object is a member of the lexical space of the datatype. The intuitive reading of the datatype property might be "... can be represented, according to this datatype mapping, by the character string ...". A datatype property statement is valid when the literal is a well-formed lexical form of the datatype, and the subject denotes the value of the lexical form under that datatype's lexical-to-value mapping. E.g.:

<rdf:Description rdf:about="#John">
   <ex:age>
      <rdf:Description>
         <xsd:integer>pumpkin</xsd:integer>
      </rdf:Description>
   </ex:age>
</rdf:Description>

or, the equivalent contracted form

<rdf:Description rdf:about="#John">
   <ex:age xsd:integer="pumpkin"/>
</rdf:Description>

would always be invalid, no matter what value is assigned to the blank node, as "pumpkin" is not a member of the lexical space of xsd:integer. This is the only way in which an RDF datatyping statement can be contradictory.

It is important to note that RDF cannot itself make such a determination of datatyping validity; such validation can only be performed by an external application with sufficient knowledge about the particular datatype in question. RDF merely provides means for the designation of the datatyped literal pairings upon which such validation would be performed.
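
A minimal sketch of such an external check for xsd:integer is given below. It assumes the XML Schema lexical space for integers (an optional sign followed by decimal digits); the function name in_integer_lexical_space is hypothetical.

```python
import re

# Lexical space of xsd:integer as assumed here: an optional sign
# followed by one or more decimal digits.
INTEGER_LEXICAL = re.compile(r"[+-]?[0-9]+")


def in_integer_lexical_space(lexical_form):
    """Return True if the literal is a well-formed xsd:integer form."""
    return INTEGER_LEXICAL.fullmatch(lexical_form) is not None
```

Under this check "25" is accepted and "pumpkin" is rejected, which is the determination that RDF itself cannot make.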

The datatype property idiom is the most 'local' style of literal datatyping in RDF; the interpretation imposed on the subject node by the datatype property is entirely 'inside' the triple. This means for example that the same literal can be used simultaneously in two different such triples, imposing different interpretations on two different nodes.

For example, in addition to the above statements about John's age expressed using the datatype xsd:integer, we could also say

<rdf:Description rdf:about="#Judy">
   <ex:payday>
      <rdf:Description>
         <xsd:gDay>25</xsd:gDay>
      </rdf:Description>
   </ex:payday>
</rdf:Description>

or, the equivalent contracted form

<rdf:Description rdf:about="#Judy">
   <ex:payday xsd:gDay="25"/>
</rdf:Description>

to assert that Judy receives her salary on the 25th day of each month, and both uses of the literal "25" can coexist in the same RDF graph without confusion because the datatype context within which the literal is interpreted is distinct.

Although the two property value nodes denote distinct values, the literal itself has the same meaning in both cases - which is simply the 'literal' string. It is the pairing of the lexical form and datatype together (the datatyped literal) which determines the particular value, not the literal itself. The literal itself only ever denotes the string.

Similarly, two different literal representations of the same value could be specified using either the same or even different but compatible datatype properties, all sharing the same subject:

...
   <rdf:Description>
      <xsd:integer>5</xsd:integer>
      <xsd:integer>00005</xsd:integer>
      <xsd:byte>05</xsd:byte>
   </rdf:Description>
...

Obviously, this only works when the literals do in fact map to the same value under the respective datatype mappings.
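
The condition can be checked by an application that knows both datatypes. The sketch below assumes simplified lexical-to-value functions for xsd:integer and xsd:byte (the latter restricted to -128..127); the names integer_l2v, byte_l2v and share_one_value are hypothetical.

```python
import re

_INTEGER = re.compile(r"[+-]?[0-9]+")


def integer_l2v(lexical_form):
    """Simplified lexical-to-value mapping for xsd:integer."""
    if not _INTEGER.fullmatch(lexical_form):
        raise ValueError(repr(lexical_form) + " is not an xsd:integer form")
    return int(lexical_form)


def byte_l2v(lexical_form):
    """Simplified lexical-to-value mapping for xsd:byte."""
    value = integer_l2v(lexical_form)
    if not -128 <= value <= 127:
        raise ValueError(repr(lexical_form) + " is out of range for xsd:byte")
    return value


def share_one_value(pairs):
    """pairs: (l2v function, lexical form) attached to one subject node."""
    return len({l2v(form) for l2v, form in pairs}) == 1
```

For the example above, share_one_value([(integer_l2v, "5"), (integer_l2v, "00005"), (byte_l2v, "05")]) holds, since all three pairings denote the number five.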

3.2 The Lexical Form Idiom

Sometimes one wishes to associate a literal with a value without specifying a particular datatype. RDF Datatyping provides a special property for this kind of underdetermined association, named rdfd:lex (datatype LEXical form).

3.2.1 rdfd:lex

The rdfd:lex property associates a literal node object which is a member of the lexical space of some (possibly unknown) datatype (a lexical form) with a non-literal node subject denoting the single member of the value space of the same datatype as the lexical form and which is represented by that lexical form.

The following

...
   <rdf:Description>
      <rdfd:lex>42</rdfd:lex>
   </rdf:Description>
...

simply asserts that there is a value which can be represented by the lexical form "42" under some possible datatype mapping. This does not in itself 'fix' the value, of course, but it can be used as a way of making the association between the value and a lexical form explicit, for later use or amplification. We will call this a lexical form triple. A useful way to think of the meaning of rdfd:lex is: "can be described by the lexical form".

3.3 The Inline Idiom

If one does not require or wish to have any explicit denotation of a datatype value in the RDF graph, one may simply define a property value to be a literal node which is presumed to correspond to a member of the lexical space of some datatype. This is called the 'inline' idiom, and is similar to the lexical form idiom in that it leaves the datatype context implicit, possibly to be asserted by a global rdfd:range constraint (see below). It differs from the lexical form idiom in that it provides no explicit denotation of the value whereas in the lexical form idiom the blank node denotes the actual datatype value. E.g.

<rdf:Description rdf:about="#Jane">
   <ex:age>25</ex:age>
</rdf:Description>

states that the value of the property called ex:age for the subject Jane is the two-character string "25". Note that it does not say that the value is the number twenty-five. There is no way to modify the meaning of a literal node.

3.4 Datatyping Constraints and Datatyped Properties

3.4.1 rdfd:range

It is often convenient to associate a datatype with a property, so that every use of the property can be understood as asserting particular datatyping characteristics about its value. Also, in the case of the implicit inline and lexical form idioms, one must have a mechanism for specifying the datatype context within which they are to be interpreted. RDF Datatyping defines the special constraint property rdfd:range for this purpose.

Note: The rdfd:range property is not to be confused with rdfs:range, which has a different meaning (see below).

The rdfd:range property imposes a datatyping constraint on its subject such that all values of the constrained property must correspond either to a literal node which is a member of the lexical space of the specified datatype (a lexical form), or to a non-literal node denoting a member of the value space of the specified datatype (a datatype value) to which is attached by means of either the rdfd:lex property or a datatype property a literal node which is a member of the lexical space of the specified datatype. In the absence of (or in addition to) a datatype property, this constraint also serves to provide the datatype context within which the lexical form is to be interpreted to determine the single datatype value represented by the lexical form.

If the object of an rdfd:range statement is not an rdfd:Datatype, then the statement is vacuous, and makes no assertion at all.

For example, we may wish to constrain the property ex:age so that its use and interpretation is bound to numerals as defined by the datatype xsd:integer:

<rdf:Description rdf:about="&ex;age">
   <rdfd:range rdf:resource="&xsd;integer"/>
</rdf:Description>

<rdf:Description rdf:about="#Jane">
   <ex:age>25</ex:age>
</rdf:Description>

Thus, the datatype context within which "25" is interpreted is xsd:integer, and "25" is required to be a valid member of the lexical space of xsd:integer. The rdfd:range assertion and the literal node together constitute the datatyped literal pairing <xsd:integer,"25"> which represents the number twenty-five. Note, however, that the actual value twenty-five has no explicit denotation in the graph when using the inline idiom, unlike the datatype property and lexical form idioms.

The rdfd:range assertion both provides information necessary for the proper interpretation of the above instance of the inline idiom as well as constrains the valid set of literals to the lexical space of the specified datatype.

To illustrate this, consider the following:

<rdf:Description rdf:about="&ex;age">
   <rdfd:range rdf:resource="&xsd;integer"/>
</rdf:Description>

<rdf:Description rdf:about="#Jane">
   <ex:age>Mid-Twenties</ex:age>
</rdf:Description>

which constitutes a datatype violation, because the rdfd:range assertion restricts the set of valid literal values to the lexical space of the particular datatype, and the literal "Mid-Twenties" is not a member of the lexical space of xsd:integer.

It is important to point out that only an extra-RDF application with complete knowledge about the datatype in question would be able to detect such a datatype violation. Datatypes are fully opaque to RDF and neither RDF nor RDF Schema provides generic means for datatype validation. RDF Datatyping provides mechanisms for the expression of datatyped literal pairings by specific idioms which have a well defined representation and interpretation, but cannot determine the validity of individual pairings directly. This is primarily due to RDF's role as a means of interchange between disparate systems: in order to achieve portability and independence of platform, it is necessary to forgo any native representation of values or native datatypes in RDF itself. RDF is datatype neutral in the same manner as it is vocabulary neutral. The specific semantics for individual datatypes must reside in the application layers above RDF.
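
As a rough sketch of what such an application layer might do for the inline idiom, the following code scans a set of triples for literals that break an rdfd:range constraint. The triple representation, the Literal type, the LEXICAL_CHECKS table and the function range_violations are all assumptions made for this illustration, and qualified names are used in place of full URI References.

```python
import re
from collections import namedtuple

# Hypothetical representation: triples are (subject, property, object)
# tuples and literal objects are wrapped in Literal.
Literal = namedtuple("Literal", "lexical_form")

RDFD_RANGE = "rdfd:range"

# Hypothetical table of lexical-space checks known to this application.
LEXICAL_CHECKS = {
    "xsd:integer": re.compile(r"[+-]?[0-9]+").fullmatch,
}


def range_violations(triples):
    """Return inline-idiom triples whose literal breaks an rdfd:range."""
    ranges = {}  # property -> set of datatypes asserted by rdfd:range
    for s, p, o in triples:
        if p == RDFD_RANGE:
            ranges.setdefault(s, set()).add(o)
    violations = []
    for s, p, o in triples:
        if not isinstance(o, Literal):
            continue
        for datatype in ranges.get(p, ()):
            check = LEXICAL_CHECKS.get(datatype)
            # Datatypes unknown to this application cannot be validated.
            if check is not None and not check(o.lexical_form):
                violations.append((s, p, o))
                break
    return violations
```

With ex:age constrained to xsd:integer, a statement whose object is the literal "Mid-Twenties" is reported, while one whose object is "25" is not.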

In a similar manner as for the inline idiom, an rdfd:range assertion also provides the datatyping context for the interpretation of the lexical form idiom:

<rdf:Description rdf:about="&ex;age">
   <rdfd:range rdf:resource="&xsd;integer"/>
</rdf:Description>

<rdf:Description rdf:about="#Judy">
   <ex:age>
      <rdf:Description>
         <rdfd:lex>25</rdfd:lex>
      </rdf:Description>
   </ex:age>
</rdf:Description>

As in the case of the inline idiom above, the datatype context within which "25" is interpreted is the datatype xsd:integer, and likewise "25" is required to be a valid member of the lexical space of xsd:integer. Again, the rdfd:range assertion and the literal node together constitute the datatyped literal pairing <xsd:integer,"25"> which represents the number twenty-five.

In addition, and just as with the datatype property idiom, the blank node which is the object of the ex:age property in this case is interpreted as denoting the datatype value; i.e. in this case twenty-five. In the presence of an rdfd:range assertion, the lexical form idiom has the same interpretation as the datatype property idiom.

3.4.2 Datatype Clashes

These extra datatype interpretations imposed on a property by rdfd:range apply to any such usage of the property anywhere in the RDF graph, so an rdfd:range assertion has a much wider 'scope' than a datatyping triple, and therefore needs to be used with care. For example, if several different literals are linked to a single node, then long-range datatyping can produce a conflict:

<rdf:Description rdf:about="&ex;age">
   <rdfd:range rdf:resource="&xsd;integer"/>
</rdf:Description>

<rdf:Description rdf:about="#Jane">
   <ex:age>
      <rdf:Description>
         <rdfd:lex>37</rdfd:lex>
         <rdfd:lex>29</rdfd:lex>
      </rdf:Description>
   </ex:age>
</rdf:Description>

The property value node here is required by the datatype triple to have two distinct values at the same time. This situation is called a datatype clash, and is best avoided.

Similarly, if two different rdfd:range assertions are made about the same property, then they both apply to it. E.g.

<rdf:Description rdf:about="&ex;age">
   <rdfd:range rdf:resource="&xsd;integer"/>
</rdf:Description>

<rdf:Description rdf:about="&ex;age">
   <rdfd:range rdf:resource="&xsd;duration"/>
</rdf:Description>

If the relevant datatypes have disjoint lexical spaces, or if their lexical-to-value maps fail to give the same values to a lexical form, then any use of the property with a literal is likely to produce a datatype clash. This requires particular care when merging information from different graphs which may have been written with different, and incompatible, conventions about literal datatyping.

Unless you are sure that the datatypes in use will not produce clashes, never use rdfd:lex with two different literals on the same blank node.
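
A datatype-aware application could look for such clashes along the following lines. This is a sketch only: the triple representation, the Literal type and the function clashing_nodes are hypothetical, qualified names stand in for full URI References, and the caller is assumed to supply a lexical-to-value function for each datatype it understands.

```python
from collections import namedtuple

# Hypothetical representation: literal objects are wrapped in Literal.
Literal = namedtuple("Literal", "lexical_form")


def clashing_nodes(triples, l2v_by_datatype):
    """Return value nodes that are required to denote two distinct values."""
    ranges = {}  # property -> set of datatypes asserted by rdfd:range
    for s, p, o in triples:
        if p == "rdfd:range":
            ranges.setdefault(s, set()).add(o)
    # Datatypes that apply to each value node via a constrained property.
    datatypes_of = {}
    for s, p, o in triples:
        if p in ranges and not isinstance(o, Literal):
            datatypes_of.setdefault(o, set()).update(ranges[p])
    # Values each node must denote, one per rdfd:lex literal and datatype.
    values_of = {}
    for s, p, o in triples:
        if p == "rdfd:lex" and isinstance(o, Literal):
            for datatype in datatypes_of.get(s, ()):
                l2v = l2v_by_datatype.get(datatype)
                if l2v is not None:  # skip datatypes we cannot interpret
                    values_of.setdefault(s, set()).add(l2v(o.lexical_form))
    return {node for node, values in values_of.items() if len(values) > 1}
```

A node carrying the lexical forms "37" and "29" under xsd:integer is reported as a clash, whereas a node carrying "25" and "025" is not, since both forms map to the same value.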

3.5 RDF Datatyping and RDF Schema

Overview of relationship between RDF Datatyping and RDF Schema...

3.5.1 Domain and Range of Datatype Properties

Datatype properties have exact domains and ranges. The domain of a datatype property corresponds to the value space of the datatype and the range of a datatype property corresponds to the lexical space of the datatype.

Normally in RDF Schema, an assertion about a range:

<rdf:Description rdf:about="#someProperty">
   <rdfs:range rdf:resource="#someClass"/>
</rdf:Description>

is understood to say that the precise range of someProperty is a subset of the class someClass. This allows RDF Schema to combine multiple range assertions coherently and reflects the fact that the language has no way to express a 'lower bound' on the membership in a class. However, for datatype properties, such an assertion is true only when someClass is the exact range of the property, no more and no less. This exact range is the lexical space of the datatype. Thus, the above range statement asserts that the RDF class someClass is precisely the set of lexical forms that are acceptable to the datatype property someProperty.

Similarly, ... (verbiage about domain) ...

<rdf:Description rdf:about="#someProperty">
   <rdfs:domain rdf:resource="#someClass"/>
</rdf:Description>

3.5.2 rdfd:range versus rdfs:range

Discuss similarities and differences between rdfd:range and rdfs:range -- particularly with regards to validation and genericity...

...The class extension of an rdfd:Datatype is the value space of the datatype...

...An rdfd:range assertion does not entail any rdfs:range assertion. If it did, the rdfs:range of the property ex:age would be the class xsd:integer and, because the class extension of an rdfd:Datatype is the value space of the datatype, any statement with a literal object would then be false, even if the literal is a member of the lexical space of the datatype in question.

...In particular, an rdfd:range assertion places no restrictions on the rdfs:range of the property. Although it would often be natural to consider the range of the property to be the lexical space of the datatype in the first case, and the value space of the datatype in the second, this should be asserted separately if the user wishes to make it explicit.

... We note that this convention uses datatype URI References both as properties and as class names. This is quite legal in RDF, and indeed there is a basic assumption which relates the two uses: the datatype class names the value space of the datatype, which is the domain of the datatype property (recall that properties are 'backwards' lexical-to-value maps); so the following is true for any rdfd:Datatype ddd:

<rdf:Description rdf:about="#ddd">
   <rdfs:domain rdf:resource="#ddd"/>
</rdf:Description>

To refer to the lexical space, use rdfs:range applied to the datatype property. For example, the following two triples would restrict the rdfs:range of ex:age to be a subset of the lexical space of the datatype:

<rdf:Description rdf:about="&xsd;integer">
   <rdfs:range rdf:resource="#x"/>
</rdf:Description>

<rdf:Description rdf:about="&ex;age">
   <rdfs:range rdf:resource="#x"/>
</rdf:Description>

and would therefore be suitable for use with the 'inline' idiom described in section 3.3 above; while

<rdf:Description rdf:about="&ex;age">
   <rdfs:range rdf:resource="&xsd;integer"/>
</rdf:Description>

asserts that the range of the property is restricted to the value space of the datatype, so would be suitable for use with the lexical triple or datatype triple idioms. However, to reiterate, the same rdfd:range assertions would be appropriate in either case.

3.5.3 Datatype Classes and rdfs:subClassOf

Discuss subclassing of datatypes, that subclass relations relate only to value spaces, not lexical spaces, etc....

3.5.4 Datatype Properties and rdfs:subPropertyOf

Discuss the special nature of datatyping properties and warn against creating subproperty relations with non-datatype properties...

3.5.5 The Inline Idiom and rdfs:range

Discuss the inherent incompatibility between the inline idiom and rdfs:range with suggestions of how to address it...

4. RDF Datatyping Model Theory

The RDF Model Theory explains the fundamental model-theoretic concepts like interpretation, universe, extension etc. used for interpreting the semantics of RDF graphs. This section assumes familiarity with these basic concepts.

Suppose I is an RDF interpretation of a graph E. Then I is datatyped (with respect to a set D of datatypes) if the following is true for any datatype URI Reference ddd (with I(ddd) in D):

(1) IEXT(I(ddd)) = {<y,x> : y = L2V(I(ddd))(x)} i.e. the inverse of the datatype (lexical form to value) mapping.

(2) ICEXT(I(ddd)) = {x : <x,y> in IEXT(I(ddd))} i.e. the value space of the datatype.

(3) For any literal "LLL", if E contains the triples

   <aaa, rdfd:range, ddd>
   <bbb, aaa, "LLL">

then L2V(I(ddd))("LLL") is defined; i.e. "LLL" is in the lexical space of I(ddd).

(4) For any literal "LLL", if E contains the triples

   <aaa, rdfd:range, ddd>
   <bbb, aaa, ccc>
   <ccc, rdfd:lex, "LLL">

then I(ccc) = L2V(I(ddd))("LLL") i.e. 'rdfd:lex' is restricted to have the same meaning as the datatype property.

4.1 Closure Rules

Rule  If the graph contains:          then add:

0                                     <rdfd:Datatype, rdf:type, rdfs:Class>
                                      <rdfd:Datatype, rdfs:subClassOf, rdf:Property>
                                      <rdfd:range, rdf:type, rdfs:ConstraintProperty>
                                      <rdfd:range, rdfs:domain, rdf:Property>
                                      <rdfd:range, rdfs:range, rdfd:Datatype>
                                      <rdfd:lex, rdf:type, rdf:Property>
                                      <rdfd:lex, rdfs:domain, rdfs:Resource>
                                      <rdfd:lex, rdfs:range, rdfs:Literal>

1     <ddd, rdf:type, rdfd:Datatype>  <ddd, rdfs:domain, ddd>
                                      <ddd, rdfs:subPropertyOf, rdfd:lex>

2     <aaa, rdfd:range, ddd>          <ddd, rdf:type, rdfd:Datatype>

3a    <aaa, rdfd:range, ddd>          <ccc, ddd, "LLL">
      <bbb, aaa, ccc>
      <ccc, rdfd:lex, "LLL">

3b    <aaa, rdfd:range, ddd>          <bbb, aaa, ccc>
      <bbb, aaa, "LLL">               <ccc, ddd, "LLL">

Note that closure rules cannot fully capture all of the semantic conditions defined herein. Most notably, they cannot express the constraint, imposed by rdfd:range and by datatype properties, that a literal be a member of the lexical space of the datatype in question.
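As an illustration only, closure rules 1, 2, and 3a can be sketched as a fixed-point computation over a set of triples. The sketch assumes that triples are Python tuples and that literals are wrapped in a hypothetical Literal class. It deliberately omits rule 0 (the unconditional triples) and rule 3b (which has to introduce a new node ccc), and it does not check membership in the lexical space, which, as noted above, closures cannot capture.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Literal:
    """Hypothetical wrapper marking a node as a literal."""
    lexical: str


def datatype_closure(graph):
    """Apply rules 1, 2 and 3a repeatedly until no new triples appear."""
    closed = set(graph)
    while True:
        new = set()
        for (s, p, o) in closed:
            # Rule 1: <ddd, rdf:type, rdfd:Datatype>
            if p == "rdf:type" and o == "rdfd:Datatype":
                new.add((s, "rdfs:domain", s))
                new.add((s, "rdfs:subPropertyOf", "rdfd:lex"))
            if p == "rdfd:range":
                aaa, ddd = s, o
                # Rule 2: <aaa, rdfd:range, ddd>
                new.add((ddd, "rdf:type", "rdfd:Datatype"))
                # Rule 3a: <bbb, aaa, ccc> and <ccc, rdfd:lex, "LLL">
                for (bbb, q, ccc) in closed:
                    if q != aaa or isinstance(ccc, Literal):
                        continue
                    for (c2, r, lit) in closed:
                        if c2 == ccc and r == "rdfd:lex" and isinstance(lit, Literal):
                            new.add((ccc, ddd, lit))
        if new <= closed:
            return closed
        closed |= new
```

Applied to a three-triple graph that uses the lexical form idiom with a datatype range of xsd:integer, this sketch adds the datatype triple for the intermediate node together with the three triples that rules 1 and 2 derive for xsd:integer.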

5. Levels of Interpretation

Discuss the different levels of interpretation on the graph provided by the MT, the datatyping idioms, and datatype aware applications...

5.1 Literal Graph Representation

The inline, datatype triple, and lexical form idioms; together with a datatype range constraint.

5.2 RDF Model Theory Interpretation

Under the RDF MT interpretation (with no datatyping semantics), the shared literal node that is the value of the ex:age property in the inline idiom, and of the xsd:integer and rdfd:lex properties in the other two idioms, denotes itself. The blank nodes that are the values of the ex:age property in the lexical form and datatype triple idioms each denote some non-literal resource.

5.3 RDF Datatyping Interpretation

The RDF Datatyping interpretation of the value of ex:age for all three idioms is the same, and is the datatyped literal pairing <xsd:integer, "25">. The value identified by the datatyped literal pairing (whatever that might be) is denoted by the blank nodes of the lexical form and datatype triple idioms but has no explicit denotation in the inline idiom.

5.4 Extra-RDF Application Interpretation

The extra-RDF application interpretation, which has full knowledge of the datatype xsd:integer, of the value of ex:age for all three idioms is the same, and is the number twenty-five. The value twenty-five is denoted by the blank nodes of the lexical form and datatype triple idioms but has no explicit denotation in the inline idiom.
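The last two levels can be sketched in code. This is a rough illustration under stated assumptions, not a prescribed algorithm: the subjects ex:a, ex:b, ex:c and the blank node labels are hypothetical, literals are wrapped in a hypothetical Literal class, and KNOWN_DATATYPES is an assumed application-side table that maps xsd:integer to Python's int.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Literal:
    """Hypothetical wrapper marking a node as a literal."""
    lexical: str


# Assumed application knowledge: how to turn a lexical form into a native value.
KNOWN_DATATYPES = {"xsd:integer": int}

RANGE = ("ex:age", "rdfd:range", "xsd:integer")

# The three idioms, each stating an ex:age of 25 (subjects are hypothetical).
INLINE = {RANGE, ("ex:a", "ex:age", Literal("25"))}
DATATYPE_TRIPLE = {("ex:b", "ex:age", "_:v1"),
                   ("_:v1", "xsd:integer", Literal("25"))}
LEXICAL_FORM = {RANGE, ("ex:c", "ex:age", "_:v2"),
                ("_:v2", "rdfd:lex", Literal("25"))}


def pairings(graph, prop):
    """RDF Datatyping level: datatyped literal pairings <datatype, lexical form>."""
    ranges = {o for (s, p, o) in graph if s == prop and p == "rdfd:range"}
    found = set()
    for (s, p, o) in graph:
        if p != prop:
            continue
        if isinstance(o, Literal):
            # Inline idiom: the datatype comes from the range constraint.
            found |= {(d, o.lexical) for d in ranges}
            continue
        for (s2, p2, o2) in graph:
            if s2 != o or not isinstance(o2, Literal):
                continue
            if p2 == "rdfd:lex":
                # Lexical form idiom: the datatype comes from the range constraint.
                found |= {(d, o2.lexical) for d in ranges}
            elif p2 in KNOWN_DATATYPES:
                # Datatype triple idiom: the property itself names the datatype.
                found.add((p2, o2.lexical))
    return found


def application_values(graph, prop):
    """Extra-RDF application level: native values for the pairings."""
    return {KNOWN_DATATYPES[d](lex) for (d, lex) in pairings(graph, prop)
            if d in KNOWN_DATATYPES}
```

In this sketch all three idioms produce the same pairing, <xsd:integer, "25">, and the same application-level value, the integer 25.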

6. RDF Schema for Datatyping

The following RDF Schema defines the ontology outlined above in its entirety. A machine-readable version can be found here.

<?xml version="1.0"?>
<!DOCTYPE rdf:RDF [
  <!ENTITY rdf  "http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <!ENTITY rdfs "http://www.w3.org/2000/01/rdf-schema#">
  <!ENTITY rdfd "http://www.w3.org/2002/rdf-datatyping#">
]>

<rdf:RDF xmlns:rdf="&rdf;"
         xmlns:rdfs="&rdfs;"
         xmlns:rdfd="&rdfd;">

   <rdfs:Class rdf:about="&rdfd;Datatype">
      <rdfs:label xml:lang="en">RDF Datatype (Property)</rdfs:label>
      <rdfs:comment xml:lang="en">
         An RDF Datatype consists of a value space, a lexical space,
         an optional canonical lexical space which is a subset of
         its lexical space, and an N:1 mapping from the lexical
         space to the value space. An RDF Datatype may also serve
         as a property which joins a literal node object which is
         a member of the lexical space of that datatype (a lexical
         form) to a non-literal node subject which denotes the
         single member of the value space of that datatype (a
         datatype value) which is represented by the lexical form.
      </rdfs:comment>
      <rdfs:subClassOf rdf:resource="&rdf;Property"/>
   </rdfs:Class>

   <rdfs:ConstraintProperty rdf:about="&rdfd;range">
      <rdfs:label xml:lang="en">RDF Datatype Range</rdfs:label>
      <rdfs:comment xml:lang="en">
         This property imposes a datatyping constraint on its
         subject property such that all values of the constrained
         property must correspond either to a literal node which
         is a member of the lexical space of the specified datatype
         (a lexical form), or to a non-literal node denoting a
         member of the value space of the specified datatype (a
         datatype value) to which is attached by means of either
         the rdfd:lex property or a datatype property a literal node
         which is a member of the lexical space of the specified
         datatype.  In the absence of (or in addition to) a datatype
         property, this constraint also serves to provide the
         datatype context within which the lexical form is to be
         interpreted to determine the single datatype value
         represented by the lexical form.
      </rdfs:comment>
      <rdfs:domain rdf:resource="&rdf;Property"/>
      <rdfs:range  rdf:resource="&rdfd;Datatype"/>
   </rdfs:ConstraintProperty>
   
   <rdf:Property rdf:about="&rdfd;lex">
      <rdfs:label xml:lang="en">RDF Datatype Lexical Form</rdfs:label>
      <rdfs:comment xml:lang="en">
         This property associates a literal node object which is
         a member of the lexical space of some (possibly unknown)
         datatype (a lexical form) with a non-literal node subject
         denoting the single member of the value space of the same
         datatype as the lexical form and which is represented by
         that lexical form.
      </rdfs:comment>
      <rdfs:domain rdf:resource="&rdfs;Resource"/>
      <rdfs:range  rdf:resource="&rdfs;Literal"/>
   </rdf:Property>

</rdf:RDF>

7. Appendices

The following appendices are non-normative...

7.1 Use Cases

Volunteers? ;-)

7.1.1 DAML+OIL

7.1.2 CC/PP

7.1.3 Dublin Core

7.1.4 ???

7.2 RDF Datatyping and Complex (Structured) XML Datatypes

Outline a methodology for associating datatypes with XML literals, in which a complex datatype is viewed in the same way as a simple datatype: its lexical space is the set of possible serializations conforming to the content model defined for the complex datatype, and its value space is the set of XML Infosets represented by those serializations. An XML literal (parseType="Literal") can thus be associated with the complex datatype in the same fashion as for simple datatypes, and with similar results (in fact, one might even argue that there is no real difference whatsoever ;-).

8. References

W3C RDF Core Working Group Charter, Mar 2001, http://www.w3.org/2001/sw/RDFCoreWGCharter

W3C RDF Primer, ??? 2002, http://www.w3.org/TR/2002/WD-rdf-primer-20020319/

W3C RDF Syntax, ??? 2002, http://www.w3.org/TR/rdf-syntax-grammar/

W3C RDF Test Cases, ??? 2002, http://www.w3.org/TR/rdf-testcases/

W3C RDF Model Theory, ??? 2002, http://www.w3.org/TR/rdf-mt/

W3C RDF Schema, ??? 2002, http://www.w3.org/2001/sw/RDFCore/Schema/20010913/

XML Schema Part 2: Datatypes, ??? 2001, http://www.w3.org/TR/xmlschema-2/

DAML+OIL..., ??? 200?, http://???

OWL..., ??? 200?, http://???

CC/PP..., ??? 200?, http://???

9. Acknowledgments

This document has benefited from the input of many members of the RDF Core Working Group. Particular thanks to Jeremy Carroll, Dan Connolly, Martyn Horner, Graham Klyne, and Frank Manola for their contributions during the development of the RDF Datatyping specification. Special thanks to Graham Klyne for his contributions to the section on RDF desiderata.

