RDF datatyping desiderata

Graham Klyne


This note is an informal discussion document of the W3C RDF core working group, excerpted and updated from section 1.2 of Sergey Melnik's document [1] on RDF datatyping proposals. Its purpose is to list a number of criteria that may be used to evaluate alternative proposals for datatyping in RDF, with particular concern for the value types represented by literals in RDF.

Desiderata for RDF Datatyping

  1. Backward compatibility

  2. Ability to use built-in primitive XML Schema datatypes

  3. Ability to use non-XML-Schema datatypes

  4. Ability to define datatypes using schema languages rather than relying on "built-in" datatypes

  5. Ability to represent type information without an associated RDF schema

  6. Ability to reference type information in an associated RDF schema

  7. Co-existence of "global" and "local" typing mechanisms

  8. Provide account of datatyping scheme semantics

  9. Support for existing data typing idioms


1. Backward compatibility

The goal here is that existing use of RDF, RDF-handling software and RDF-based specifications will continue to be valid, and (as far as possible) produce results as intended by their authors.

2. Use of XML Schema datatypes

The datatyping proposal should provide an account of how the XML Schema Built-in Primitive Datatypes [3] can be used with RDF.

The datatyping proposal should also be able to account for the use of XML Schema datatypes derived from the built-in primitive types (i.e. all instances of anySimpleType).

No goal is currently expressed with respect to use of composite XML Schema datatypes.

(XML Schema is not intended to be used for defining/constraining RDF/XML syntax or RDF graph structures for the purposes of datatyping.)
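
As a rough illustration of what such an account has to cover, the sketch below maps lexical forms to values for three of the built-in primitive datatypes. It uses only the Python standard library. The names LEXICAL_TO_VALUE, parse_boolean and parse_literal are hypothetical, and the parsing is deliberately simplified (for example, it ignores the optional timezone on xsd:date), so it should be read as an assumption-laden sketch rather than as a conforming implementation.

```python
from datetime import date
from decimal import Decimal

XSD = "http://www.w3.org/2001/XMLSchema#"

def parse_boolean(lexical):
    # xsd:boolean allows the lexical forms "true", "false", "1" and "0".
    if lexical in ("true", "1"):
        return True
    if lexical in ("false", "0"):
        return False
    raise ValueError("not an xsd:boolean lexical form: %r" % lexical)

# Hypothetical table: datatype URI -> lexical-to-value mapping.
LEXICAL_TO_VALUE = {
    XSD + "date": date.fromisoformat,  # simplified: no timezone support
    XSD + "decimal": Decimal,          # simplified: accepts more than xsd:decimal allows
    XSD + "boolean": parse_boolean,
}

def parse_literal(lexical, datatype_uri):
    """Return the value denoted by a lexical form under the given datatype."""
    return LEXICAL_TO_VALUE[datatype_uri](lexical)

birth = parse_literal("2001-07-15", XSD + "date")
```

How the datatype URI comes to be associated with a particular literal is exactly what the idioms in section 9 differ on; this sketch assumes that association has already been made.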

3. Use of non-XML-Schema datatypes

The datatyping proposal should not preclude the use of non-XML-Schema datatypes, for example custom or user-defined datatypes, or those of major systems external to RDF such as SQL or UML.

4. Use of schema-defined datatypes

The datatyping proposal should not preclude using schema languages to define data types, rather than relying on "built-in" predefined data types. The proposal is not expected to give an account of any such schema language.

(This goal probably follows from 3.)

5. Represent type without associated RDF schema

It should be possible to include typing information in an RDF graph without depending on a (separately defined) RDF schema.

6. Reference type information in associated RDF schema

It should be possible to indirectly incorporate typing information into an RDF graph by referencing an associated RDF schema.

7. Co-existence of global and local typing mechanisms

One dimension along which datatyping proposals can be categorized is whether individual values are typed explicitly or implicitly: for example, whether each occurrence must itself specify xsd:integer (explicit), or whether xsd:integer is specified once as the rdfs:range of the property (implicit). RDF should allow users to choose either approach, as DAML+OIL does. Implicit typing allows for compatibility with existing RDF data and much XML data. Explicit typing allows for direct control of the typing of data. Using both together allows for an extra check on the appropriateness of input.

It should be possible for both forms of datatyping to coexist in the same RDF graph.

Adapted from:

(This looks rather like a restatement of goals 5, 6 above.)
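
The coexistence requirement can be pictured with a minimal sketch in which one graph holds both an implicitly typed and an explicitly typed value. The graph is modelled as a plain list of (subject, property, object) string tuples, with no distinction between literals and resources. The property and type names are borrowed from idioms B and D in section 9, and the helper names objects and typed_values are hypothetical. The way the type is looked up is only one possible reading, not one that this note prescribes.

```python
RDF_TYPE = "rdf:type"
RDF_VALUE = "rdf:value"
RDFS_RANGE = "rdfs:range"

graph = [
    # Implicit (global) typing: the range of the property supplies the type.
    ("exB:birthDate", RDFS_RANGE, "exB:date"),
    ("person:Jenny", "exB:birthDate", "2001-07-15"),
    # Explicit (local) typing: this occurrence carries its own type.
    ("person:Jenny", "exD:birthDate", "_:D"),
    ("_:D", RDF_VALUE, "2001-07-15"),
    ("_:D", RDF_TYPE, "exD:date"),
]

def objects(graph, subject, prop):
    """All objects of statements with the given subject and property."""
    return [o for (s, p, o) in graph if s == subject and p == prop]

def typed_values(graph, subject, prop):
    """Return (lexical form, type, how the type was found) for each value."""
    results = []
    for obj in objects(graph, subject, prop):
        explicit = objects(graph, obj, RDF_TYPE)
        if explicit:
            # The value node names its own type; rdf:value holds the lexical form.
            for lexical in objects(graph, obj, RDF_VALUE):
                results.append((lexical, explicit[0], "explicit"))
        else:
            # Fall back to the type implied by the property's range.
            for datatype in objects(graph, prop, RDFS_RANGE):
                results.append((obj, datatype, "implicit"))
    return results
```

Calling typed_values(graph, "person:Jenny", "exB:birthDate") finds the type through the range statement, while the same call with "exD:birthDate" finds it on the value node itself.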

8. Provide account of datatyping scheme semantics

The datatyping proposal should include a full account of data typing semantics, and how data typing interacts semantically with the other elements of RDF. This would preferably be expressed in terms of how the data typing proposal uses and/or extends the defined RDF model theory [2].

9. Support for existing data typing idioms

A number of idioms have been suggested for representing datatype information in an RDF graph. It is claimed or suggested that these are currently used in RDF applications, and that continued support would be advantageous for reasons of backward compatibility. These idioms are presented without regard for the details of their interpretation by any datatyping proposal.

These idioms are enumerated below using Notation-3 [4]. The descriptions below are intended to convey the graph form used, while being agnostic about issues of semantic denotation. The data typing proposals would need to provide an account of denotations used for any supported idiom.

A datatyping proposal may support any combination of these idioms, and the ex*: qualified names may be the same or different across those idioms supported. (The desirability or otherwise of different names is a matter for group consensus, not prejudged by this note.)

In presenting these idioms, it is useful to distinguish between "direct statements" and "schema statements", in recognition that there are different ways of handling schema statements:

(a) schema statements included explicitly in the same RDF document ("internal schema"),

(b) schema statements referenced in a separate RDF document ("external schema"), and

(c) schema statements implied and "understood" by the processing application ("implicit schema").

Presuming that:

We have three usage patterns that are equivalent, modulo the physical location or otherwise of the schema statements. To accommodate this, the idioms described below are presented in two parts:

  1. "direct statements" from which some meaning is directly derived, and
  2. where applicable, "schema statements" that can be separated from the direct statements to define an environment in which they can be evaluated.
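
The equivalence of the three usage patterns can be sketched by treating a graph as a set of statements and merging the direct statements with schema statements obtained in each of the three ways. This is an illustration under assumptions: the schema URI, the lookup table and the function names are hypothetical, and plain set union is an adequate merge only because the example contains no blank nodes.

```python
# Direct statements, and the schema statements they are evaluated against.
direct = {("person:Jenny", "exB:birthDate", "2001-07-15")}
schema = {("exB:birthDate", "rdfs:range", "exB:date")}

# (a) Internal schema: the document itself contains the schema statements.
internal = direct | schema

# (b) External schema: the statements live in a separate document that is
# looked up and merged in. The URI and the lookup table are hypothetical.
EXTERNAL_DOCUMENTS = {"http://example.org/schema": schema}

def with_external_schema(statements, schema_uri):
    return statements | EXTERNAL_DOCUMENTS[schema_uri]

# (c) Implicit schema: the application supplies statements it already "understands".
UNDERSTOOD_BY_APPLICATION = schema

def with_implicit_schema(statements):
    return statements | UNDERSTOOD_BY_APPLICATION

external = with_external_schema(direct, "http://example.org/schema")
implicit = with_implicit_schema(direct)

# All three routes yield the same set of statements.
same_graph = internal == external == implicit
```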

In each case below, the intent is to express the idea that Jenny was born on 15 July 2001. The idioms simply illustrate a form of RDF graph that has this intended meaning, and do not attempt to say anything about the mechanisms for arriving at that meaning.

Idiom A:
person:Jenny exA:birthDate _:A .
_:A exA:date "2001-07-15" .

(Adapted from [1])

This form of use has been suggested by Dan Connolly, in http://www.w3.org/2001/01/ct24.
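
To make the graph shape of idiom A concrete, here is a small sketch that follows the birth date property to the intermediate node and collects the arcs leaving it. The function name idiom_a_values is hypothetical, and treating the exA:date arc as the carrier of both the type and the lexical form is just one possible reading; this note does not prejudge the interpretation.

```python
graph = [
    ("person:Jenny", "exA:birthDate", "_:A"),
    ("_:A", "exA:date", "2001-07-15"),
]

def idiom_a_values(graph, subject, prop):
    """Return (typing property, lexical form) pairs reached via the intermediate node."""
    pairs = []
    for s, p, node in graph:
        if s == subject and p == prop:
            # Every arc leaving the intermediate node pairs a property with a literal.
            pairs.extend((p2, o2) for s2, p2, o2 in graph if s2 == node)
    return pairs

found = idiom_a_values(graph, "person:Jenny", "exA:birthDate")
```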

Idiom B:
person:Jenny exB:birthDate "2001-07-15" .
exB:birthDate rdfs:range exB:date .

(Adapted from [1])

This is used by the CC/PP specification. A similar form also appears in the RDFM&S [5] section 5, as in:

<http://www.w3.org/Home/Lassila> :creator "Ora Lassila" .

Idiom C:

Same form as idiom B.

(Adapted from http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2002Jan/0045.html)
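
For the form shared by idioms B and C, the following sketch keeps the direct statement and the schema statement apart and uses the range to select a check on the literal's lexical form. The LEXICAL_SPACE table, its regular expression and the function name check_ranges are hypothetical; what exB:date actually admits is for the datatyping proposal to define.

```python
import re

direct = [("person:Jenny", "exB:birthDate", "2001-07-15")]
schema = [("exB:birthDate", "rdfs:range", "exB:date")]

# Hypothetical lexical-space checks, keyed by datatype name.
LEXICAL_SPACE = {"exB:date": re.compile(r"\d{4}-\d{2}-\d{2}")}

def check_ranges(direct, schema):
    """Return (statement, datatype, ok) for each statement whose property has a known range."""
    ranges = {s: o for s, p, o in schema if p == "rdfs:range"}
    report = []
    for stmt in direct:
        datatype = ranges.get(stmt[1])
        if datatype in LEXICAL_SPACE:
            ok = LEXICAL_SPACE[datatype].fullmatch(stmt[2]) is not None
            report.append((stmt, datatype, ok))
    return report

report = check_ranges(direct, schema)
```

Without the schema statement the report is empty, which is the sense in which the direct statement alone carries no type information in this idiom.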

Idiom D:
person:Jenny exD:birthDate _:D .
_:D rdf:value "2001-07-15" .
_:D rdf:type exD:date .

(Adapted from http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2002Jan/0045.html)

This form of use has been suggested by Dan Connolly, in http://www.w3.org/2001/01/ct24, and a similar usage can be seen in RDFM&S [5], section 7.3.

Idiom E:
person:Jenny exE:birthDate _:E .
_:E rdf:type exE:date .
_:E ex:ISO8601 "2001-07-15" .

The idea is that the type label on the node _:E indicates only what the node is intended to represent, and the property arc leaving it (here ex:ISO8601) indicates how the value is lexically encoded.

The date example is probably not the best way to show this. Here is another example that may make the point more clearly:

person:Jenny exE:weight _:E .
_:E rdf:type exE:weightInPounds .
_:E ex:germanNumeral "83,5" .

as distinct from, say:

person:Jenny exE:weight _:E .
_:E rdf:type exE:weightInPounds .
_:E ex:americanNumeral "83.5" .

[This was suggested as a significant capability by Brian McBride. I have no specific record of its use.]
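
The separation that idiom E draws between what a node represents and how its value is written can be sketched as follows. The DECODERS table and the decode function are hypothetical, and the German-numeral decoder is simplified: it only swaps the decimal comma and ignores thousands separators.

```python
from decimal import Decimal

# Hypothetical decoders, keyed by the property that names the lexical encoding.
DECODERS = {
    "ex:germanNumeral": lambda s: Decimal(s.replace(",", ".")),
    "ex:americanNumeral": Decimal,
}

def decode(graph, node):
    """Return (type, value) for a node typed by rdf:type and encoded by one arc."""
    node_type = next(o for s, p, o in graph if s == node and p == "rdf:type")
    for s, p, o in graph:
        if s == node and p in DECODERS:
            return node_type, DECODERS[p](o)
    raise ValueError("no known encoding property on %s" % node)

german = [
    ("person:Jenny", "exE:weight", "_:E"),
    ("_:E", "rdf:type", "exE:weightInPounds"),
    ("_:E", "ex:germanNumeral", "83,5"),
]
american = [
    ("person:Jenny", "exE:weight", "_:E"),
    ("_:E", "rdf:type", "exE:weightInPounds"),
    ("_:E", "ex:americanNumeral", "83.5"),
]

# Different lexical encodings, same type and same value.
same_weight = decode(german, "_:E") == decode(american, "_:E")
```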

Idiom F:
person:Jenny exF:birthDate _:F .
_:F ex:ISO8601 "2001-07-15" .
exF:birthDate rdfs:range exF:date .

This is a fairly simple variation of idiom E, in which the type information about the node representing the birth date is provided by an rdfs:range property of the exF:birthDate predicate.
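
A sketch of how the type of the node _:F might be recovered in idiom F: instead of reading an rdf:type arc on the node, it looks at the properties pointing at the node and collects their ranges. The function name inferred_types is hypothetical, and whether a datatyping proposal licenses this inference is for that proposal to say.

```python
graph = [
    ("person:Jenny", "exF:birthDate", "_:F"),
    ("_:F", "ex:ISO8601", "2001-07-15"),
    ("exF:birthDate", "rdfs:range", "exF:date"),
]

def inferred_types(graph, node):
    """Types implied for `node` by the rdfs:range of each property pointing at it."""
    ranges = {}
    for s, p, o in graph:
        if p == "rdfs:range":
            ranges.setdefault(s, []).append(o)
    types = []
    for s, p, o in graph:
        if o == node:
            types.extend(ranges.get(p, []))
    return types

types_of_f = inferred_types(graph, "_:F")
```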


Thanks to the following for helpful comments and suggestions:


[1] Sergey Melnik, RDF Datatyping:

[2] Pat Hayes, RDF model theory:

[3] XML Schema Datatypes, Built-in Primitive Datatypes:

[4] Notation-3:

[5] RDF Model and Syntax Specification, 22-Feb-1999:

Revision history

25-Jan-2002 Added idioms E and F. Replaced idiom C with a reference to idiom B, since they had the same form. Gave some indication of where the various idioms have been used. Some editorial changes and attempted clarifications.


Last modified: Fri, 25-Jan-2002, GK