Graham Klyne
15-Jan-2002
This note is an informal discussion document of the W3C RDF Core Working Group, excerpted and updated from section 1.2 of Sergey Melnik's document [1] on RDF datatyping proposals. Its purpose is to list criteria that may be used to evaluate alternative proposals for datatyping in RDF, with particular concern for the value types represented by literals in RDF.
1. Backward compatibility
   - with existing RDF data
   - with existing RDF code
   - with existing RDF-based specifications like DAML+OIL or CC/PP
2. Ability to use built-in primitive XML Schema datatypes
3. Ability to use non-XML-Schema datatypes
4. Ability to define datatypes using schema languages rather than relying on "built-in" datatypes
5. Ability to represent type information without an associated RDF schema
6. Ability to reference type information in an associated RDF schema
7. Co-existence of "global" and "local" typing mechanisms
8. Provide account of datatyping scheme semantics
9. Support for existing data typing idioms
The goal here is that existing use of RDF, RDF-handling software and RDF-based specifications will continue to be valid, and (as far as possible) produce results as intended by their authors.
The datatyping proposal should provide an account of how the XML Schema Built-in Primitive Datatypes [3] can be used with RDF.
The datatyping proposal should also be able to account for the use of XML Schema data types derived from the built-in primitive types (i.e. all instances of anySimpleType).
No goal is currently expressed with respect to use of composite XML Schema datatypes.
(XML Schema is not intended to be used for defining/constraining RDF/XML syntax or RDF graph structures for the purposes of datatyping.)
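As a rough illustration of what "using" a built-in primitive datatype involves, the following Python sketch maps lexical forms to values for three XML Schema datatypes. The names `LEXICAL_TO_VALUE` and `literal_value` are hypothetical and are not part of any proposal, and the parsing is deliberately simplified (for example, time zone suffixes and years outside 0001-9999 are not handled for xsd:date).

```python
import re
from datetime import date

XSD = "http://www.w3.org/2001/XMLSchema#"


def parse_integer(lexical):
    # Simplified xsd:integer: optional sign followed by decimal digits.
    if not re.fullmatch(r"[+-]?[0-9]+", lexical):
        raise ValueError(f"not an xsd:integer lexical form: {lexical!r}")
    return int(lexical)


def parse_boolean(lexical):
    # xsd:boolean has exactly four lexical forms.
    forms = {"true": True, "1": True, "false": False, "0": False}
    if lexical not in forms:
        raise ValueError(f"not an xsd:boolean lexical form: {lexical!r}")
    return forms[lexical]


def parse_date(lexical):
    # Simplified xsd:date: CCYY-MM-DD only, no time zone suffix.
    match = re.fullmatch(r"([0-9]{4})-([0-9]{2})-([0-9]{2})", lexical)
    if not match:
        raise ValueError(f"not a supported xsd:date lexical form: {lexical!r}")
    year, month, day = (int(group) for group in match.groups())
    return date(year, month, day)  # raises ValueError for impossible dates


# Hypothetical lookup table from datatype URI to lexical-to-value mapping.
LEXICAL_TO_VALUE = {
    XSD + "integer": parse_integer,
    XSD + "boolean": parse_boolean,
    XSD + "date": parse_date,
}


def literal_value(lexical, datatype_uri):
    """Return the value denoted by a lexical form under a given datatype."""
    try:
        parser = LEXICAL_TO_VALUE[datatype_uri]
    except KeyError:
        raise ValueError(f"unsupported datatype: {datatype_uri}") from None
    return parser(lexical)
```

Under this sketch, the distinct lexical forms "10" and "010" map to the same xsd:integer value, which is the kind of distinction between a literal string and its typed value that a datatyping proposal has to account for.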
The datatyping proposal should not preclude the use of non-XML-Schema datatypes, such as custom or user-defined datatypes, or those from major components external to RDF, like SQL or UML datatypes.
The datatyping proposal should not preclude using schema languages to define data types, rather than relying on "built-in" predefined data types. The proposal is not expected to give an account of any such schema language.
(This goal probably follows from 3.)
It should be possible to include typing information in an RDF graph without depending on a (separately defined) RDF schema.
It should be possible to indirectly incorporate typing information into an RDF graph by referencing an associated RDF schema.
One dimension along which datatyping proposals can be categorized is whether individual values are typed explicitly or implicitly: that is, whether each occurrence of a value specifies xsd:integer itself (explicit), or xsd:integer is specified as the rdfs:range of the property (implicit). RDF should allow users to choose either approach, as DAML+OIL does. Implicit typing allows for compatibility with existing RDF data and much XML data. Explicit typing allows for direct control of the typing of data. Using both together allows for an extra check on the appropriateness of input.
It should be possible for both forms of datatyping to coexist in the same RDF graph.
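The "extra check" that arises when both forms are present can be sketched as follows in Python. This is only an illustration under stated assumptions: the helper names `range_of` and `resolve_datatype` and the property `ex:age` are hypothetical, triples are modelled as plain tuples of strings, and any mismatch between the two types is treated as a conflict, which ignores derived datatypes and is probably stricter than a real proposal would be.

```python
def range_of(schema_triples, prop):
    """Return the rdfs:range declared for prop (global typing), or None."""
    for subject, predicate, obj in schema_triples:
        if subject == prop and predicate == "rdfs:range":
            return obj
    return None


def resolve_datatype(local_type, global_type):
    """Combine a local (per-value) type with a global (per-property) type.

    Either argument may be None when that form of typing is absent.
    """
    if local_type is None:
        return global_type
    if global_type is None or global_type == local_type:
        return local_type
    # Assumption: any difference counts as a conflict (no subtype reasoning).
    raise ValueError(f"local type {local_type} conflicts with range {global_type}")


# Hypothetical schema statement: ex:age has range xsd:integer.
schema = [("ex:age", "rdfs:range", "xsd:integer")]

# Globally typed only: the value itself carries no type.
implicit = resolve_datatype(None, range_of(schema, "ex:age"))

# Locally typed only: no range is declared for ex:shoeSize.
explicit = resolve_datatype("xsd:integer", range_of(schema, "ex:shoeSize"))

# Both present and in agreement: the range acts as an extra check.
checked = resolve_datatype("xsd:integer", range_of(schema, "ex:age"))
```

All three calls resolve to xsd:integer, so values typed in either way, or both, can sit side by side in one graph.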
Adapted from:
(This looks rather like a restatement of goals 4, 5 above.)
The datatyping proposal should include a full account of data typing semantics, and how data typing interacts semantically with the other elements of RDF. This would preferably be expressed in terms of how the data typing proposal uses and/or extends the defined RDF model theory [2].
A number of idioms have been suggested for representing datatype information in an RDF graph. It is claimed or suggested that these are currently used in RDF.
These idioms are enumerated below using Notation-3 [4]. The descriptions below are intended to convey the graph form used, while being agnostic about issues of semantic denotation. The data typing proposals would need to provide an account of denotations used for any supported idiom.
A datatyping proposal may support any combination of these idioms, and the ex*: qualified names may be the same or different across those idioms supported. (The desirability or otherwise of different names is a matter for group consensus, not prejudged by this note.)
In presenting these idioms, it is useful to distinguish between "direct statements" and "schema statements", in recognition that there are different ways of handling schema statements:
(a) schema statements included explicitly in the same RDF document ("internal schema"),
(b) schema statements referenced in a separate RDF document ("external schema"), and
(c) schema statements implied and "understood" by the processing application ("implicit schema").
Presuming that:
[[[The above is currently subject to some WG dispute]]]
We have three usage patterns that are equivalent, modulo the physical location or otherwise of the schema statements. To accommodate this, the idioms described below are presented in two parts:
In each case below, the intent is to express that Jenny has a birth date of 15 July 2001.
person:Jenny exA:birthDate [ exA:date "2001-07-15" ] .
(Adapted from [1])
person:Jenny exB:birthDate "2001-07-15" .
exB:birthDate rdfs:range exB:date .
(Adapted from [1])
person:Jenny exP:birthDate "2001-07-15" .
exP:birthDate rdfs:range exP:date .
(Adapted from http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2002Jan/0045.html)
person:Jenny exD:birthDate [ rdf:value "2001-07-15" ; rdf:type exD:date ] .
(Adapted from http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2002Jan/0045.html)
(Also, a similar usage can be seen in RDFM&S [5], section 7.3.)
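To make the graph forms of the four idioms above concrete, the following Python sketch writes each one out as triples and recovers the same (lexical form, datatype) pair from all of them. It is one possible reading only and takes no position on denotation. The `Literal` wrapper, the helpers `objects` and `typed_value`, and the blank node labels `_:a` and `_:d` are hypothetical names introduced for illustration.

```python
from collections import namedtuple

# Hypothetical wrapper that distinguishes literals from node names.
Literal = namedtuple("Literal", "lexical")

DATE = Literal("2001-07-15")

idiom_a = [
    ("person:Jenny", "exA:birthDate", "_:a"),
    ("_:a", "exA:date", DATE),
]
idiom_b = [
    ("person:Jenny", "exB:birthDate", DATE),   # direct statement
    ("exB:birthDate", "rdfs:range", "exB:date"),  # schema statement
]
idiom_p = [
    ("person:Jenny", "exP:birthDate", DATE),   # direct statement
    ("exP:birthDate", "rdfs:range", "exP:date"),  # schema statement
]
idiom_d = [
    ("person:Jenny", "exD:birthDate", "_:d"),
    ("_:d", "rdf:value", DATE),
    ("_:d", "rdf:type", "exD:date"),
]


def objects(graph, subject, predicate):
    """Return all objects of triples matching subject and predicate."""
    return [o for s, p, o in graph if s == subject and p == predicate]


def typed_value(graph, subject, prop):
    """Return (lexical form, datatype name or None), or None if no value."""
    for obj in objects(graph, subject, prop):
        if isinstance(obj, Literal):
            # Idioms B and P: literal object, datatype from rdfs:range.
            ranges = objects(graph, prop, "rdfs:range")
            return obj.lexical, (ranges[0] if ranges else None)
        values = objects(graph, obj, "rdf:value")
        if values and isinstance(values[0], Literal):
            # Idiom D: intermediate node with rdf:value and rdf:type.
            types = objects(graph, obj, "rdf:type")
            return values[0].lexical, (types[0] if types else None)
        for s, p, o in graph:
            if s == obj and isinstance(o, Literal):
                # Idiom A: the property on the node names the datatype.
                return o.lexical, p
    return None


result_a = typed_value(idiom_a, "person:Jenny", "exA:birthDate")
result_d = typed_value(idiom_d, "person:Jenny", "exD:birthDate")
```

In this sketch idioms B and P are handled by the same branch, because they have the same graph form and differ only in the names used; any difference between them would lie in the denotation a proposal assigns to them.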
Thanks to the following for helpful comments and suggestions:
[1] Sergey Melnik, RDF Datatyping
[2] Pat Hayes, RDF model theory
[3] XML Schema Datatypes, Built-in Primitive Datatypes
[4] Notation-3
[5] RDF Model and Syntax Specification, 22-Feb-1999
Last modified: Tue, 15-Jan-2002, GK