W3C

XML Schema Datatypes in RDF and OWL

Carroll's Personal Draft 27 Oct 2004

Editors:
Jeremy J. Carroll

Abstract

The RDF and OWL Recommendations use the simple types from XML Schema. This document discusses two questions left unanswered by these Recommendations: which values of which simple types are the same, and what URIref should be used to refer to a user-defined type?

Status of this Document

This is a personal draft for discussion by the SWBPD WG and other interested parties. The remainder of this status section is fictitious.

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.

This document is intended to be a part of a future W3C Working Group Note from the Semantic Web Best Practices and Deployment Working Group, part of the W3C Semantic Web Activity.

This document is intended for public discussion: it poses two questions about the use of XML Schema simple types within the Semantic Web, and sketches multiple answers. None of these suggestions has any recorded consensus behind it. After public feedback, possibly including additional answers to these questions, the WG, in co-ordination with other W3C WGs, hopes to agree on a single answer to each of the two questions. This will then form the basis of a second publication indicating a suggested best practice.

We particularly seek feedback reporting on implementation experience on these issues.

Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.


Datatypes in RDF

@@@TODO: intro to datatyping in RDF and OWL, links to relevant sections of Recs.

User Defined Datatypes

Problem statement:

What URI should be used within RDF and OWL for user-defined XML Schema datatypes?

@@@TODO expand this, give an example, to be used throughout section.

DAML+OIL Solution

From Peter F. Patel-Schneider:

OWL can use XML Schema non-list simple types defined at the top level of an XML Schema document and given a name, by using the URI reference constructed from the URI of the document and the local name of the simple type. That is, if U is the URI of an XML Schema document that contains,

   <xsd:schema ...>
     <xsd:simpleType name="foo">
       <xsd:restriction base="integer">
         <xsd:minInclusive value="1700"/>
       </xsd:restriction>
     </xsd:simpleType>
     ...
   </xsd:schema>
   

then the URI reference U#foo will be that datatype.
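The construction can be sketched in a few lines of Python. This is only an illustration of the rule as stated above, not code from DAML+OIL or OWL: the schema URI http://example.org/schema.xsd stands in for U, the xsd prefix is assumed to be bound to the usual XML Schema namespace, and the function name daml_oil_datatype_urirefs is hypothetical.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical schema document URI, standing in for "U" in the text.
SCHEMA_URI = "http://example.org/schema.xsd"

SCHEMA_TEXT = """
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:simpleType name="foo">
    <xsd:restriction base="xsd:integer">
      <xsd:minInclusive value="1700"/>
    </xsd:restriction>
  </xsd:simpleType>
</xsd:schema>
"""

def daml_oil_datatype_urirefs(schema_uri, schema_text):
    """Map each named, top-level, non-list simpleType to the URIref U#name."""
    root = ET.fromstring(schema_text)
    urirefs = {}
    # findall with a bare tag only matches direct children, i.e. top-level types.
    for simple_type in root.findall(f"{{{XSD_NS}}}simpleType"):
        name = simple_type.get("name")
        is_list = simple_type.find(f"{{{XSD_NS}}}list") is not None
        if name is not None and not is_list:
            urirefs[name] = f"{schema_uri}#{name}"
    return urirefs

datatype_urirefs = daml_oil_datatype_urirefs(SCHEMA_URI, SCHEMA_TEXT)
print(datatype_urirefs["foo"])
```

Nothing in this sketch inspects the facets, which matches the remark that implementations may ignore them.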

Implementations of OWL may choose to ignore the facets of such a type.

See the definition of Senior in the example file from DAML+OIL.

This is a non-standard approach to fragIDs, and not in conformance with RFC 2396 and RFC XMLMIMETYPE @@@@which.

Component Designators Solution

Following the XML Schema Component Designators WD, the same example would have the URIref U#xscd(/type(foo)). XSCD also defines URIrefs for unnamed simple types within complex schema.

This is still at WD stage. It depends on XPointer, which is a W3C Recommendation, but whose relationship with RFC XMLMIMETYPE is not yet fully secured. XPointer has less than full support from the IETF community.

The resulting URIrefs cannot be used with the qname abbreviation used in N3 and RDF/A, because they end in a ")" which is not an NCNameChar.
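The abbreviation problem can be checked mechanically. The following sketch assumes that a URIref can be abbreviated only if some non-empty suffix of it is an NCName, and it uses a simplified, ASCII-only approximation of the NCName production; the function name can_abbreviate and the example URIs are hypothetical.

```python
import re

# Simplified, ASCII-only approximation of NCName: a letter or underscore,
# followed by letters, digits, '.', '-' or '_'. The real production also
# admits many non-ASCII characters.
NCNAME = re.compile(r"[A-Za-z_][A-Za-z0-9_.\-]*")

def can_abbreviate(uriref):
    """True if the URIref can be split into a namespace plus an NCName local part."""
    return any(NCNAME.fullmatch(uriref[i:]) for i in range(len(uriref)))

daml_oil_ok = can_abbreviate("http://example.org/schema.xsd#foo")
xscd_ok = can_abbreviate("http://example.org/schema.xsd#xscd(/type(foo))")
print(daml_oil_ok, xscd_ok)
```

Because every suffix of the XSCD form ends in ")", no split point yields a valid local part, whereas the DAML+OIL form splits cleanly before "foo".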

XSCD refers to such URIrefs as being to:

The canonical schema component designator for this simple type definition

i.e. referring to the definition rather than to the type defined.

@@@ Todo Example

id Solution

Following XPointer, an id attribute within an XML document can be used as a fragment to identify an XML element. This is only partially endorsed by RFC XMLMIMETYPE.

We could thus modify the example schema to:

   <xsd:schema ...>
     <xsd:simpleType id="foo" name="foo">
       <xsd:restriction base="integer">
         <xsd:minInclusive value="1700"/>
       </xsd:restriction>
     </xsd:simpleType>
     ...
   </xsd:schema>
   

Then use the URIref U#foo as before.

This suffers from the same defect as the XSCD solution, that what is referred to is a syntactic object (the XML element, or the datatype definition) rather than the datatype itself.
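A small sketch makes the point concrete. It treats the fragment as a bare name and looks for an element carrying a matching id attribute, which is roughly what the id approach relies on; it is not a conformant XPointer processor, and the schema URI and the function name resolve_bare_name are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urldefrag

SCHEMA_TEXT = """
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:simpleType id="foo" name="foo">
    <xsd:restriction base="xsd:integer">
      <xsd:minInclusive value="1700"/>
    </xsd:restriction>
  </xsd:simpleType>
</xsd:schema>
"""

def resolve_bare_name(uriref, schema_text):
    """Return the element whose id attribute equals the fragment, or None."""
    fragment = urldefrag(uriref).fragment
    root = ET.fromstring(schema_text)
    for element in root.iter():
        if element.get("id") == fragment:
            return element
    return None

# Hypothetical URIref of the form U#foo.
target = resolve_bare_name("http://example.org/schema.xsd#foo", SCHEMA_TEXT)
print(target.tag, target.get("name"))
```

What comes back is the xsd:simpleType element, that is, a piece of syntax. Treating the same URIref as a name for the datatype is an additional step that this lookup does not justify.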

Discussion

The DAML+OIL solution is non-standard, and suffers from problems such as the non-uniqueness of names within some XML Schema.

The other two solutions conform to XPointer, but XPointer is not fully endorsed by the IETF, which controls the XML MIME type registration to which RFC 2396 defers for fragID semantics.

The XSCD solution is still at WD stage.

Both the XSCD and the id solutions have the problem that the proposed URIrefs refer to a syntactic thing, rather than the datatype itself. This could be seen as a consequence of the relevant definitions (XSCD and XPointer) being concerned with representations of resources rather than with the resources themselves. The URIref can potentially be seen as denoting (in the sense of RDF Semantics) the datatype, with the XML element describing the datatype serving as its representation.

Possible advice might be to use id where possible, and once XSCD goes to Rec to also use that, particularly for XML Schema that are not under the control of the RDF or OWL author.

Comparison of Values

Problem Statement

What is the relationship between the value spaces of the various XML Schema built-in simple types when used within RDF and OWL?

It is clear that when two simple types are derived from the same primitive type, a value of one may be the same as a value of the other (in the sense of owl:sameAs).

The issue concerns whether floats might be the same as decimals, or strings the same as anyURIs, etc.

@@@ todo two or three test cases

All Primitive Types Differ

The simplest solution is to agree that all primitive XML Schema Datatypes have disjoint value spaces.
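Under this rule, two typed values can only be the same when their types share a primitive base type. A minimal sketch follows, assuming a value is modelled as a (datatype, parsed value) pair; the PRIMITIVE_OF table is partial and illustrative, and the function name same_value_disjoint is hypothetical.

```python
from decimal import Decimal

# Partial, illustrative table: built-in type -> its primitive base type.
PRIMITIVE_OF = {
    "xsd:decimal": "xsd:decimal",
    "xsd:integer": "xsd:decimal",
    "xsd:int": "xsd:decimal",
    "xsd:float": "xsd:float",
    "xsd:double": "xsd:double",
    "xsd:string": "xsd:string",
    "xsd:anyURI": "xsd:anyURI",
}

def same_value_disjoint(a, b):
    """Equality when all primitive types are taken to have disjoint value spaces."""
    (type_a, value_a), (type_b, value_b) = a, b
    if PRIMITIVE_OF[type_a] != PRIMITIVE_OF[type_b]:
        return False  # different primitive types never share a value
    return value_a == value_b

int_vs_integer = same_value_disjoint(("xsd:int", Decimal(0)), ("xsd:integer", Decimal(0)))
float_vs_decimal = same_value_disjoint(("xsd:float", 0.0), ("xsd:decimal", Decimal(0)))
print(int_vs_integer, float_vs_decimal)
```

Here 0 as an xsd:int and 0 as an xsd:integer are the same value, while 0 as a float and 0 as a decimal are not.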

XPath 2.0 eq

XPath 2.0 provides an eq operation that compares as true some values with different primitive datatypes. For example, 0 as a float and 0 as a decimal compare true under eq.

Most pairs of primitive types are incomparable under eq, and given the strong typing in XPath 2.0 such comparisons are errors.

Strings do not compare eq to anything other than strings.
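The behaviour described so far can be approximated as follows. This is a rough sketch rather than the XPath 2.0 definition: it assumes that mixed numeric operands are promoted to a common floating-point type (modelled here with Python's double-precision float, so single-precision rounding is ignored), and it models the type error with a Python TypeError. The function name eq_sketch is hypothetical.

```python
from decimal import Decimal

NUMERIC = ("xsd:decimal", "xsd:float", "xsd:double")

def eq_sketch(a, b):
    """Approximate XPath 2.0 eq on (primitive type, parsed value) pairs."""
    (type_a, value_a), (type_b, value_b) = a, b
    if type_a == type_b:
        return value_a == value_b
    if type_a in NUMERIC and type_b in NUMERIC:
        # Crude promotion of both operands to a floating-point type.
        return float(value_a) == float(value_b)
    # Any other mixture of primitive types is treated as incomparable.
    raise TypeError(f"{type_a} and {type_b} are not comparable under eq")

zero_float_eq_zero_decimal = eq_sketch(("xsd:float", 0.0), ("xsd:decimal", Decimal("0")))
print(zero_float_eq_zero_decimal)
```

Comparing ("xsd:string", "0") with ("xsd:decimal", Decimal("0")) raises the TypeError, mirroring the statement that strings compare only with strings.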

@@@ todo datetime stuff

hexBinary and base64Binary values are not comparable with each other under eq.

Numeric comparisons are provided with detailed implementation instructions to allow for rounding errors etc. This results in non-transitive behaviour. @@@todo example.
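One way such non-transitivity can arise is sketched below. The sketch assumes that, when a decimal is compared with a float, the decimal operand is first promoted to single precision; the helper names to_float32 and eq_decimal_float are hypothetical, and the values were chosen so that two distinct decimals round to the same float.

```python
import struct
from decimal import Decimal

def to_float32(value):
    """Round to the nearest IEEE 754 single-precision value (returned as a Python float)."""
    return struct.unpack("f", struct.pack("f", float(value)))[0]

def eq_decimal_float(decimal_value, float_value):
    # Assumed behaviour: the decimal operand is promoted to xs:float before comparing.
    return to_float32(decimal_value) == float_value

a = Decimal("33554432")    # 2**25 as a decimal
b = to_float32(33554432)   # 2**25 as a float
c = Decimal("33554433")    # 2**25 + 1 as a decimal, not representable in single precision

a_eq_b = eq_decimal_float(a, b)
b_eq_c = eq_decimal_float(c, b)
a_eq_c = a == c            # decimal with decimal: exact comparison
print(a_eq_b, b_eq_c, a_eq_c)
```

Under these assumptions a compares equal to b and b to c, yet a and c differ, so the relation is not transitive.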

As a special case, NaN^^xsd:float is not eq to itself.

True Values

In XML Schema Datatypes in RDF, Carroll argues that the numeric types should be compared with exact numeric semantics, that string and anyURI have overlapping value spaces, and that hexBinary and base64Binary have the same value space.
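A sketch of what this position might mean in practice follows. It is an interpretation, not code from the cited document: numeric literals are mapped to exact rationals, binary literals to octet sequences, and only four datatypes are handled. INF and NaN are ignored, and the function name true_value is hypothetical.

```python
import base64
from decimal import Decimal
from fractions import Fraction

def true_value(lexical, datatype):
    """Map a typed literal to a Python value under 'exact' semantics."""
    if datatype == "xsd:decimal":
        return Fraction(Decimal(lexical))   # exact decimal value
    if datatype == "xsd:double":
        return Fraction(float(lexical))     # exact value of the nearest double
    if datatype == "xsd:hexBinary":
        return bytes.fromhex(lexical)
    if datatype == "xsd:base64Binary":
        return base64.b64decode(lexical)
    raise ValueError(f"datatype not covered by this sketch: {datatype}")

half_same = true_value("0.5", "xsd:decimal") == true_value("0.5", "xsd:double")
tenth_same = true_value("0.1", "xsd:decimal") == true_value("0.1", "xsd:double")
binary_same = true_value("666F6F", "xsd:hexBinary") == true_value("Zm9v", "xsd:base64Binary")
print(half_same, tenth_same, binary_same)
```

With exact semantics 0.5 is the same value in both numeric types, whereas 0.1 is not, because no double is exactly one tenth. The two binary literals denote the same three octets.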

Discussion

The XPath 2.0 position is a compromise between the two theoretically sound positions and may well be the best solution on an 80/20 rule. However, it is problematic that eq is not an equivalence relation because of the corner cases. Implementor feedback is needed as to how significant this problem is for DL reasoners.