W3C

R2RML: RDB2RDF Mapping Language

W3C Working Draft 5 October 2010

This version:
URL-to-be-created
Editors:
Souripriya Das, Oracle
Seema Sundara, Oracle
Richard Cyganiak, DERI, National University of Ireland, Galway, Ireland <richard@cyganiak.de>

Abstract

RDF is a directed, labeled graph data format for representing information in the Web. This specification defines the syntax and semantics of R2RML (RDB2RDF Mapping Language), a language for translating relational data to RDF.

Status of this Document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.

Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

The documents produced by this Working Group are:

This publication includes the RDF Schema that can be used to specify a mapping of relational data to RDF. The structure of this document will change based upon future decisions taken by the W3C RDB2RDF Working Group.

Comments on this document should be sent to public-rdb2rdf-comments@w3.org, a mailing list with a public archive.

This document was produced by the RDB2RDF Working Group, which is part of the W3C Semantic Web Activity.

This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.

Table of Contents

1 Introduction
    1.1 Document Outline
    1.2 Document Conventions
        1.2.1 Namespaces
        1.2.2 Data Descriptions
        1.2.3 Terminology
2 Overview
3 RDF Schema for R2RML
    3.1 RDFTermMap Class
        3.1.1 Properties of the RDFTermMap Class
            3.1.1.1 rr:property
            3.1.1.2 rr:column
            3.1.1.3 rr:inverseExpression
            3.1.1.4 rr:datatype
            3.1.1.5 rr:language
            3.1.1.6 rr:columnGraph
            3.1.1.7 rr:constantValue
            3.1.1.8 rr:isRDFType
            3.1.1.9 rr:isComputed
    3.2 Subclasses of RDFTermMap Class
        3.2.1 RDFSubjectTermMap Class
        3.2.2 IRIMap Class
        3.2.3 bNodeMap Class
        3.2.4 LiteralMap Class
    3.3 Table2TriplesMap Class
        3.3.1 Properties of the Table2TriplesMap Class
            3.3.1.1 rr:logicalTable
            3.3.1.2 rr:class
            3.3.1.3 rr:tableGraphMap
            3.3.1.4 rr:subjectMap
            3.3.1.5 rr:propertyObjectMap
            3.3.1.6 rr:computedPropertyMap
            3.3.1.7 rr:rowGraphPropertyMap
            3.3.1.8 rr:foreignKeyMap
    3.4 ForeignKey Class
        3.4.1 Properties of the ForeignKey Class
            3.4.1.1 rr:key
            3.4.1.2 rr:parentTable2TriplesMap
            3.4.1.3 rr:joinCondition

Appendices

A References
    A.1 Normative References
    A.2 Other References
B Example of SQL based RDB2RDF Mapping using the Turtle Syntax
    B.1 Sample Relational Tables
    B.2 Mapping Specification for the Tables
        B.2.1 Mapping Specification for the DEPT Table
        B.2.2 Mapping Specification for the EMP Table
        B.2.3 Mapping Specification for the LIKES Table
C CVS History


1 Introduction

RDF is a directed, labeled graph data format for representing information in the Web. RDF is often used to represent, among other things, personal information, social networks, and metadata about digital artifacts, as well as to provide a means of integration over disparate sources of information. The ability to view relational data as RDF will allow integration of relational data from disparate sources. This specification defines the schema for mapping relational data to RDF using Turtle syntax.

The R2RML language for mapping relational data to RDF is designed to meet the use cases and requirements identified by the RDB2RDF Working Group in Use Cases and Requirements [http://www.w3.org/2001/sw/rdb2rdf/wiki/Use_Cases_and_Requirements].

1.1 Document Outline

Unless otherwise noted in the section heading, all sections and appendices in this document are normative.

@@Revise when structure stable

This section of the document, section 1, introduces the R2RML specification. It presents the organization of this specification document and the conventions used throughout the specification.

Section 3 presents details of the RDFS schema for R2RML.

2 Overview

The overall flow for mapping relational data to RDF may be summarized as follows:

Given read-only access to a set of tables and/or views in a database, one can write a mapping that generates RDF triples from the content of some or all of these tables and views, or from logical tables defined as SQL queries over them. Besides generating triples, the mapping may also create an RDFS schema consisting of RDFS classes and RDF properties, and RDF named graphs may be created to hold subsets of the generated triples.

Note that the RDF triples need not be physically generated; the mapping may instead only allow the relational data to be viewed as RDF triples.

The RDB2RDF Mapping Language, R2RML, described in this document, allows one to write a mapping to (virtually) generate a custom collection of RDF graphs, named or unnamed, containing RDF triples, from a set of relational tables and/or views to which the user has (at least) read-only access.

3 RDF Schema for R2RML using Turtle Syntax

This section describes the RDF Schema that can be used for specifying mapping from a relational schema to RDF.

3.1 RDFTermMap Class

The RDFTermMap class represents the description of a mapping to any RDF term: an IRI, a blank node, or a literal. It has two main components: a mapping to an RDF property and a mapping to an object value (to be associated with the property).

3.1.1 Properties of the RDFTermMap Class

This section defines the RDF properties that have the RDFTermMap class as their domain.

3.1.1.1 rr:property property
rr:property rdfs:range rdfs:Resource .

This property specifies the RDF property component of the RDFTermMap instance.

3.1.1.2 rr:column property
rr:column rdfs:range xsd:string .

This property specifies the object value component of the RDFTermMap instance. Specifically, the column name of the logical table that will be used for obtaining the object value.
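
As an informal sketch (not part of the schema definition), a (property, object) mapping for the dname column of the logical table in Section 3.3.1.1 could be written as follows. The dept: prefix and the dept:name property are taken from the example in Appendix B.

<#Table2TriplesMap1> rr:propertyObjectMap [ rr:property dept:name; rr:column "dname" ] .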

3.1.1.3 rr:inverseExpression property
rr:inverseExpression rdfs:range xsd:string .

This optional property specifies an expression that allows a query processor to use indexes on the underlying relational table when rows are accessed by the value of a logical-table column that is itself defined as an expression.

For example, for the deptId column in the logical table shown in Section 3.3.1.1, the inverse expression could be defined as follows:

"{alias.}deptno = substr({alias.}deptId,length('dept')+1)"

to facilitate use of an index on the deptno column of the dept table. Note that the actual alias (say d) used for the dept table is used to expand {alias.} (to d.).

3.1.1.4 rr:datatype property
rr:datatype rdfs:range rdfs:Resource .

This property specifies the datatype for the object value component of the RDFTermMap instance.
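
For illustration only, the following sketch, adapted from Appendix B, types the values of the deptno column as xsd:positiveInteger. The choice of datatype is an assumption of that example.

<#Table2TriplesMap1> rr:propertyObjectMap [ a rr:LiteralMap; rr:datatype xsd:positiveInteger;
                                            rr:property dept:deptno; rr:column "deptno" ] .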

3.1.1.5 rr:language property
rr:language rdfs:range xsd:string .

This property specifies the language for the object value component of the RDFTermMap instance. This property is applicable only if the datatype of the object value is of type xsd:string.
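
A possible use is sketched below. This draft does not state the form of the language value, so the language tag "en" is an assumption of this sketch, as is applying it to the dname column.

# Hypothetical: "en" is assumed to be an acceptable value for rr:language.
<#Table2TriplesMap1> rr:propertyObjectMap [ a rr:LiteralMap; rr:property dept:name;
                                            rr:column "dname"; rr:language "en" ] .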

3.1.1.6 rr:columnGraph property
rr:columnGraph rdfs:range rr:RDFSubjectTermMap .

This property specifies the RDF named graph for the triple that is constructed with the (property, object) pair of the RDFTermMap.
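
The following sketch, adapted from the EMP mapping in Appendix B, places the triples generated for the ename column in a graph named by a constant value. Giving the graph as the constant string "emp:empNameGraph" follows that example and is not prescribed by this section.

<#Table2TriplesMap2> rr:propertyObjectMap [ rr:property emp:name; rr:column "ename";
                                            rr:columnGraph [ a rr:IRIMap; rr:constantValue "emp:empNameGraph" ] ] .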

3.1.1.7 rr:constantValue property
rr:constantValue rdfs:range xsd:string .

This property specifies the optional constant value for the object component of the RDFTermMap instance.
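
In the following sketch, adapted from Appendix B, no column is consulted: each row of the logical table would be given the same object value, "XYZ Corporation", for the dept:COMPANY property.

<#Table2TriplesMap1> rr:propertyObjectMap [ rr:property dept:COMPANY; rr:constantValue "XYZ Corporation" ] .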

3.1.1.8 rr:isRDFType property
rr:isRDFType rdfs:range xsd:boolean .

This property specifies whether the object value component of the RDFTermMap instance is also a value for the rdf:type property.
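
As a sketch adapted from the EMP mapping in Appendix B, the mapping below marks the value of the jobTypeURI column as an rdf:type value in addition to its being the object of emp:jobtype. The column and property names come from that example.

<#Table2TriplesMap2> rr:propertyObjectMap [ rr:property emp:jobtype; rr:column "jobTypeURI";
                                            rr:isRDFType "true"^^xsd:boolean ] .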

3.1.1.9 rr:isComputed property
rr:isComputed rdfs:range xsd:boolean .

This property specifies whether the property of the RDFTermMap instance is computed, that is, obtained through an associated rr:computedPropertyMap (see Section 3.3.1.6).

3.2 Subclasses of the RDFTermMap Class

3.3 The Table2TriplesMap Class

Table2TriplesMap is an RDFS class that allows specification of a mapping of the rows in a logical table (represented by a SQL query, or by the name of a table or view and its owner) into RDF triples. Properties of a Table2TriplesMap instance allow specification of various aspects of the mapping, such as the mapping of (some of) the columns in the logical table to RDF properties, the name of the RDF graph(s) that would store the RDF triples, and so on.

3.3.1 Properties of the Table2TriplesMap Class

This section defines the RDF properties that have the Table2TriplesMap class as their domain.

3.3.1.1 rr:logicalTable property
rr:logicalTable rdfs:range xsd:string .

This property specifies the logical table (i.e., a SQL query or a table or view name plus its owner name) whose rows are mapped to RDF triples by this Table2TriplesMap instance. Note that each row in the logical table is identified as a resource and is used as the subject for the RDF triples.

The following example specifies that the Table2TriplesMap instance Table2TriplesMap1 maps the rows in the logical table defined by the SQL string to RDF triples.

<#Table2TriplesMap1> rr:logicalTable 
                                """Select 'dept' || deptno AS deptId
                                        , deptno
                                        , dname
                                        , loc
                                   from dept""" .

The cardinality of this property, for a Table2TriplesMap instance, must be exactly 1. That is, a Table2TriplesMap instance must have exactly one value for this property.

3.3.1.2 rr:class property
rr:class rdfs:range rdfs:Class .

This property specifies the RDFS class associated with a Table2TriplesMap instance. In the generated RDF data, the resource corresponding to each row in the logical table is an instance of this RDFS class.

In the following example, Table2TriplesMap1 is associated with the RDFS class xyz:dept using rr:class. This leads to the creation of xyz:dept as an RDFS class in the RDFS schema generated from the mapping specification. In the generated RDF data, the RDF resource corresponding to each row in the logical table associated with Table2TriplesMap1 is an instance of the xyz:dept class.
[@@ We need to discuss this to make it optional to generate xyz:dept as an RDFS class in the generated RDFS schema.]

<#Table2TriplesMap1> rr:class xyz:dept .

3.3.1.3 rr:tableGraphMap property
rr:tableGraphMap rdfs:range rdfs:Resource .

This property specifies the graph (IRI or blank node) that would contain all the RDF triples in a Table2TriplesMap instance.

In the following example, all the RDF triples in Table2TriplesMap1 would be stored in the RDF named graph xyz:DeptGraph.

<#Table2TriplesMap1> rr:tableGraphMap xyz:DeptGraph .

3.3.1.4 rr:subjectMap property
rr:subjectMap rdfs:range rr:RDFSubjectTermMap .

This property specifies the mapping to obtain the IRI or blank node that is used as the subject of all the RDF triples generated from one row of the table.
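
For instance, the EMP mapping in Appendix B takes the subject from the empURI column of its logical table and treats it as an IRI. The fragment below repeats that part of the mapping as a sketch; the DEPT mapping in the same appendix uses rr:bNodeMap in the same position to obtain blank node subjects.

<#Table2TriplesMap2> rr:subjectMap [ a rr:IRIMap; rr:column "empURI" ] .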

3.3.1.5 rr:propertyObjectMap property
rr:propertyObjectMap rdfs:range rr:RDFTermMap .

This property specifies the mapping to obtain the (property, object) pair for each RDF triple corresponding to a column and its values in the table.

3.3.1.6 rr:computedPropertyMap property
rr:computedPropertyMap rdfs:range rr:IRIMap .

This property specifies the mapping to obtain the property IRIs from a column in the table. This mapping also contains a property name that identifies this computedPropertyMap and this property name is used in a rr:propertyObjectMap to form the association. The property IRI generated (from the column identified in the rr:computedPropertyMap) for each row, along with the value of the column identified in the rr:propertyObjectMap, are used to create the (property, object) pairs.
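
The association described above can be sketched with the LIKES mapping from Appendix B. Here the likeType column supplies the property IRI, likes:likeType serves as the identifying property name, and the likedObj column supplies the object. Reading the example this way is an interpretation of the appendix, not additional normative text.

<#Table2TriplesMap3>
    rr:computedPropertyMap [ rr:property likes:likeType; rr:column "likeType" ];
    rr:propertyObjectMap [ rr:property likes:likeType; rr:column "likedObj";
                           rr:isComputed "true"^^xsd:boolean ] .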

3.3.1.7 rr:rowGraphPropertyMap property
rr:rowGraphPropertyMap rdfs:range rr:RDFSubjectTermMap .

This property specifies the mapping to obtain the IRI or blank node that identifies the RDF named graph for all the RDF triples generated from one row in the table. The minimum cardinality for this property is zero, and there is no restriction on its maximum cardinality. If, for a row, the generated graph IRI or blank node label is NULL, then the triples generated from that row go to an unnamed graph.
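
A sketch adapted from the EMP mapping in Appendix B, whose logical table computes a graphURI column for each row, is shown below. The column name is taken from that example.

<#Table2TriplesMap2> rr:rowGraphPropertyMap [ a rr:IRIMap; rr:column "graphURI" ] .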

3.3.1.8 rr:foreignKeyMap property
rr:foreignKeyMap rdfs:range rr:ForeignKey .

This property specifies the mapping to obtain the property, and a join condition that can be used to retrieve the object (from the parent Table2TriplesMap instance), for the generated RDF triple. This property would typically correspond to a foreign key definition in the table.

3.4 The ForeignKey Class

ForeignKey is an RDFS class that allows specification of the mapping that describes a foreign key relationship.

rr:ForeignKey a rdfs:Class .

This class has three components: a key that specifies the column constituting the constraint in the child table, the Table2TriplesMap corresponding to the parent table, and the join condition corresponding to the foreign key constraint.

3.4.1 Properties of the ForeignKey Class

This section defines the RDF properties that have the ForeignKey class as their domain.

3.4.1.1 rr:key property
rr:key rdfs:range rdfs:Resource .

This property specifies the key component of the ForeignKey instance.

3.4.1.2 rr:parentTable2TriplesMap property
rr:parentTable2TriplesMap rdfs:range rr:Table2TriplesMap .

This property specifies the parent table component of the ForeignKey instance, that is, the Table2TriplesMap corresponding to the parent table.

3.4.1.3 rr:joinCondition property
rr:joinCondition rdfs:range xsd:string .

This property specifies the join condition of the ForeignKey instance.
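
Putting the three components together, a foreign key from the EMP table to the DEPT table might be described as sketched below. The key name emp:c_ref_deptno and the {child.} and {parent.} placeholders follow the example in Appendix B. Naming <#Table2TriplesMap1> (the DEPT mapping) as the parent is an assumption made here so that the value falls within the stated range of rr:parentTable2TriplesMap.

<#Table2TriplesMap2> rr:foreignKeyMap [
    rr:key emp:c_ref_deptno;
    # Assumption: the parent is the Table2TriplesMap instance for the DEPT table.
    rr:parentTable2TriplesMap <#Table2TriplesMap1>;
    rr:joinCondition "{child.}deptno = {parent.}deptno"
] .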

A References

B Example of SQL based RDB2RDF Mapping using the Turtle Syntax

In this section we show an example relational schema and then map the example schema to RDF using the Turtle syntax.

B.1 Sample Relational Tables

We illustrate our example with the following three relational tables.

B.1.1 DEPT Table

Column Name   Column Datatype   Column Key Constraint
deptno        NUMBER            UNIQUE
dname         VARCHAR2(30)
loc           VARCHAR2(100)

B.1.2 EMP Table

Column Name   Column Datatype   Column Key Constraint
empno         NUMBER            PRIMARY KEY
ename         VARCHAR2(100)
job           VARCHAR2(30)
deptno        NUMBER            REFERENCES DEPT(deptno)
etype         VARCHAR2(30)

B.1.3 LIKES Table

Column Name   Column Datatype   Column Key Constraint
id            VARCHAR2(4000)
likeType      VARCHAR2(30)
likedObj      VARCHAR2(100)

B.2 Mapping Specification for the Tables

The tables described in the preceding section are mapped to RDF using the mapping language specified in this document. In addition to the prefix conventions specified in Section 1.2.1 Namespaces, we also use the following prefixes in our example:

Prefix   IRI
emp:     http://www.example.com/emp#
dept:    http://www.example.com/dept#
likes:   http://www.example.com/likes#

B.2.1 Mapping Specification for the DEPT Table

<#Table2TriplesMap1>
    a rr:Table2TriplesMap;
    rr:logicalTable """
       Select 'dept' || deptno AS deptId
            , deptno
            , dname
            , loc
         from dept
      """;
    rr:class xyz:dept;
    rr:tableGraphMap xyz:DeptGraph;
    rr:subjectMap [ a rr:bNodeMap; rr:column "deptId";
                    rr:inverseExpression "{alias.}deptno = substr({alias.}deptId,length('dept')+1)"];
    rr:propertyObjectMap [ a rr:LiteralMap; rr:datatype xsd:positiveInteger;
                           rr:property dept:deptno; rr:column "deptno" ];
    rr:propertyObjectMap [ rr:property dept:name; rr:column "dname" ];
    rr:propertyObjectMap [ rr:property dept:location; rr:column "loc" ];
    rr:propertyObjectMap [ rr:property dept:COMPANY; rr:constantValue "XYZ Corporation" ];
    .

B.2.2 Mapping Specification for the EMP Table

<#Table2TriplesMap2>
    a rr:Table2TriplesMap;
    rr:logicalTable """
       Select 'xyz.com/emp/' || empno AS empURI
            , empno
            , ename
            , 'xyz.com/emp/job/'|| job AS jobTypeURI
            , job
            , deptno
            , 'xyz.com/emp/etype/'|| etype AS empTypeURI
            , etype
            , 'xyz.com/graph/'|| job || '/' || etype AS graphURI
       from emp
       """;
    rr:class xyz:emp;
    rr:subjectMap [ a rr:IRIMap; rr:column "empURI" ];

    rr:rowGraphPropertyMap [ rr:column "graphURI"; a rr:IRIMap];

    rr:propertyObjectMap [ rr:property emp:jobtype; rr:column "jobTypeURI"; rr:isRDFType "true"^^xsd:boolean];
    rr:propertyObjectMap [ rr:property emp:emptype; rr:column "empTypeURI"; rr:isRDFType "true"^^xsd:boolean];

    rr:propertyObjectMap [ rr:property emp:empno; rr:column "empno" ];
    rr:propertyObjectMap [ rr:property emp:name; rr:column "ename"; rr:columnGraph [ rr:constantValue "emp:empNameGraph"; a rr:IRIMap]];
    rr:propertyObjectMap [ rr:property emp:job; rr:column "job" ];
    rr:propertyObjectMap [ rr:property emp:deptNum; rr:column "deptno" ];
    rr:propertyObjectMap [ rr:property emp:etype; rr:column "etype" ];
    rr:foreignKeyMap [
      rr:key emp:c_ref_deptno;
      rr:parentTable2TriplesMap <#Table2TriplesMap1>;
      rr:joinCondition "{child.}deptno = {parent.}deptno";
    ];
    .

B.2.3 Mapping Specification for the LIKES Table

<#Table2TriplesMap3>
    a rr:Table2TriplesMap;
    rr:logicalTable """
      Select id
           , likeType
           , likedObj
        from likes
      """;
    rr:class xyz:likes;
    rr:tableGraphMap xyz:LikesGraph;
    rr:subjectMap [ a rr:IRIMap; rr:column "id"];
    rr:computedPropertyMap [ rr:property likes:likeType; rr:column "likeType" ];
    rr:propertyObjectMap [ rr:property likes:likeType; rr:column "likedObj"; rr:isComputed "true"^^xsd:boolean];
    .