Re: R2RML draft - new introduction from Richard Cyganiak on 2010-10-13 (public-rdb2rdf-wg@w3.org from October 2010)

From: Richard Cyganiak <richard@cyganiak.de>
Date: Wed, 13 Oct 2010 12:01:27 +0100
To: ashok.malhotra@oracle.com
Cc: RDB2RDF WG <public-rdb2rdf-wg@w3.org>
Message-Id: <E0572E72-7B0A-400F-875A-17A22BF12A5D@cyganiak.de>

Ashok,

On 12 Oct 2010, at 23:59, ashok malhotra wrote:
> One question.  You say:
>
>> The input to an R2RML mapping is a relational database.
>
> Is it a relational database or a relational database schema?

Good question. I discussed this a bit with Michael this morning.

Definition: An RDB schema consists of the table *declarations*, but it  
does not include the actual *data* in the tables.

Definition: A relational database on the other hand consists of both  
an RDB schema, and data that populate the tables.

The input to an R2RML mapping has to include the actual data, because  
otherwise how could a transformed form of the data be part of the  
mapping's output? So the input is indeed a relational database.

On the other hand, an R2RML mapping is *specific* to an RDB schema.  
That is, it only works with an input database that conforms to a  
certain schema (contains certain tables and columns). Let's call that  
schema the “input schema” of the mapping. One could then say that the  
input to a mapping is any database that conforms to the input schema.  
In other words, the domain of an R2RML mapping is the set of all  
databases that conform to the mapping's input schema.
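
As a rough illustration of "conforms to the input schema", here is a minimal sketch in Python using SQLite. Nothing here comes from the R2RML draft: the `emp`/`dept` tables, the `INPUT_SCHEMA` dictionary, and the `conforms` helper are hypothetical names, and conformance is simplified to "the required tables and columns exist", ignoring datatypes and constraints.

```python
import sqlite3

# Hypothetical input schema: table name -> column names the mapping relies on.
INPUT_SCHEMA = {
    "emp": {"empno", "ename", "deptno"},
    "dept": {"deptno", "dname"},
}

def table_columns(conn, table):
    """Return the lower-cased column names of `table` (empty set if it is missing)."""
    rows = conn.execute("SELECT name FROM pragma_table_info(?)", (table,)).fetchall()
    return {name.lower() for (name,) in rows}

def conforms(conn, schema):
    """True if every required table exists and has every required column."""
    return all(cols <= table_columns(conn, table) for table, cols in schema.items())

def make_db(script):
    """Build an in-memory database (schema plus data) from a SQL script."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(script)
    return conn

DDL = """
CREATE TABLE emp (empno INTEGER, ename TEXT, deptno INTEGER);
CREATE TABLE dept (deptno INTEGER, dname TEXT);
"""

# Two databases with the same schema but different data: both are in the domain.
db_a = make_db(DDL + "INSERT INTO emp VALUES (1, 'Smith', 10);")
db_b = make_db(DDL + "INSERT INTO emp VALUES (2, 'Jones', 20);")

# A database lacking a required table and columns: outside the domain.
db_c = make_db("CREATE TABLE emp (empno INTEGER);")

print(conforms(db_a, INPUT_SCHEMA), conforms(db_b, INPUT_SCHEMA), conforms(db_c, INPUT_SCHEMA))
```

In this sketch the mapping's domain is every database for which `conforms` returns true, regardless of the rows it holds.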

I think the notion of an input schema is actually really valuable for  
writing the spec. For example, it allows us to say things like, “the  
SQL query in a TriplesMap MUST be a SELECT query that can be validly  
executed over the input schema.”
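
One way such a requirement could be checked mechanically is sketched below, again in Python with SQLite. This is only an assumption about how a tool might do it, not something the draft prescribes: the `emp`/`dept` declarations and the `is_valid_over_schema` helper are hypothetical, SQLite's SQL dialect stands in for whatever dialect the spec settles on, and the "is it a SELECT" test is a crude prefix check. The idea is that validity depends only on the schema, so the query is compiled against an empty database that contains just the table declarations.

```python
import sqlite3

# Hypothetical input schema, given as table declarations only (no data).
INPUT_SCHEMA_DDL = """
CREATE TABLE emp (empno INTEGER PRIMARY KEY, ename TEXT, deptno INTEGER);
CREATE TABLE dept (deptno INTEGER PRIMARY KEY, dname TEXT);
"""

def is_valid_over_schema(query, schema_ddl=INPUT_SCHEMA_DDL):
    """True if `query` is a SELECT that SQLite can compile against the schema."""
    # Crude check that the statement is a SELECT query.
    if not query.lstrip().upper().startswith("SELECT"):
        return False
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema_ddl)
        # EXPLAIN compiles the statement without running it; references to
        # unknown tables or columns raise an error at this point.
        conn.execute("EXPLAIN " + query)
        return True
    except sqlite3.Error:
        return False
    finally:
        conn.close()

print(is_valid_over_schema("SELECT ename FROM emp WHERE deptno = 10"))
print(is_valid_over_schema("SELECT salary FROM emp"))
```

The first query refers only to declared tables and columns; the second names a column, `salary`, that the hypothetical schema does not declare.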

Richard

Received on Wednesday, 13 October 2010 11:02:05 UTC