Minutes for the 2010-10-21 RDB2RDF meeting + semantics discussion

Hello guys,

here are the minutes for "RDB2RDF - Formal mapping - semantics", staged
at [1]. Sorry for all the @@, but I had trouble associating the voices
with the right people.

== Quick Summary ==

Here is a quick summary:
* importance of the 7 use-cases which should have their own section
* it's ok to have some examples covering several use-cases (if it's
* consensus around the SQL terminology instead of the Relational Algebra
one, because of the intended audience
* Marcelo and ?? proposed to update EricP's documents based on the
previous points. Should be done next week.
* discussion about what "semantics" means in the context of RDB2RDF, no
consensus. See below for more information.
* to answer the previous point, Marcelo will send an email with the
right information (a digitized book) and will give some context

== Formal Mapping ==

I did not scribe while I was speaking, at the end of the telcon. Here
are my notes about *what* is the mapping:

1. simple case: it's only a Default Mapping, so it's a function
2. difficult case: it's a function interpreting a rule language (R2ML),
so it's a function [ RDB2RDF : (RDB×R2ML) → RDF ]. In this context, the
Default Mapping is just a particular case where you use the empty value
as an inhabitant for R2ML.

In both cases, you need to start with the simple case, the Default
Mapping, then you can build something more complicated on top of it.
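To make the two cases concrete, here is a minimal sketch in Python. It
is only an illustration under my own assumptions: the encodings of RDB,
RDF and R2ML as Python values, the rule shape (rename a column's
predicate) and the names default_mapping and rdb2rdf are all
hypothetical, not something the group agreed on.

```python
from typing import Dict, List, Sequence, Set, Tuple

Triple = Tuple[str, str, str]
# Hypothetical encodings: a database is {table name: list of rows},
# a row is {column name: value}.
Row = Dict[str, object]
RDB = Dict[str, List[Row]]
RDF = Set[Triple]
# Hypothetical rule shape: (table, column, replacement predicate).
Rule = Tuple[str, str, str]


def default_mapping(rdb: RDB) -> RDF:
    """Simple case: one triple per non-NULL cell, no configuration."""
    triples: RDF = set()
    for table, rows in rdb.items():
        for index, row in enumerate(rows):
            subject = f"<{table}/row{index}>"
            for column, value in row.items():
                if value is not None:
                    triples.add((subject, f"<{table}#{column}>", repr(value)))
    return triples


def rdb2rdf(rdb: RDB, rules: Sequence[Rule] = ()) -> RDF:
    """Difficult case: RDB2RDF : (RDB x R2ML) -> RDF, built on the default."""
    # Each rule replaces the default predicate of one column.
    renamed = {f"<{table}#{column}>": pred for table, column, pred in rules}
    return {(s, renamed.get(p, p), o) for s, p, o in default_mapping(rdb)}


db = {"people": [{"id": 1, "name": "Alice"}]}
# With the empty rule set, the general function is the Default Mapping.
assert rdb2rdf(db) == default_mapping(db)
```

The last line is the point made above: the Default Mapping is the
particular case where the empty value inhabits R2ML.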

== Semantics ==

I'm looking forward to Marcelo's document about his definition. And I
hope he will also comment on what some others think.

<PatH> We could (not on IRC) draw this as a 'square' of functors which
we want to commute.

Here it is, in the case of the Default Mapping.

Typed multi-sets   -- ?? -→   Set of triples
      ↑                            ↑
RDB semantics                 RDF semantics
      |                            |
     RDB       -- mapping -→      RDF

I don't have a name for ??. We just have to prove that it is injective
(I'm not sure we want/need a bijection at that point). The cool term to
reuse during your next dinner is "semantics preservation". Please read
[2] if you don't understand the concept.
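A small sketch of why injectivity is not automatic when going from
multi-sets to sets. This is my own toy example, not something discussed
on the call: the two candidate arrows (content_keyed, occurrence_keyed)
and the row encoding are hypothetical, and values are assumed not to
contain "|" or "#".

```python
from collections import Counter
from typing import FrozenSet, Iterable, Tuple

Row = Tuple[str, ...]
Triple = Tuple[str, str, str]


def content_keyed(rows: Iterable[Row]) -> FrozenSet[Triple]:
    """Subject derived from the row content only: duplicate rows collapse."""
    return frozenset(
        (f"row:{'|'.join(row)}", f"col{i}", value)
        for row in rows
        for i, value in enumerate(row)
    )


def occurrence_keyed(rows: Iterable[Row]) -> FrozenSet[Triple]:
    """Subject also carries an occurrence number, so duplicates survive."""
    triples = set()
    for row, count in Counter(rows).items():
        for n in range(count):
            subject = f"row:{'|'.join(row)}#{n}"
            triples.update(
                (subject, f"col{i}", value) for i, value in enumerate(row)
            )
    return frozenset(triples)


once = [("Alice", "Paris")]
twice = [("Alice", "Paris"), ("Alice", "Paris")]

# Two different multi-sets, one image: this arrow is not injective.
assert content_keyed(once) == content_keyed(twice)
# Keeping the multiplicity in the subject separates them again.
assert occurrence_keyed(once) != occurrence_keyed(twice)
```

Whatever ?? ends up being, it has to behave like the second function on
duplicate rows, otherwise the square cannot preserve the RDB semantics.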

The "semantics of RDB2RDF" makes sense *only* relatively to the
formalism used to express it. That means we need a formalism which:
* can encode RDB *and* RDF [3]
* has a formal semantics
Then the semantics of "mapping" is just its interpretation in the
semantics of the formalism. "RDB2RDF" itself is *only* a definition.

I've also explained briefly why the semantics question was important.
One of the goals -- for the future -- of RDB2RDF is to allow people to
query an RDB dataset with SPARQL. That means you want the following:

Let's call SPARQL2SQL the higher-order function that maps a SPARQL query
to a SQL query: [ SPARQL2SQL : SPARQL → SQL ].

Let's call RDB2RDF the function that maps RDB to RDF: [ RDB2RDF : RDB →
RDF ].

Let "=" be the Graph Equivalence defined in [4].

∀ rdb-dataset ∈ RDB, ∀ sparql-query ∈ SPARQL,
(SPARQL2SQL(sparql-query))(rdb-dataset) =
sparql-query(RDB2RDF(rdb-dataset))

Or in plain English:

Given a SPARQL-to-SQL mapping to access an RDB dataset, the semantics of
the translated SPARQL query executed against this particular RDB dataset
should be equivalent to the same SPARQL query executed against the same
RDB dataset seen through the RDB2RDF mapping.

If you have that, you can say that RDB2RDF and SPARQL2SQL are consistent
with each other and that you don't lose any data.

No more, no less.
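
For those who prefer code, the property can be checked on a toy example
with Python and SQLite. Everything here is a hypothetical stand-in: a
"SPARQL query" is reduced to a predicate name (meaning "give me all ?s
?o for that predicate"), the table "people" is made up, and plain set
equality on the results stands in for the Graph Equivalence.

```python
import sqlite3


def rdb2rdf(conn):
    """Toy default mapping: one triple per non-NULL cell of "people"."""
    cur = conn.execute("SELECT id, name, city FROM people")
    columns = [d[0] for d in cur.description]
    triples = set()
    for row in cur:
        for column, value in zip(columns[1:], row[1:]):
            if value is not None:
                triples.add((f"people/{row[0]}", column, value))
    return triples


def run_on_rdf(predicate, triples):
    """Evaluate the toy query directly against the set of triples."""
    return {(s, o) for s, p, o in triples if p == predicate}


def sparql2sql(predicate):
    """Higher-order: returns a function running the translated SQL."""
    if predicate not in {"name", "city"}:  # the name is spliced into SQL
        raise ValueError(predicate)
    sql = (
        f"SELECT 'people/' || id, {predicate} FROM people "
        f"WHERE {predicate} IS NOT NULL"
    )
    return lambda conn: set(conn.execute(sql).fetchall())


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [(1, "Alice", "Paris"), (2, "Bob", None)],
)

# Both paths around the square must give the same answers.
for query in ("name", "city"):
    assert sparql2sql(query)(conn) == run_on_rdf(query, rdb2rdf(conn))
```

This only tests one dataset and two queries, of course; the statement
above quantifies over all of them, which is why we need a proof and not
a test.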


[1] http://www.w3.org/2010/10/21-rdb2rdf-minutes.html

Received on Thursday, 21 October 2010 20:32:04 UTC