Re: Minutes for the 2010-10-21 RDB2RDF meeting + semantics discussion from Alexandre Bertails on 2010-10-22 (public-rdb2rdf-wg@w3.org from October 2010)

From: Alexandre Bertails <bertails@w3.org>
Date: Fri, 22 Oct 2010 10:11:20 -0400
To: Marcelo Arenas <marcelo.arenas1@gmail.com>
Cc: RDB2RDF WG <public-rdb2rdf-wg@w3.org>
Message-ID: <1287756680.21135.1517.camel@simplet>

Thanks Marcelo,

you gave us some homework for the week-end :-)

Alexandre.

On Fri, 2010-10-22 at 10:47 -0300, Marcelo Arenas wrote:
> Dear All,
> 
> On Thu, Oct 21, 2010 at 5:32 PM, Alexandre Bertails <bertails@w3.org> wrote:
> > Hello guys,
> >
> > here are the minutes for "RDB2RDF - Formal mapping - semantics" staged
> > at [1]. Sorry for all the @@, but I had trouble matching the voices
> > to the right people.
> >
> > == Quick Summary ==
> >
> > Here is a quick summary:
> > * importance of the 7 use-cases which should have their own section
> > * it's OK to have some examples covering several use-cases (as long as
> > this is stated)
> > * consensus around using the SQL terminology instead of the Relational
> > Algebra one, because of the intended audience
> > * Marcelo and ?? proposed to update EricP's documents based on the
> > previous points. Should be done next week.
> > * discussion about what "semantics" means in the context of RDB2RDF, no
> > consensus. See below for more information.
> > * to answer the previous point, Marcelo will send an email with the
> > relevant information (a digitized book) and will give some context
> 
> In the conference call, I argued that we need a syntax and semantics
> for the mapping language, but we did not reach a consensus about
> whether the mapping language should have a semantics. To explain what
> I mean by the semantics of a mapping language, below I give some
> information about how the problem of data exchange (or data
> translation) is usually formalized in the database context.
> 
> In the relational database context, the data exchange (or data
> translation) problem is usually formalized as follows. You are given a
> source relational schema S, a target relational schema T (T could
> consist of the table Triple for storing RDF triples), and a mapping M
> that specifies how to translate data from the source into the target
> [1]. The problem is then to take data structured under the source
> schema S and create an instance of the target schema T according to
> the conditions specified by M. An important issue in this setting is
> to define a mapping language for expressing mappings like M, which
> means defining the syntax and semantics of this mapping language:
> 
> - The syntax of the mapping language is usually defined by considering
> a syntactic restriction of first-order logic, like source-to-target
> tuple-generating dependencies (see [1] for the formal definition of
> these dependencies, which are widely used in this area).
> 
> - The semantics of the mapping language refers to the following
> problem: Given a source instance I, a target instance J and a mapping
> M, is J a valid translation of I according to M? If M is specified by
> using a set F of first-order logic sentences, then the semantics of
> the mapping language is given in terms of the semantics for
> first-order logic: J is a valid translation of I under M if and only
> if (I,J) satisfies F in the usual first-order logic sense (all these
> ideas are formalized in [1]).
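
A minimal sketch of the satisfaction check described in the second
point above, assuming M is a set of source-to-target tuple-generating
dependencies. The encoding is hypothetical and chosen only for
illustration: an instance is a dict from relation name to a set of
rows, an atom is a (relation, terms) pair, a term starting with "?" is
a variable, and the Emp relation and the "name"/"dept" predicates are
invented examples.

```python
def is_var(term):
    """Variables are strings starting with '?'; anything else is a constant."""
    return isinstance(term, str) and term.startswith("?")


def matches(atoms, instance, binding=None):
    """Yield every extension of `binding` that maps all `atoms` into `instance`."""
    binding = dict(binding or {})
    if not atoms:
        yield binding
        return
    (rel, terms), rest = atoms[0], atoms[1:]
    for row in instance.get(rel, set()):
        if len(row) != len(terms):
            continue
        new = dict(binding)
        ok = True
        for term, value in zip(terms, row):
            if is_var(term):
                # Bind the variable, or check it agrees with an earlier binding.
                if new.setdefault(term, value) != value:
                    ok = False
                    break
            elif term != value:
                ok = False
                break
        if ok:
            yield from matches(rest, instance, new)


def satisfies(I, J, tgds):
    """True iff (I, J) satisfies every dependency body -> exists head."""
    for body, head in tgds:
        for b in matches(body, I):
            # Head variables not bound by the body are existential.
            if next(matches(head, J, b), None) is None:
                return False
    return True


# Hypothetical mapping: every employee gets a name triple and *some* dept.
tgds = [
    ([("Emp", ("?id", "?name"))], [("Triple", ("?id", "name", "?name"))]),
    ([("Emp", ("?id", "?name"))], [("Triple", ("?id", "dept", "?d"))]),
]
I = {"Emp": {(1, "Ann")}}
J_good = {"Triple": {(1, "name", "Ann"), (1, "dept", "_:b0")}}
J_other = {"Triple": {(1, "name", "Ann"), (1, "dept", "Sales")}}
J_bad = {"Triple": {(1, "name", "Ann")}}

print(satisfies(I, J_good, tgds), satisfies(I, J_other, tgds),
      satisfies(I, J_bad, tgds))
```

Both J_good and J_other are valid translations of I here, since the
second dependency only requires that some dept value exists.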
> 
> It is important to note that in the above setting, there may exist
> several possible translations for the same source instance (as M
> could, for example, create new values in the target), so one has to
> formally define which target instance reflects the source data as
> accurately as possible. Once you have done that, you can consider the
> mapping M as a function that maps each source instance I to the
> "best" translation of I according to M (this "best" solution is
> usually the "canonical universal solution" or the "core of the
> canonical universal solution", which are formally defined in [1,2]).
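
A rough sketch of how such a preferred translation can be computed,
assuming source-to-target tuple-generating dependencies and a naive
chase that fires each rule once per body match and invents a fresh
labelled null for every existential variable. This is an illustration
under those assumptions, not the exact procedure of [1,2]; the
encoding (dicts of row sets, "?"-prefixed variables, "_:n" nulls) and
the Emp example are hypothetical.

```python
from itertools import count


def is_var(term):
    """Variables are strings starting with '?'; anything else is a constant."""
    return isinstance(term, str) and term.startswith("?")


def matches(atoms, instance, binding=None):
    """Yield every extension of `binding` that maps all `atoms` into `instance`."""
    binding = dict(binding or {})
    if not atoms:
        yield binding
        return
    (rel, terms), rest = atoms[0], atoms[1:]
    for row in instance.get(rel, set()):
        if len(row) != len(terms):
            continue
        new = dict(binding)
        if all(
            (new.setdefault(t, v) == v) if is_var(t) else (t == v)
            for t, v in zip(terms, row)
        ):
            yield from matches(rest, instance, new)


def chase(I, tgds):
    """Build a target instance by firing every rule on every body match."""
    J = {}
    fresh = count()
    for body, head in tgds:
        for b in matches(body, I):
            env = dict(b)
            for rel, terms in head:
                row = []
                for t in terms:
                    if is_var(t) and t not in env:
                        # Existential variable: invent a fresh labelled null.
                        env[t] = f"_:n{next(fresh)}"
                    row.append(env[t] if is_var(t) else t)
                J.setdefault(rel, set()).add(tuple(row))
    return J


# Hypothetical mapping: every employee gets a name triple and *some* dept.
tgds = [
    ([("Emp", ("?id", "?name"))], [("Triple", ("?id", "name", "?name"))]),
    ([("Emp", ("?id", "?name"))], [("Triple", ("?id", "dept", "?d"))]),
]
I = {"Emp": {(1, "Ann"), (2, "Bob")}}
J = chase(I, tgds)
for triple in sorted(J["Triple"], key=repr):
    print(triple)
```

The nulls stand for unknown values; any other valid translation can be
obtained from this one by replacing nulls with concrete values and
adding rows.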
> 
> A survey about the tools developed at IBM by following the above
> approach can be found in [3] (references [1,2,3] can be downloaded
> from http://www.almaden.ibm.com/cs/people/fagin). In [4], the author
> shows how the data exchange problem is formalized in logical terms,
> and what some of the important issues in this area are (this survey
> can be downloaded from
> http://www.sigmod.org/publications/sigmod-record/0903/index.html).
> Finally, there are also two short books where you can find information
> about the above approach. [5] gives a fairly complete picture of the
> main issues in data exchange, including the case of XML data (it
> should be noted that the above approach is also applicable to other
> data models like XML and RDF). [6] shows how some rule languages (like
> non-recursive Datalog with equality and safe negation, and some of its
> extensions) have been used in data integration/exchange. These two
> short books are available electronically in many libraries.
> 
> All the best,
> 
> Marcelo
> 
> 
> [1] R. Fagin, P. G. Kolaitis, R. J. Miller, L. Popa: Data exchange:
> semantics and query answering. Theor. Comput. Sci. 336(1): 89-124,
> 2005.
> 
> [2] R. Fagin, P. G. Kolaitis, L. Popa: Data exchange: getting to the
> core. ACM Trans. Database Syst. 30(1): 174-210, 2005.
> 
> [3] R. Fagin, L. M. Haas, M. A. Hernández, R. J. Miller, L. Popa, Y.
> Velegrakis: Clio: Schema Mapping Creation and Data Exchange.
> Conceptual Modeling: Foundations and Applications 2009: 198-236.
> 
> [4] P. Barcelo: Logical foundations of relational data exchange.
> SIGMOD Record 38(1): 49-58, 2009.
> 
> [5] M. Arenas, P. Barcelo, L. Libkin, F. Murlak: Relational and XML
> Data Exchange. Morgan & Claypool Publishers, 2010.
> 
> [6] M. Genesereth: Data Integration: The Relational Logic Approach.
> Morgan & Claypool Publishers, 2010.
>
Received on Friday, 22 October 2010 14:11:21 UTC