- From: Marcelo Arenas <marcelo.arenas1@gmail.com>
- Date: Fri, 22 Oct 2010 10:47:42 -0300
- To: Alexandre Bertails <bertails@w3.org>
- Cc: RDB2RDF WG <public-rdb2rdf-wg@w3.org>
Dear All,

On Thu, Oct 21, 2010 at 5:32 PM, Alexandre Bertails <bertails@w3.org> wrote:
> Hello guys,
>
> here are the minutes for "RDB2RDF - Formal mapping - semantics" staged
> at [1]. Sorry for all the @@ but I had troubles to associate the voices
> with the right people.
>
> == Quick Summary ==
>
> Here is a quick summary:
> * importance of the 7 use-cases which should have their own section
> * it's ok to have some examples covering several use-cases (if it's
>   said)
> * consensus around the SQL terminology instead of the Relational Algebra
>   one, because of the intended public
> * Marcelo and ?? proposed to update EricP's documents based on the
>   previous points. Should be done next week.
> * discussion about what "semantics" means in the context of RDB2RDF, no
>   consensus. See below for more information.
> * to answer the previous point, Marcello will send an email with the
>   right informations (a digitalized book) and will give some context

In the conference call, I argued that we need both a syntax and a semantics for the mapping language, but we did not reach a consensus about whether the mapping language should have a semantics. To explain what I mean by the semantics of a mapping language, below I give some information about how the problem of data exchange (or data translation) is usually formalized in the database context.

In the relational setting, the problem is usually formalized as follows. You are given a source relational schema S, a target relational schema T (T could consist of a single table Triple for storing RDF triples), and a mapping M that specifies how to translate data from the source into the target [1]. The problem is then to take data structured under the source schema S and create an instance of the target schema T according to the conditions specified by M.
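As a minimal sketch of this setting, assume a hypothetical source schema S with a single table Emp(id, name) and a target schema T with the single table Triple(s, p, o). The table name Emp, the function translate, and the ex: identifiers are illustrative assumptions of mine and do not come from any WG document.

```python
# Hypothetical source instance I over S = {Emp(id, name)}.
source_instance = {"Emp": {(1, "Alice"), (2, "Bob")}}


def translate(instance):
    """One possible mapping M: each Emp row yields one row of Triple(s, p, o)."""
    triples = set()
    for emp_id, name in instance["Emp"]:
        # The ex: prefix is a placeholder, not a real vocabulary.
        triples.add((f"ex:emp/{emp_id}", "ex:name", name))
    return {"Triple": triples}


# Target instance J over T = {Triple(s, p, o)}.
target_instance = translate(source_instance)
```

Here M happens to be a function, so each source instance has exactly one translation. That is not true for every mapping.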
An important issue in this setting is to define a mapping language for expressing mappings like M, which means defining both the syntax and the semantics of this language:

- The syntax of the mapping language is usually defined by considering a syntactic restriction of first-order logic, like source-to-target tuple-generating dependencies (see [1] for the formal definition of these dependencies, which are widely used in this area).

- The semantics of the mapping language refers to the following problem: given a source instance I, a target instance J and a mapping M, is J a valid translation of I according to M? If M is specified by a set F of first-order logic sentences, then the semantics of the mapping language is given in terms of the semantics of first-order logic: J is a valid translation of I under M if and only if (I,J) satisfies F in the usual first-order logic sense (all these ideas are formalized in [1]).

It is important to notice that in the above setting there may exist several possible translations for the same source instance (as M could, for example, create new values in the target), so one has to formally define which target instance reflects the source data as accurately as possible. Once you have done that, you can consider the mapping M as a function that maps each source instance I to the "best" translation of I according to M (this "best" solution is usually the "canonical universal solution" or the "core of the canonical universal solution", which are formally defined in [1,2]).

A survey of the tools developed at IBM following the above approach can be found in [3] (references [1,2,3] can be downloaded from http://www.almaden.ibm.com/cs/people/fagin). In [4], the author shows how the data exchange problem is formalized in logical terms, and what some of the important issues in this area are (this survey can be downloaded from http://www.sigmod.org/publications/sigmod-record/0903/index.html).
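To make the semantics described above concrete, here is a minimal sketch for one hypothetical source-to-target dependency, Emp(e, n) -> EXISTS d . Triple(e, "ex:dept", d). The dependency, the table names and the function names are assumptions chosen for illustration. The function satisfies checks whether (I,J) satisfies the dependency, and canonical_solution follows the idea of the canonical universal solution of [1] by inventing one fresh labeled null per source row.

```python
from itertools import count

# Hypothetical dependency:  Emp(e, n) -> EXISTS d . Triple(e, "ex:dept", d)
# The department d is not stored in the source, so a translation must
# supply some value for it, which is why several translations are valid.


def satisfies(source, target):
    """True iff every Emp row has a witnessing (e, "ex:dept", _) triple."""
    return all(
        any(s == e and p == "ex:dept" for (s, p, _o) in target["Triple"])
        for (e, _n) in source["Emp"]
    )


def canonical_solution(source):
    """Build a target instance with one fresh labeled null per Emp row."""
    fresh = count(1)
    triples = {
        (e, "ex:dept", f"_:N{next(fresh)}")
        for (e, _n) in sorted(source["Emp"])
    }
    return {"Triple": triples}


I = {"Emp": {("e1", "Alice"), ("e2", "Bob")}}
J1 = {"Triple": {("e1", "ex:dept", "sales"), ("e2", "ex:dept", "hr")}}
J2 = canonical_solution(I)                      # uses nulls _:N1, _:N2
J3 = {"Triple": {("e1", "ex:dept", "sales")}}   # no triple for e2

print(satisfies(I, J1), satisfies(I, J2), satisfies(I, J3))
# prints: True True False
```

Both J1 and J2 are valid translations of I, but J1 commits to departments that the source never mentions, while J2 only records that some department exists. This is the sense in which a solution built from labeled nulls reflects the source data more faithfully.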
Finally, there are also two short books where you can find information about the above approach. [5] gives a fairly complete picture of the main issues in data exchange, including the case of XML data (note that the above approach is also applicable to other data models like XML and RDF). [6] shows how some rule languages (like non-recursive Datalog with equality and safe negation, and some of its extensions) have been used in data integration and exchange. These two short books are available electronically in many libraries.

All the best,

Marcelo

[1] R. Fagin, P. G. Kolaitis, R. J. Miller, L. Popa: Data exchange: semantics and query answering. Theor. Comput. Sci. 336(1): 89-124, 2005.

[2] R. Fagin, P. G. Kolaitis, L. Popa: Data exchange: getting to the core. ACM Trans. Database Syst. 30(1): 174-210, 2005.

[3] R. Fagin, L. M. Haas, M. A. Hernández, R. J. Miller, L. Popa, Y. Velegrakis: Clio: Schema Mapping Creation and Data Exchange. Conceptual Modeling: Foundations and Applications 2009: 198-236.

[4] P. Barcelo: Logical foundations of relational data exchange. SIGMOD Record 38(1): 49-58, 2009.

[5] M. Arenas, P. Barcelo, L. Libkin, F. Murlak: Relational and XML Data Exchange. Morgan & Claypool Publishers, 2010.

[6] M. Genesereth: Data Integration: The Relational Logic Approach. Morgan & Claypool Publishers, 2010.
Received on Friday, 22 October 2010 13:48:14 UTC