- From: Alexandre Bertails <bertails@w3.org>
- Date: Sun, 21 Nov 2010 12:17:35 -0500
- To: Juan Sequeda <juanfederico@gmail.com>
- Cc: Eric Prud'hommeaux <eric@w3.org>, Ivan Herman <ivan@w3.org>, "public-rdb2rdf-wg@w3.org" <public-rdb2rdf-wg@w3.org>
On Sun, 2010-11-21 at 09:45 -0600, Juan Sequeda wrote: > On Sun, Nov 21, 2010 at 9:31 AM, Alexandre Bertails <bertails@w3.org> > wrote: > [ snip ] > > > The relation between RDB, RDF, SPARQL and SQL should be > formalized and > > > should not rely only on examples. It's called semantics > preservation > > > -- a well-known problem in the compilation field -- and > IMO it should > > > be in a normative section. > > > > > > Eirc and I are totally confident that in the case of the > direct > > > mapping, we can map any SPARQL query to its equivalent SQL > query. > > > > Where is the proof? > > > heh :-) > > I think we first need to have a definitive version of RDB + > Direct Mapping. > > > I don't think the proof doesn't depend on the "definitive version". > Essentially there is a concept of direct mapping a RDB to RDF. The > proof must work, regardless the way you represent it (sets, rules). We don't use sets but more elaborate *data structures* like lists, multisets. The Direct Mapping is not a set semantics, it's an algebra based on two main data models (RDB and RDF) and a function to bind them. We have motivated this choice by several reasons: * real-world relational databases diverged a long time ago from the relational algebra. It's a nice theoretical framework but it does not reflect the reality (I'm the first one being sad about it) * our entry point is RDB. We *need* to access the RDB structure, the table names, the column names, the SQL types, etc. The relational algebra does not give you any information about that,so does not Datalog unless you encode this information in a relation but in that case, you have to define the function doing so. * we also need access to the data itself. The relational algebra and Datalog work on top on sets. We *absolutely need* multisets. Despite my repeted questions on that subject, I still don't know how you intend to solve that problem. > The proof is actually simple if you consider the direct mapping rules. > Angles and Gutierrez proved that "SPARQL and non-recursive Datalog > with negation have the equivalent recursive power, and hence, by > classical results, SPARQL is equivalent from an expressive point of > view to Relational Algebra". By representing the mapping in datalog > rules, one can represent SPARQL as a datalog query, and then use the > mapping rules to rewrite the query to relational algebra. Well, I guess we should ask the SPARQL folks about this specific statement :-) If this is true, I suggest that Angles, Gutierrez and you start a specification at W3C so the whole community could benefit from such a framework. At least, in rdb2rdf, we don't have to say a word about that. We just have to be clean by following the definition of RDF. > Therefore, thanks to the Angles and Gutierrez result, we don't have to > worry about the SPARQL to SQL issue, from the direct mapping > perspective. I believe only what I see. I've worked in the past on proving stuff that was true on 20 years old papers but could not resist to the formal proof. That's why I now declare victory only when I see a full embedding on the problem, its solution and its proof in a proof checker like Coq [1]. I need at least an implementation if not a constructive proof [2]. > For R2RML, if that language has more expressivity than > datalog, then we might be in trouble. I thought that was somewhat the plan. Did I miss anything on that subject? Alexandre. [1] http://coq.inria.fr/ [2] http://en.wikipedia.org/wiki/Constructive_proof
Received on Sunday, 21 November 2010 17:17:43 UTC