- From: Harry Halpin <hhalpin@w3.org>
- Date: Sun, 21 Nov 2010 21:52:32 -0000 (GMT)
- To: "Juan Sequeda" <juanfederico@gmail.com>
- Cc: "public-rdb2rdf-wg@w3.org" <public-rdb2rdf-wg@w3.org>
Just quick note, I'm glad to see this conversation evoked such passion. Note that Soeren Auer, who is hopefully still reading these e-mails, also put forward SPARQL to SQL translation very early on in the life of this WG as a way of making this work. At the time, the WG in general thought that while doing that would be nice, transforming SPARQL into some kind of optimal SQL would like depend on vendor-specific SQL functionality, so the idea was that both transformation of SQL to SPARQL and SPARQL to SQL should remain a point of market competition, and that such transformations could obviously be built on top of a more basic relational data to RDF transform as specified by R2RML. Even if we could spec SPARQL-to-SQL out, one has to rely on some idealization of SQL, and then, the transform it's pretty trivial for reasons already stated - i.e. well-known equivalnce between relational algebra and FOL (datalog). Given that at least we have a publically available for spec for SPARQL, it's a bit easier on the SPARQL side actually. So feel free to write up some formalization and put it in a proof-checker, and then ship the results to an academic database conference (or in your product). Given the state of our schedule (i.e. behind) and given this entire "reverse" mapping conversation came up from a not-too-serious off-the-cuff remark, I'd suggest that we stick to charter and not have this on the agenda. If people do write up formalizations and manage to get them published at academic database conferences, feel free to post them to the list. What *would* be useful at this point would be a strategy for making sure R2RML covers the requirements in the charter, and an pragmatic strategy for getting there as regards both semantics and test-cases. > On Sun, Nov 21, 2010 at 11:17 AM, Alexandre Bertails > <bertails@w3.org>wrote: > >> On Sun, 2010-11-21 at 09:45 -0600, Juan Sequeda wrote: >> > On Sun, Nov 21, 2010 at 9:31 AM, Alexandre Bertails <bertails@w3.org> >> > wrote: >> > [ snip ] >> > > > The relation between RDB, RDF, SPARQL and SQL should be >> > formalized and >> > > > should not rely only on examples. It's called semantics >> > preservation >> > > > -- a well-known problem in the compilation field -- and >> > IMO it should >> > > > be in a normative section. >> > > > >> > > > Eirc and I are totally confident that in the case of the >> > direct >> > > > mapping, we can map any SPARQL query to its equivalent SQL >> > query. >> > > >> > > Where is the proof? >> > >> > >> > heh :-) >> > >> > I think we first need to have a definitive version of RDB + >> > Direct Mapping. >> > >> > >> > I don't think the proof doesn't depend on the "definitive version". >> > Essentially there is a concept of direct mapping a RDB to RDF. The >> > proof must work, regardless the way you represent it (sets, rules). >> >> We don't use sets but more elaborate *data structures* like lists, >> multisets. The Direct Mapping is not a set semantics, it's an algebra >> based on two main data models (RDB and RDF) and a function to bind >> them. >> >> We have motivated this choice by several reasons: >> * real-world relational databases diverged a long time ago from the >> relational algebra. It's a nice theoretical framework but it does >> not reflect the reality (I'm the first one being sad about it) >> * our entry point is RDB. We *need* to access the RDB structure, >> the table names, the column names, the SQL types, etc. The >> relational algebra does not give you any information about that,so >> does not Datalog unless you encode this information in a relation >> but in that case, you have to define the function doing so. >> * we also need access to the data itself. The relational algebra and >> Datalog work on top on sets. We *absolutely need* multisets. Despite >> my repeted questions on that subject, I still don't know how you >> intend to solve that problem. >> >> > The proof is actually simple if you consider the direct mapping rules. >> > Angles and Gutierrez proved that "SPARQL and non-recursive Datalog >> > with negation have the equivalent recursive power, and hence, by >> > classical results, SPARQL is equivalent from an expressive point of >> > view to Relational Algebra". By representing the mapping in datalog >> > rules, one can represent SPARQL as a datalog query, and then use the >> > mapping rules to rewrite the query to relational algebra. >> >> Well, I guess we should ask the SPARQL folks about this specific >> statement :-) If this is true, I suggest that Angles, Gutierrez and >> you start a specification at W3C so the whole community could benefit >> from such a framework. > > >> At least, in rdb2rdf, we don't have to say a word about that. We just >> have to be clean by following the definition of RDF. >> >> > Therefore, thanks to the Angles and Gutierrez result, we don't have to >> > worry about the SPARQL to SQL issue, from the direct mapping >> > perspective. >> >> I believe only what I see. I've worked in the past on proving stuff >> that was true on 20 years old papers but could not resist to the >> formal proof. >> >> That's why I now declare victory only when I see a full embedding on >> the problem, its solution and its proof in a proof checker like >> Coq [1]. >> >> > I'll be publishing the proof done in ACL2 (theorem prover) in a few weeks. > > >> I need at least an implementation if not a constructive proof [2]. >> >> > For R2RML, if that language has more expressivity than >> > datalog, then we might be in trouble. >> >> I thought that was somewhat the plan. Did I miss anything on that >> subject? >> > > I guess so... but what is the guarantee that a SPARQL query can be > translated to SQL? > > >> >> Alexandre. >> >> [1] http://coq.inria.fr/ >> [2] http://en.wikipedia.org/wiki/Constructive_proof >> >> >> >
Received on Sunday, 21 November 2010 21:52:34 UTC