Re: Reverse Mapping RDF2RDF from Juan Sequeda on 2010-11-21 (public-rdb2rdf-wg@w3.org from November 2010)

From: Juan Sequeda <juanfederico@gmail.com>
Date: Sun, 21 Nov 2010 16:01:02 -0600
To: Harry Halpin <hhalpin@w3.org>
Cc: "public-rdb2rdf-wg@w3.org" <public-rdb2rdf-wg@w3.org>
Message-ID: <AANLkTimrOd8M2RgJ-KMj0_gacHdA9XosAD+Xyz-omFAa@mail.gmail.com>
+1

Juan Sequeda
+1-575-SEQ-UEDA
www.juansequeda.com


On Sun, Nov 21, 2010 at 3:52 PM, Harry Halpin <hhalpin@w3.org> wrote:

> Just quick note,
>
>   I'm glad to see this conversation evoked such passion. Note that Soeren
> Auer, who is hopefully still reading these e-mails, also put forward
> SPARQL to SQL translation very early on in the life of this WG as a way
> of making this work. At the time, the WG in general thought that while
> doing that would be nice, transforming SPARQL into some kind of optimal
> SQL would like depend on vendor-specific SQL functionality, so the idea
> was that both transformation of SQL to SPARQL and SPARQL to SQL should
> remain a point of market competition, and that such transformations
> could obviously be built on top of a more basic relational data to RDF
> transform as specified by R2RML.
>
> Even if we could spec SPARQL-to-SQL out, one has to rely on some
> idealization of SQL, and then, the transform it's pretty trivial for
> reasons already stated - i.e. well-known equivalnce between relational
> algebra and FOL (datalog). Given that at least we have a publically
> available for spec for SPARQL, it's a bit easier on the SPARQL side
> actually. So feel free to write up some formalization and put it in a
> proof-checker, and then ship the results to an academic database
> conference (or in your product).
>
> Given the state of our schedule (i.e. behind) and given this entire
> "reverse" mapping conversation came up from a not-too-serious off-the-cuff
> remark, I'd suggest that we stick to charter and not have this on the
> agenda. If people do write up formalizations and manage to get them
> published at academic database conferences, feel free to post them to the
> list.
>
> What *would* be useful at this point would be a strategy for making sure
> R2RML covers the requirements in the charter, and an pragmatic strategy
> for getting there as regards both semantics and test-cases.
>
> > On Sun, Nov 21, 2010 at 11:17 AM, Alexandre Bertails
> > <bertails@w3.org>wrote:
> >
> >> On Sun, 2010-11-21 at 09:45 -0600, Juan Sequeda wrote:
> >> > On Sun, Nov 21, 2010 at 9:31 AM, Alexandre Bertails <bertails@w3.org>
> >> > wrote:
> >> >         [ snip ]
> >> >         > > The relation between RDB, RDF, SPARQL and SQL should be
> >> >         formalized and
> >> >         > > should not rely only on examples. It's called semantics
> >> >         preservation
> >> >         > > -- a well-known problem in the compilation field -- and
> >> >         IMO it should
> >> >         > > be in a normative section.
> >> >         > >
> >> >         > > Eirc and I are totally confident that in the case of the
> >> >         direct
> >> >         > > mapping, we can map any SPARQL query to its equivalent SQL
> >> >         query.
> >> >         >
> >> >         > Where is the proof?
> >> >
> >> >
> >> >         heh :-)
> >> >
> >> >         I think we first need to have a definitive version of RDB +
> >> >         Direct Mapping.
> >> >
> >> >
> >> > I don't think the proof doesn't depend on the "definitive version".
> >> > Essentially there is a concept of direct mapping a RDB to RDF. The
> >> > proof must work, regardless the way you represent it (sets, rules).
> >>
> >> We don't use sets but more elaborate *data structures* like lists,
> >> multisets. The Direct Mapping is not a set semantics, it's an algebra
> >> based on two main data models (RDB and RDF) and a function to bind
> >> them.
> >>
> >> We have motivated this choice by several reasons:
> >> * real-world relational databases diverged a long time ago from the
> >>  relational algebra. It's a nice theoretical framework but it does
> >>  not reflect the reality (I'm the first one being sad about it)
> >> * our entry point is RDB. We *need* to access the RDB structure,
> >>  the table names, the column names, the SQL types, etc. The
> >>  relational algebra does not give you any information about that,so
> >>  does not Datalog unless you encode this information in a relation
> >>  but in that case, you have to define the function doing so.
> >> * we also need access to the data itself. The relational algebra and
> >>  Datalog work on top on sets. We *absolutely need* multisets. Despite
> >>  my repeted questions on that subject, I still don't know how you
> >>  intend to solve that problem.
> >>
> >> > The proof is actually simple if you consider the direct mapping rules.
> >> > Angles and Gutierrez proved that "SPARQL and non-recursive Datalog
> >> > with negation have the equivalent recursive power, and hence, by
> >> > classical results, SPARQL is equivalent from an expressive point of
> >> > view to Relational Algebra". By representing the mapping in datalog
> >> > rules, one can represent SPARQL as a datalog query, and then use the
> >> > mapping rules to rewrite the query to relational algebra.
> >>
> >> Well, I guess we should ask the SPARQL folks about this specific
> >> statement :-) If this is true, I suggest that Angles, Gutierrez and
> >> you start a specification at W3C so the whole community could benefit
> >> from such a framework.
> >
> >
> >> At least, in rdb2rdf, we don't have to say a word about that. We just
> >> have to be clean by following the definition of RDF.
> >>
> >> > Therefore, thanks to the Angles and Gutierrez result, we don't have to
> >> > worry about the SPARQL to SQL issue, from the direct mapping
> >> > perspective.
> >>
> >> I believe only what I see. I've worked in the past on proving stuff
> >> that was true on 20 years old papers but could not resist to the
> >> formal proof.
> >>
> >> That's why I now declare victory only when I see a full embedding on
> >> the problem, its solution and its proof in a proof checker like
> >> Coq [1].
> >>
> >>
> > I'll be publishing the proof done in ACL2 (theorem prover) in a few
> weeks.
> >
> >
> >> I need at least an implementation if not a constructive proof [2].
> >>
> >> > For R2RML, if that language has more expressivity than
> >> > datalog, then we might be in trouble.
> >>
> >> I thought that was somewhat the plan. Did I miss anything on that
> >> subject?
> >>
> >
> > I guess so... but what is the guarantee that a SPARQL query can be
> > translated to SQL?
> >
> >
> >>
> >> Alexandre.
> >>
> >> [1] http://coq.inria.fr/
> >> [2] http://en.wikipedia.org/wiki/Constructive_proof
> >>
> >>
> >>
> >
>
>
Received on Sunday, 21 November 2010 22:02:01 UTC