Re: On semantics-based approaches and still using full vendor-specific SQL from Juan Sequeda on 2010-07-23 (public-rdb2rdf-wg@w3.org from July 2010)

From: Juan Sequeda <juanfederico@gmail.com>
Date: Fri, 23 Jul 2010 13:55:48 -0500
To: Harry Halpin <hhalpin@w3.org>
Cc: public-rdb2rdf-wg@w3.org
Message-ID: <AANLkTikwRqy9N5vQM2SZ68yNpcbA3Z30Tz6phD1+TunH@mail.gmail.com>
Harry,

This is a very nice summary. We are coming to a consensus.

Comments in-line


On Fri, Jul 23, 2010 at 1:23 PM, Harry Halpin <hhalpin@w3.org> wrote:

>
> I spent most of the day with the Database Research Group here at Edinburgh,
> who kindly managed to read most of the proposals on the table. So, I'm
> going to try to channel the results of the discussion to the group.
>
> One way is to use a purely SQL-based approach (which I hope Souri will
> present the week after this one) that allows the mapping to be done as a
> view (that is isomorphic to the triples) using the full expressivity of
> SQL. Then a very simple mapping construct can map the results of this SQL
> to a graph, i.e. by generating URIs.
>
> Another way is a purely SQL-based approach, but then expect the mapping
> language to provide a few easy-to-use basic constructs besides just
> generating URIs in order to do common tasks, i.e. create new nodes etc. I
> think this is the approach that Marcelo and Juan have been advocating for.
>
> Now, I think these two approaches are compatible, as long as the few
> easy-to-use basic constructs can be limited to a sensible amount that can
> be translated into SQL and they do *not* preclude using full-vendor
> specific SQL to create the mapping as well, i.e. in a view.  This makes
> sense, as SQL itself can be viewed using Datalog semantics.
>

I wouldn't say that these two approaches are compatible... they are the same
semantically!

Souri's SQL approach is basically

1) Define a SQL Query
2) the relation itself gets mapped to a Class
3) each attribute is mapped to a property

This is exactly what we have written down in datalog, which establishes the
semantics for this.
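
A minimal sketch of the three steps above, using Python's built-in sqlite3 module. The base URI, the `emp` table, the `Employee` view name, the URI patterns, and the choice to skip NULL values are all assumptions made for illustration; none of them come from Souri's proposal or from the Datalog write-up.

```python
import sqlite3

BASE = "http://example.org/"  # hypothetical base URI, not from the thread
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def view_to_triples(conn, view_name, key_column, query):
    """Map the result of a SQL query to (subject, predicate, object) triples."""
    cursor = conn.execute(query)
    columns = [d[0] for d in cursor.description]
    key_index = columns.index(key_column)
    class_uri = BASE + view_name
    triples = []
    for row in cursor:
        subject = f"{BASE}{view_name}/{row[key_index]}"
        # Step 2: the relation itself is mapped to a class.
        triples.append((subject, RDF_TYPE, class_uri))
        # Step 3: each attribute is mapped to a property.
        for column, value in zip(columns, row):
            if value is not None:  # assumption: a NULL yields no triple
                triples.append((subject, f"{BASE}{view_name}#{column}", value))
    return triples


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (id INTEGER PRIMARY KEY, name TEXT, dept TEXT)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?)",
    [(1, "Ada", "R&D"), (2, "Bob", None)],
)

# Step 1: define a SQL query (any vendor-specific SQL could go here).
triples = view_to_triples(
    conn, "Employee", "id", "SELECT id, name, dept FROM emp ORDER BY id"
)
for t in triples:
    print(t)
```

The query string is opaque to the mapping function, which is the point of the approach: whatever SQL produces the relation, the step from relation to triples stays the same.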



>
> Furthermore, for people who are SQL wizards, these basic constructs may
> not be necessary, but some people (particularly people from an RDF
> background) may find them easier to use than doing everything in pure SQL.
> So, Marcelo and Juan's approach does not necessarily limit the
> expressivity of SQL as long as it does not preclude creating a view using
> full vendor-specific SQL before some basic mapping functions are called.
>
> Lastly, the differences between Eric's RIF-based approach and the Datalog
> approach are negligible in practice, as RIF is essentially also based on
> Datalog semantics, i.e. RIF *is* a syntax for Datalog (which does not
> have its own syntax) plus some bells and whistles for extensibility.  The
> argument between using Datalog or a set-theoretic semantics for mapping is
> not necessary, as Datalog also has a standard set-theoretic semantics
> (although we do need to get the exact semantics of what we mean by
> "Datalog" down).


Marcelo kindly did get it "down"

http://www.w3.org/2001/sw/rdb2rdf/wiki/Database-Instance-Only_and_Database-Instances-and-Schema_Mapping#Semantics_of_Datalog_programs



> Soeren's approach of mapping SPARQL to SQL is also useful,
> and should be used as a test if there is enough time, as it still depends
> on the first possibly non-trivial mapping of relational data to RDF to be
> done (likely non-materialized).
>

Remember that it has been proven that

SQL -> Datalog -> SPARQL
SPARQL -> Datalog -> SQL

So by establishing the mapping semantics in datalog (the semantics... not
the language itself!!!), the SPARQL to SQL issue should be trivial.
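
To illustrate the idea (this is not the proof referred to above, only a sketch for a single triple pattern), the snippet below writes one mapping rule in Datalog style and shows that unfolding it turns a SPARQL-like pattern into SQL that returns the same answers as matching against the materialized triples. The `emp` table, the `ex:name` predicate, and the `MAPPING` structure are hypothetical names chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO emp VALUES (?, ?)", [(1, "Ada"), (2, "Bob")])

# Mapping rule in Datalog style (hypothetical names):
#   triple(S, "ex:name", N) :- emp(S, N).
# Encoded as: predicate -> (table, subject column, object column).
MAPPING = {"ex:name": ("emp", "id", "name")}


def materialize(conn):
    """Apply every mapping rule to the database and collect the triples."""
    out = set()
    for pred, (table, s_col, o_col) in MAPPING.items():
        sql = f"SELECT {s_col}, {o_col} FROM {table} WHERE {o_col} IS NOT NULL"
        for s, o in conn.execute(sql):
            out.add((s, pred, o))
    return out


def pattern_to_sql(pred, obj=None):
    """Translate `?s pred ?o` (or `?s pred obj`) to SQL by unfolding the rule."""
    table, s_col, o_col = MAPPING[pred]
    sql = f"SELECT {s_col}, {o_col} FROM {table} WHERE {o_col} IS NOT NULL"
    params = ()
    if obj is not None:
        sql += f" AND {o_col} = ?"
        params = (obj,)
    return sql, params


# Pattern: ?s ex:name "Ada"
# Route 1: match the pattern against the materialized triples.
via_triples = {
    (s, o) for s, p, o in materialize(conn) if p == "ex:name" and o == "Ada"
}
# Route 2: translate the pattern to SQL and run it on the base table.
sql, params = pattern_to_sql("ex:name", "Ada")
via_sql = set(conn.execute(sql, params).fetchall())

print(via_triples == via_sql)
```

Route 2 never builds the triples, which corresponds to the non-materialized case mentioned in the quoted text.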

>
> Would like to hear opinions - just trying to build consensus in the group,
> which despite surface differences, is actually becoming closer I think.
>

I think I'm sounding repetitive, but it might be because I'm not getting my
message across clearly.

I don't see the "datalog approach" as an approach itself. Using datalog is a
way to establish the semantics of the language... and not the language
itself, and that is what Marcelo and I have been doing.

I have been going over D2RQ and the Revelytix Mapping language and realized
(not to my surprise) that the language maps perfectly to the semantics that
we have established in datalog. If I am not wrong, I don't think D2RQ has
defined semantics, so I can't theoretically prove this, so I'm doing it by
example. I'll send out an email when I'm done.


>
>        cheers,
>              harry
>
Received on Friday, 23 July 2010 18:56:23 UTC