Re: Reverse Mapping RDF2RDF

* Alexandre Bertails <bertails@w3.org> [2010-11-20 10:23-0500]
> On Sat, 2010-11-20 at 07:08 -0500, Eric Prud'hommeaux wrote:
> > * Ivan Herman <ivan@w3.org> [2010-11-20 11:24+0100]
> > > I wonder where this agenda item comes from... I guess it was triggered by my question to Michael at the SWCG call.
> > > 
> > > I did not have anything very complicated in mind, just a question... we are talking about the possibility of translating SPARQL queries to the SQL calls on-the-fly.
> > 
> > Ahh, to me "reverse mapping" denotes a mapping from RDF graphs to RDB tables (which has its own use cases and advocates but is outside this charter). As to exploiting an rdb2rdf mapping as a virtual view, I believe that it is what every one of us has done and aims to do interoperably. There were a series of presentations at the beginning of the WG's life 
> >   http://www.w3.org/2001/sw/rdb2rdf/wiki/Initial_Round_of_Presentations
> > which showcased mostly the different strategies for sparql2sql. No one as yet has demonstrated sql2sparql.
> > 
> > For consistency, we could adopt some vocabulary:
> >   rdb2rdf: mapping relational data to rdf graphs.
> >   rdf2rdb: mapping rdf graphs to relational data.
> >   sparql2sql: mapping sparql queries to sql queries.
> >   etc.
> 
> +1
> 
> > I believe we see rdb2rdf as in scope, rdf2rdb as out of scope but if someone out there wants to tool on it and advise us, great. I expect we'll all have our strategies for sparql2sql, and that while it's not up to RDB2RDF to mandate one, it is our job to make sure that our rdb2rdf mapping language enables sparql2sql.
> 
> We have to ensure that any rdb2rdf solution (direct mapping, datalog
> rules, r2rml) can enable sparql2sql.
> 
> > >   I was just wondering whether there was a systematic consideration of whether that is possible at all if I write an R2RML or use the direct mapping; if not, under which circumstances, and whether this is something that the author of an R2RML instance can influence. I saw the inverseExpression term in R2RML; is that enough for what I meant?
> > > 
> > > Maybe some sort of primer text should include more information on that.
> > 
> > There's some commented text in both directMapping and UC&R showing "equivalence" of SPARQL and SQL queries over the leading examples in those documents.
> 
> The relation between RDB, RDF, SPARQL and SQL should be formalized and
> should not rely only on examples. It's called semantics preservation
> -- a well-known problem in the compilation field -- and IMO it should
> be in a normative section.
> 
> Eric and I are totally confident that in the case of the direct
> mapping, we can map any SPARQL query to its equivalent SQL query.
> 
> > I scare-quote equivalence because SPARQL returns RDF terms and SQL, unless you significantly change the wire protocol, returns strings which can be parsed to RDF terms. 
> 
> I don't understand, a SQL result is a SQL table, not a bunch of
> strings to be parsed. It's perfectly defined (the same way as for
> relational algebra).

Right, it returns a table of SQL terms, while SPARQL returns a (well,
table really) of RDF terms. The mapping between them isn't automagic.
If I try to draw two equivalent queries for directMapping example 1:
┌────────────────────────────────────────────────────┐┌──────────────────────────────────────────────────────┐
│PREFIX People: <http://foo.example/DB/People#>      ││-- SQL capturing the SPARQL query's graph constraints.│
│PREFIX Addresses: <http://foo.example/DB/Addresses#>││SELECT People.fname AS name, Addresses.city           │
│SELECT ?name ?city WHERE {                          ││  FROM People                                         │
│    ?who People:fname ?name .                       ││  JOIN Addresses                                      │
│    ?who People:addr ?address .                     ││       ON Addresses.ID=People.addr                    │
│    ?address Addresses:city ?city .                 ││ WHERE People.fname IS NOT NULL                       │
│ }                                                  ││   AND Addresses.city IS NOT NULL                     │
└────────────────────────────────────────────────────┘└──────────────────────────────────────────────────────┘
, SPARQL returns RDF terms while SQL returns SQL terms:
  ?name  ?city            ┌───────┬─────────────┐
  "Bob"  "Cambridge"      │ name  │ city        │
                          │ "Bob" │ "Cambridge" │
                          └───────┴─────────────┘

Of course, the SPARQL results above aren't standard (or even parsable);
we'd probably want to use SPARQL Query Results XML Format:

  <sparql xmlns="http://www.w3.org/2005/sparql-results#">
    <head><variable name="name"/><variable name="city"/></head>
    <results>
      <result><binding name="name"><literal>Bob</literal></binding>
              <binding name="city"><literal>Cambridge</literal></binding>
      </result>
    </results>
  </sparql>

So the queries aren't exactly equivalent, but we're already defining a
mapping SQL2XSD from SQL terms to RDF terms. If we wanted to go there,
we could lean on XQuery (either semantics notation or surface syntax)
to define a mapping sqltable2sparqlres from an SQL result (a table) to
SPARQL results:

  sqltable2sparqlres(T) = <sparql>(vars(T) + <results>[ sqlrow2sparqlres(ROW) ∣ ROW ∈ T ]</results>)</sparql>
    A SPARQL Results document: the variables in T plus the rows expressed as result elements.
  vars(T) = <head>[<variable name="COLUMN name"¹/> ∣ COLUMN ∈ T]</head>
    A variable declaration for each column name in T.
  sqlrow2sparqlres(ROW) = <result>[sqlcell2binding(CELL) ∣ CELL != NULL, CELL ∈ ROW]²</result>
    A SPARQL result for each non-NULL cell in the row.
  sqlcell2binding(CELL) = <binding name="CELL name"><literal datatype="SQL2XSD(CELL type)">CELL value</literal></binding>
    A binding element with the name, datatype and value of the cell.

¹ using XQuery surface syntax, <foo bar="f(x)"/> doesn't mean a bar
  attribute with a value of "f(x)", but instead a value which is the
  evaluation of f(x).

² sqlrow2sparqlres could emit a *set* of cell bindings as well. Lists
  are cheaper, sets are truer.

That's cool for literals, but bnodes and IRIs (suppose I selected
?who) require intimate knowledge of state accumulated during
sparql2sql, which is, I believe, the domain of implementations. 
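
The sqltable2sparqlres definitions above can be sketched in Python. This
is only an illustration under stated assumptions, not a proposed
implementation: the table definitions and sample rows are guesses at the
shape of directMapping example 1, the SQL2XSD dictionary is a
hypothetical stand-in keyed on Python value types rather than declared
SQL column types, and only literals are handled (no IRIs or bnodes).

```python
import sqlite3
import xml.etree.ElementTree as ET

SPARQL_RESULTS_NS = "http://www.w3.org/2005/sparql-results#"
XSD = "http://www.w3.org/2001/XMLSchema#"

# Serialize the SPARQL results namespace as the default namespace.
ET.register_namespace("", SPARQL_RESULTS_NS)

# Hypothetical stand-in for SQL2XSD, keyed on the Python type sqlite3 returns.
# A real mapping would key on the declared SQL column type instead.
SQL2XSD = {int: XSD + "integer", float: XSD + "double", str: XSD + "string"}


def qname(tag):
    """Qualify a tag name with the SPARQL results namespace."""
    return "{%s}%s" % (SPARQL_RESULTS_NS, tag)


def sqltable2sparqlres(columns, rows):
    """Build a SPARQL Query Results XML element from column names and rows."""
    sparql = ET.Element(qname("sparql"))
    head = ET.SubElement(sparql, qname("head"))          # vars(T)
    for column in columns:
        ET.SubElement(head, qname("variable"), {"name": column})
    results = ET.SubElement(sparql, qname("results"))
    for row in rows:                                     # sqlrow2sparqlres(ROW)
        result = ET.SubElement(results, qname("result"))
        for column, cell in zip(columns, row):
            if cell is None:                             # NULL cells yield no binding
                continue
            binding = ET.SubElement(result, qname("binding"), {"name": column})
            literal = ET.SubElement(                     # sqlcell2binding(CELL)
                binding, qname("literal"), {"datatype": SQL2XSD[type(cell)]})
            literal.text = str(cell)
    return sparql


def demo():
    """Run the SQL query from the comparison above over assumed sample data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Addresses (ID INTEGER PRIMARY KEY, city TEXT, state TEXT);
        CREATE TABLE People (ID INTEGER PRIMARY KEY, fname TEXT,
                             addr INTEGER REFERENCES Addresses(ID));
        INSERT INTO Addresses VALUES (18, 'Cambridge', 'MA');
        INSERT INTO People VALUES (7, 'Bob', 18);
        INSERT INTO People VALUES (8, 'Sue', NULL);
    """)
    cursor = conn.execute("""
        SELECT People.fname AS name, Addresses.city AS city
          FROM People
          JOIN Addresses ON Addresses.ID = People.addr
         WHERE People.fname IS NOT NULL
           AND Addresses.city IS NOT NULL
    """)
    columns = [description[0] for description in cursor.description]
    rows = cursor.fetchall()
    conn.close()
    return sqltable2sparqlres(columns, rows)


if __name__ == "__main__":
    print(ET.tostring(demo(), encoding="unicode"))
```

The sketch emits a list of bindings per row rather than a set, and it
attaches a datatype to every literal as sqlcell2binding does, so string
cells come out typed as xsd:string instead of as plain literals.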


> Alexandre.
> 
> > 
> > > That is all...
> > > 
> > > Ivan
> > > 
> > > On Nov 19, 2010, at 20:20 , Juan Sequeda wrote:
> > > 
> > > > Hi Ivan,
> > > > 
> > > > Per the agenda, it states:
> > > > 
> > > > 4. Reverse Mapping
> > > > Question from Ivan re RDF2RDB (it's not in our charter, but maybe some WG
> > > > members plan to address this?)
> > > > 
> > > > I'm curious about RDF2RDB. Could you expand on this. What are the use cases? Requirements?
> > > > 
> > > > Thanks
> > > > 
> > > > Juan Sequeda
> > > > +1-575-SEQ-UEDA
> > > > www.juansequeda.com
> > > 
> > > 
> > > ----
> > > Ivan Herman, W3C Semantic Web Activity Lead
> > > Home: http://www.w3.org/People/Ivan/
> > > mobile: +31-641044153
> > > PGP Key: http://www.ivan-herman.net/pgpkey.html
> > > FOAF: http://www.ivan-herman.net/foaf.rdf
> > > 

-- 
-ericP

Received on Saturday, 20 November 2010 17:40:22 UTC