Re: Q: ISSUE-41 bNode semantics from Alexandre Bertails on 2011-05-18 (public-rdb2rdf-wg@w3.org from May 2011)

From: Alexandre Bertails <bertails@w3.org>
Date: Wed, 18 May 2011 16:07:56 +0200
To: Enrico Franconi <franconi@inf.unibz.it>
Cc: Richard Cyganiak <richard@cyganiak.de>, Ivan Herman <ivan@w3.org>, Pat Hayes <phayes@ihmc.us>, Michael Hausenblas <michael.hausenblas@deri.org>, W3C RDB2RDF <public-rdb2rdf-wg@w3.org>
Message-ID: <1305727676.2976.163.camel@simplet>

On Wed, 2011-05-18 at 15:41 +0200, Enrico Franconi wrote:
> On 18 May 2011, at 15:31, Alexandre Bertails wrote:
> 
> > Very quickly: RDB is not SQL, it's the subset of SQL called DDL,
> > concerning the data.
> > 
> > We don't deal with SQL queries at all.
> 
> I don't get it. What can you do with the data you have translated into RDF? Does it live in solitude, or should you give at least a hint to the users on how to use it?

A W3C Recommendation™ is an authoritative document that provides clear
and clean information in order to define a standard. The primary goal is
not really to provide documentation (but it's a good thing to do
anyway).

We're actually planning to address your remark by writing a
NOTE-RDB2RDF-Primer that will explain all of that.

> If you choose to deal with NULL values, then you have to take a stance on this. Otherwise just say that you don't deal with NULL values.

All right: we [[ don't deal with NULL values. ]].
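
For context on the NULL behaviour being debated (examples (a) and (b)
quoted further down), here is a minimal sketch using Python's built-in
sqlite3 module. The table name, column names, and rows are made up for
illustration and appear nowhere in the Working Group documents.

```python
import sqlite3

# Hypothetical toy table: Bob has no department (NULL).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT, dept TEXT);
    INSERT INTO people VALUES (1, 'Ann', 'sales'), (2, 'Bob', NULL);
""")

# (a) Projection keeps the NULL: Bob's row is returned with dept = None.
projection = conn.execute(
    "SELECT name, dept FROM people ORDER BY id"
).fetchall()

# (b) Self-join on dept: NULL = NULL is not true under SQL's
# three-valued logic, so Bob's row does not join with itself.
self_join = conn.execute(
    "SELECT a.name FROM people a JOIN people b ON a.dept = b.dept "
    "ORDER BY a.id"
).fetchall()

print(projection)
print(self_join)
```

Not producing a triple for the NULL cell lines up with (b) but loses the
row that (a) returns, which is the trade-off raised in the quoted
message.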

> 
> > The reverse mapping is about rewriting SPARQL to SQL so that you target
> > the Direct Graph resulting from the Direct Mapping.
> 
> Uh? Why is this a "reverse mapping"?

It's something that is not in scope, but that we keep in mind so that
it remains doable in practice. By "reverse mapping", we mean any
technique that lets you access the RDB data without materializing it:
instead of pulling the data out of the database, we push queries to
that same database.

> 
> What I am saying is that having a RDB2RDF mapping without telling why and how to use it sounds to me bizarre and useless.

The word "mapping" says it all: it gives you a formal definition of
how to see your RDB data as RDF. No more, no less. We're not
standardizing a tool.

This formal definition enables many kinds of approaches to dealing with
your relational data. For example, you can materialize the data in RDF,
and you can enhance the result using any rule-based technique (SPARQL
CONSTRUCT is already very powerful). You can also consider that the
RDF dataset is virtual. So in order to access the data, you would expect
to use SPARQL. In that case, you could rewrite the SPARQL query on the
fly to SQL, such that you're querying the actual data. You can pipe that
with a SPARQL2SPARQL rewriting phase to handle your rules.
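
The two approaches above (materialized versus virtual) can be sketched
roughly as follows in Python with the built-in sqlite3 module. This is
only an illustration under stated assumptions: the base IRI, the
row/column IRI shapes, the toy table, and the helper names
(materialize, rewrite_pattern, virtual_match) are all hypothetical and
are not taken from the Direct Mapping draft; objects are kept as plain
Python values rather than RDF literals.

```python
import sqlite3

BASE = "http://example.org/db/"  # hypothetical base IRI


def make_db():
    """Build a toy in-memory database (illustrative data only)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT, dept TEXT);
        INSERT INTO people VALUES (1, 'Ann', 'sales'), (2, 'Bob', NULL);
    """)
    return conn


def row_iri(table, pk):
    # Assumed IRI shape for a row; the real draft defines its own.
    return f"{BASE}{table}/id={pk}"


def col_iri(table, column):
    # Assumed IRI shape for a column predicate.
    return f"{BASE}{table}#{column}"


def materialize(conn, table, columns):
    """Materialized approach: dump every non-NULL cell as a triple."""
    triples = set()
    # Identifiers are interpolated directly; fine for a trusted sketch only.
    query = f"SELECT id, {', '.join(columns)} FROM {table}"
    for pk, *values in conn.execute(query):
        for column, value in zip(columns, values):
            if value is not None:  # no triple is produced for a NULL
                triples.add((row_iri(table, pk), col_iri(table, column), value))
    return triples


def rewrite_pattern(table, column):
    """Virtual approach: rewrite `?s <table#column> ?o` into SQL."""
    return f"SELECT id, {column} FROM {table} WHERE {column} IS NOT NULL"


def virtual_match(conn, table, column):
    """Answer the pattern by pushing the rewritten query to the database."""
    rows = conn.execute(rewrite_pattern(table, column))
    return {(row_iri(table, pk), col_iri(table, column), v) for pk, v in rows}


conn = make_db()
graph = materialize(conn, "people", ["name", "dept"])
dept = col_iri("people", "dept")
from_graph = {t for t in graph if t[1] == dept}   # match on the dumped graph
from_sql = virtual_match(conn, "people", "dept")  # match without dumping
print(from_graph == from_sql)
```

Both routes return the same answer for this single triple pattern; a
real rewriter would of course have to handle joins, filters, and the
rest of SPARQL.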

It's really up to you and we provide a demo of the Direct Mapping +
SPARQL CONSTRUCT at [1].

Alexandre.

[1] http://this-db-really.does-not-exist.org/

> 
> cheers
> --e.
> 
> > 
> > Alexandre.
> > 
> > On Wed, 2011-05-18 at 15:20 +0200, Enrico Franconi wrote:
> >> On 18 May 2011, at 13:28, Richard Cyganiak wrote:
> >> 
> >>> So unless someone (Ted? Enrico?) can propose a better alternative, I'm still in favour of simply not producing triples for NULLs.
> >> 
> >> Please let me note first that my arguments are not about "what a NULL value possibly does mean among various possibilities", but they are about "what a NULL value normatively means in the SQL standard".
> >> 
> >> From a meaningful translation of a RDB in RDF, we should be able also to understand the translation of operations (e.g., SQL queries or updates) over the original data in, say, SPARQL over the translated data. I am not interested in the reverse mapping, but of course I'm interested in how to use correctly the data.
> >> 
> >> If the original RDB data does not contain nulls, and the direct mapping is employed, then it is sort of obvious how to translate the SQL operations into SPARQL operations: basically it goes through reification of the relational signature into an object model. 
> >> However, when NULL values are present, then operations over data (queries, updates) become less obvious.
> >> 
> >> Examples:
> >> 
> >> (a) projection over attributes containing NULL values should return the NULL values, different from not returning anything;
> >> 
> >> (b) a (self-)join fails for tuples with a NULL value in the join attribute;
> >> 
> >> (c) aggregation, updates, etc.
> >> 
> >> By not translating NULL values, you fail (a).
> >> By translating NULL values, you fail (b).
> >> (c) is even more complex.
> >> 
> >> How does SQL solve the matter? By considering a NULL value as a constant, and then tweaking the query answering mechanism letting the join fail whenever this constant is found (see the "three valued semantics").
> >> 
> >> To mimic this in RDB2RDF, my suggestion would be to translate a NULL value as a special constant from a special datatype, and then we should provide precise directives on how a query language should deal with this. This is how SQL normatively defines the NULL values. Note that this may not be a trivial exercise, due to the complexity of the new SPARQL language, which I understand contains aggregations :-(
> >> 
> >> cheers
> >> --e.
> >> 
> > 
> > 
> > 
> 
>
Received on Wednesday, 18 May 2011 14:08:10 UTC