Re: Q: ISSUE-42 bNode semantics from Richard Cyganiak on 2011-05-23 (public-rdb2rdf-wg@w3.org from May 2011)

From: Richard Cyganiak <richard@cyganiak.de>
Date: Mon, 23 May 2011 12:14:49 +0100
To: Enrico Franconi <franconi@inf.unibz.it>
Cc: W3C RDB2RDF <public-rdb2rdf-wg@w3.org>
Message-Id: <6D00B88D-EE4B-4397-A946-896F33088CFB@cyganiak.de>

On 23 May 2011, at 08:03, Enrico Franconi wrote:

> On 23 May 2011, at 07:51, Richard Cyganiak wrote:
> 
>> 1. The use cases for the direct mapping call for SPARQL-to-SQL translation, not for SQL-to-SPARQL translation. All known implementations of direct mappings use it to translate SPARQL to SQL, not the other way round. The literature in this area has focused on SPARQL-to-SQL translation, not the other way round. Would you agree that SPARQL-to-SQL translation is more useful and appropriate than SQL-to-SPARQL translation for showing mapping correctness, and if not then why not?
> 
> In this thread we are discussing how to encode SQL NULLs in RDF in the context of translating RDBs into RDF graphs. I thought we already understood that in this context it is crucial that we study a translation of the SQL queries into RDF queries together with the translation of SQL tables into RDF graphs. Why is that? The reason is that only by understanding such a query translation is it possible to understand how to properly (i.e., correctly) access the information stored as RDF graphs; indeed, we have shown that a naive way of querying the RDF graph with SPARQL will not give you the right answer in the presence of NULL values. And how can a user write a non-naive, correct SPARQL query? By understanding how, in general, the corresponding naive queries written in SQL (this is the more or less standard SPARQL-to-SQL transformation) can be correctly represented in SPARQL, so that we can then devise a general rule on how to transform naive SPARQL queries into correct SPARQL queries (this is the SQL-to-SPARQL mapping which parallels the RDB to RDF mapping). Of course, in the end I imagine an automatic mechanism which transforms a naive SPARQL query into a correct one; but this mechanism relies on us understanding how to implement it.

You did not quite answer my question. You say here that only a translation of arbitrary SQL queries to SPARQL can show the correctness of the direct mapping. I ask why a translation of arbitrary SPARQL queries to SQL (which is a well-studied problem) is not sufficient.

>> 2. It is easy to translate SPARQL BGP matching to SQL in a way that preserves results over the null-ignoring direct mapping.
> 
> Yes, we already discussed that in my previous email. It is just BGPs.
> 
>> Furthermore, the peer-reviewed literature contains translations of all SPARQL 1.0 algebra operators to SQL.
> 
> I said in my previous email that this does not really help in this specific thread, since we are discussing NULLs with SQL semantics, and SPARQL does not provide any help with that.

See above. I claim that for practical purposes, showing a general translation from SPARQL to SQL is sufficient to show correctness of a direct mapping, and the literature contains such translations.
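
To illustrate the BGP case from point 2, here is a minimal Python sketch. It is not taken from the thread or from Chebotko et al.; the table `people`, its columns, the predicate names, and the helper functions `direct_mapping` and `match_bgp` are all hypothetical. It assumes a null-ignoring direct mapping that emits one triple per non-NULL cell, and compares a two-pattern BGP evaluated over those triples with a hand-written SQL translation that adds IS NOT NULL conditions.

```python
import sqlite3

# Hypothetical example table; the second row has a NULL email.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [(1, "Alice", "alice@example.org"), (2, "Bob", None)],
)


def direct_mapping(conn):
    """Assumed null-ignoring mapping: one triple per non-NULL cell."""
    triples = set()
    for row_id, name, email in conn.execute("SELECT id, name, email FROM people"):
        subject = f"people/{row_id}"
        for predicate, value in (("people#name", name), ("people#email", email)):
            if value is not None:  # NULL cells produce no triple
                triples.add((subject, predicate, value))
    return triples


def match_bgp(triples):
    """Evaluate { ?s people#name ?n . ?s people#email ?e } by joining on ?s."""
    names = {s: o for s, p, o in triples if p == "people#name"}
    return sorted(
        (names[s], o) for s, p, o in triples if p == "people#email" and s in names
    )


sparql_result = match_bgp(direct_mapping(conn))

# Hand-written SQL counterpart of the same BGP over the original table.
sql_result = sorted(
    conn.execute(
        "SELECT name, email FROM people "
        "WHERE name IS NOT NULL AND email IS NOT NULL"
    ).fetchall()
)

print(sparql_result == sql_result, sparql_result)
```

In this toy case both sides return only the row for Alice, since the row with the NULL email contributes no `people#email` triple and is excluded by the IS NOT NULL condition.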

>> Why do you doubt the existence of a translation from SPARQL 1.0 to SQL that preserves results over the null-ignoring direct mapping?
> 
> You simply fail to tell me how to find such a translation, so we cannot presume its existence.

You can find such a translation by following the steps in Chebotko et al.

> I showed an example (the one with MINUS) where a naive translation does not work, and I still have to see from your side a general way to fix it.

MINUS does not exist in SPARQL 1.0. You say that it can be expressed in SPARQL 1.0. Can you tell me how? Then I'll show how this would be translated to SQL according to Chebotko et al.

> Maybe it exists, but I don't see technically how we can proceed in finding it in its generality and proving its correctness, while I did provide a way in the case of materialised NULLs. I have nothing against it, but I need the proposers of this alternative to show us how to proceed.

I gave a reference to a peer-reviewed journal paper that contains general translations of SPARQL relational algebra operators to SQL. I don't know what more you expect me to do here!

>> 3. Two graphs that entail each other, and hence have the same meaning, can produce different results when SPARQL queries are evaluated against them. In light of this, how can you suggest the comparison of query answering results as appropriate to establish the soundness of operations over RDF?
> 
> This question is independent of dealing with NULLs.

Right.

> Did the WG study this property of the currently proposed mapping? So the question is: in the absence of NULLs, do equivalent RDF graphs give the same answer to the same query using the currently proposed mapping (modulo bnode renaming)? I am quite sure that this is true due to the very particular nature of the RDF graphs obtained by the mapping. This has to be true since the RDB we start from is just plain *ground* information, and there is only one way to correctly represent it directly or in a reified way (modulo OID renaming).

This is a good point and I think you are right -- in the absence of NULLs, equivalent direct mapping graphs give the same answer to the same query, if one assumes the graph to be lean, and assumes that the SPARQL query is evaluated under the same entailment regime that one considers for the equivalence of the graphs.
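
A small Python sketch of why the leanness assumption matters. The graphs, the `ex:` names, and the helper `objects_of` are made up for illustration. The two graphs below are equivalent under simple entailment, but the second one is not lean because the blank-node triple is redundant, and a plain pattern match returns a different number of solutions for each.

```python
def objects_of(graph, subject, predicate):
    """Solutions for ?o in the triple pattern { subject predicate ?o }."""
    return sorted(o for s, p, o in graph if s == subject and p == predicate)


# Lean graph: a single ground triple.
lean = {("ex:a", "ex:p", "ex:b")}

# Equivalent non-lean graph: the blank-node triple adds no new meaning,
# since "ex:a ex:p something" already follows from the ground triple.
non_lean = {("ex:a", "ex:p", "ex:b"), ("ex:a", "ex:p", "_:x")}

lean_answers = objects_of(lean, "ex:a", "ex:p")
non_lean_answers = objects_of(non_lean, "ex:a", "ex:p")

print(len(lean_answers), len(non_lean_answers))
```

The query sees one solution over the lean graph and two over the non-lean one, even though the graphs say the same thing.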

So let me ask a question with a weaker claim:

Let's assume G entails G'. Hence, G captures all the meaning of G'. Nevertheless, SPARQL queries can produce completely different results when evaluated against the two graphs. In light of this, how can you suggest the comparison of query answering results as appropriate to establish the soundness of operations over RDF?
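
A toy Python sketch of this situation. The graphs, the `ex:` names, and the helpers `simply_entails` and `iri_objects` are hypothetical. It assumes simple entailment, checked by brute force: G entails G' if some mapping of the blank nodes of G' to terms of G turns G' into a subgraph of G. The query stands in for a SPARQL pattern with a FILTER that keeps only non-blank objects.

```python
from itertools import product


def is_bnode(term):
    return term.startswith("_:")


def simply_entails(g, g_prime):
    """True if some mapping of g_prime's blank nodes to terms of g
    makes g_prime a subgraph of g (brute force, for tiny graphs only)."""
    bnodes = sorted({t for triple in g_prime for t in triple if is_bnode(t)})
    terms = sorted({t for triple in g for t in triple})
    for values in product(terms, repeat=len(bnodes)):
        mapping = dict(zip(bnodes, values))
        instance = {tuple(mapping.get(t, t) for t in triple) for triple in g_prime}
        if instance <= g:
            return True
    return False


def iri_objects(graph, subject, predicate):
    """Roughly { subject predicate ?o FILTER(!isBlank(?o)) }."""
    return sorted(
        o for s, p, o in graph
        if s == subject and p == predicate and not is_bnode(o)
    )


g = {("ex:a", "ex:p", "ex:b")}
g_prime = {("ex:a", "ex:p", "_:x")}

print(simply_entails(g, g_prime))            # G entails G'
print(iri_objects(g, "ex:a", "ex:p"))        # answers over G
print(iri_objects(g_prime, "ex:a", "ex:p"))  # answers over G'
```

Here G entails G', yet the same query has one answer over G and none over G'.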

> So, it is a question to the whole WG - I did not follow carefully all the discussions the WG had so far, I just kicked in to clarify the matter on NULL values.

The question of the correctness of the mapping has not come up so far, because the correctness is fairly obvious as long as one doesn't consider NULLs, and we did not formally consider NULLs until now.

Thanks,
Richard
Received on Monday, 23 May 2011 11:15:27 UTC