Re: Proposed Resolution for Issue 42 from Enrico Franconi on 2011-06-02 (public-rdb2rdf-wg@w3.org from June 2011)

From: Enrico Franconi <franconi@inf.unibz.it>
Date: Thu, 2 Jun 2011 10:52:36 +0200
To: Ivan Herman <ivan@w3.org>
Cc: RDB2RDF WG <public-rdb2rdf-wg@w3.org>
Message-Id: <FB4EB00F-F8C4-4A85-A74E-EE97E8F1355C@inf.unibz.it>

On 2 Jun 2011, at 08:57, Ivan Herman wrote:

> On Jun 1, 2011, at 22:15 , Enrico Franconi wrote:
> 
>> I assume that a SPARQL query over the RDB2RDF translation of an RDB dataset should return something that can still be consistently queried; that is, I want an algebra. Even if you don't want an algebra, you want to be able to tell whether the answer contains NULL values or not. This means that you also have to reconstruct the schema of the answer, if you want to tell which missing values are NULL values and which are just a consequence of the absence of the attribute. I guess that you can reconstruct this information with the CONSTRUCT operator, but then I ask: how can you expect that a user HAS to understand and correctly use this stuff ALWAYS?
> 
> In general, I would agree with you. However, the DM case is a little bit different. The whole idea behind the usage of DM is that the RDB is converted into RDF very early on, so that the results can be handled via Semantic Web tools, typically some sort of rule engine, to convert them into a graph with the right vocabularies and structure for the application. Whether that rule engine is based on SPARQL CONSTRUCT (as many are) or on some RIF implementation is a detail. In other words, in this case I would have no problem saying that the user would use some rules to handle the NULL case as well.

The answer is exactly in the paragraph of mine that you left out, the one after your signature:

>> Shouldn't it be better to hide all of this complexity?
>> You see, in your case the presence of the schema information makes the RDF graph not directly meaningful without precise prescriptions. In my case, the presence of NULL values makes the RDF graph not directly meaningful without precise prescriptions. The difference is that in my case writing queries in SPARQL is very easy (just remember to let joins with NULL fail in the BGPs - if you have NULL values to start with), and you can write whatever (meaningful) SPARQL query you want, and you will get the right answer (i.e., the answer you would get in SQL).

"the user would use some rules to handle the NULL case as well": which rules? Should we ask the user to understand the semantics of NULLs in RDBs, so that he/she would write correct "rules"? 
This makes sense in the case of R2RML, where most of the burden/freedom is left to the user, but not in the case of the DM, which tries to be a bit more supportive, I guess.
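
To make the point about joins concrete, here is a minimal sketch of what I mean by "let joins with NULL fail in the BGPs" so that the answer matches SQL. Everything in it is an assumption for illustration only: the emp/dept tables, the predicate names, and the NULL sentinel standing in for an explicit NULL node in the graph. It is not the DM algorithm and not a SPARQL engine, just a hand-rolled nested-loop join over triples compared with the same join in SQL (via Python's sqlite3).

```python
import sqlite3

# Hypothetical sentinel standing in for an explicit NULL value in the graph.
NULL = object()

def sql_join_rows():
    """Reference answer: the join as SQL evaluates it (NULL never equals NULL)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE emp  (name TEXT, dept TEXT);
        CREATE TABLE dept (dept TEXT, city TEXT);
        INSERT INTO emp  VALUES ('alice', 'sales'), ('bob', NULL);
        INSERT INTO dept VALUES ('sales', 'Rome'), (NULL, 'Oslo');
    """)
    rows = conn.execute(
        "SELECT emp.name, dept.city FROM emp "
        "JOIN dept ON emp.dept = dept.dept ORDER BY emp.name"
    ).fetchall()
    conn.close()
    return rows

# The same data as triples, with NULLs kept explicitly (illustrative encoding).
TRIPLES = [
    ("alice", "emp_dept", "sales"),
    ("bob",   "emp_dept", NULL),
    ("d1",    "dept_dept", "sales"),
    ("d1",    "dept_city", "Rome"),
    ("d2",    "dept_dept", NULL),
    ("d2",    "dept_city", "Oslo"),
]

def bgp_join(triples, null_joins_fail=True):
    """Evaluate { ?e emp_dept ?d . ?r dept_dept ?d . ?r dept_city ?c }.

    With null_joins_fail=True, a binding of the join variable ?d to NULL
    is rejected, mimicking SQL; with False, NULL joins with NULL.
    """
    out = []
    for e, p1, d in triples:
        if p1 != "emp_dept":
            continue
        if null_joins_fail and d is NULL:
            continue  # a join on NULL must fail
        for r, p2, d2 in triples:
            if p2 != "dept_dept" or d2 != d:
                continue
            for r2, p3, c in triples:
                if p3 == "dept_city" and r2 == r:
                    out.append((e, c))
    return sorted(out)
```

With the rule switched on, bgp_join(TRIPLES) agrees with sql_join_rows(); with null_joins_fail=False, bob is spuriously joined to the department row whose key is NULL, which is exactly the answer SQL would not give.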

--e.

Received on Thursday, 2 June 2011 08:53:36 UTC