Re: ISSUE-41 bNode semantics: Information preservation from Alexander De Leon on 2011-05-24 (public-rdb2rdf-wg@w3.org from May 2011)

From: Alexander De Leon <me@alexdeleon.name>
Date: Tue, 24 May 2011 15:24:14 +0200
To: Enrico Franconi <franconi@inf.unibz.it>
Message-Id: <83DE019B-78D0-47C9-9125-2A748F8A4C44@alexdeleon.name>

Hi Enrico, All

I have to admit that I may not grasp the concept of information preservation very well, but I don't think that reconstructing the original relation from the RDF is of any importance here. What I think is REALLY important is that we preserve semantics with respect to queries. In other words: the answer to a SPARQL query over an RDF graph resulting from a direct mapping should be semantically equivalent to the answer to the same query translated to SQL and executed over the RDB. Under this notion, I believe that we have NOT yet proved that not producing triples for NULLs is INCOMPLETE and INCONSISTENT.

Claim: in RDF, NULL is not a value (i.e. it is neither a Resource, nor a BNode, nor a Literal). Hence, in SPARQL, neither a variable nor a BNode can bind to NULL.

Using this claim, translating queries (a) and (c) from Enrico's proposal [1] to SQL results in:

(a) SELECT A AS X FROM R WHERE A IS NOT NULL;    (Note that ?X is a variable, hence it cannot be NULL)

(c) (SELECT CONCAT('R/ID=',ID) FROM R) EXCEPT (SELECT CONCAT('R/ID=',ID) FROM R WHERE A IS NOT NULL);  (Note that :bn is a BNode, hence it cannot be NULL)

With this SQL, we obtain equivalent results whether we query the RDF (without triples for NULLs) directly with SPARQL or run the translated SQL on the database.
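
As a rough sanity check of that equivalence, here is a minimal Python sketch. Everything in it beyond the two SQL queries is an assumption made for illustration: the three sample rows, the use of SQLite (with `||` in place of CONCAT, which older SQLite versions lack), the simplified identifiers "R/ID=n", "rdf:type" and "R#A", and the two set expressions on the graph side, which only stand in for the SPARQL queries (a) and (c) from [1] and are not copied from the wiki.

```python
import sqlite3

# Hypothetical sample data for relation R(ID, A); the row with ID 2 has NULL in A.
rows = [(1, "x"), (2, None), (3, "y")]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE R (ID INTEGER PRIMARY KEY, A TEXT)")
conn.executemany("INSERT INTO R VALUES (?, ?)", rows)

# Query (a): values of A, with NULLs filtered out.
sql_a = {r[0] for r in conn.execute("SELECT A AS X FROM R WHERE A IS NOT NULL")}

# Query (c): identifiers of rows for which A has no value.
sql_c = {r[0] for r in conn.execute(
    "SELECT 'R/ID=' || ID FROM R "
    "EXCEPT "
    "SELECT 'R/ID=' || ID FROM R WHERE A IS NOT NULL"
)}

# Simplified direct mapping (assumed shape): one type triple per row,
# and an R#A triple only when A is not NULL.
triples = set()
for row_id, a in rows:
    subject = f"R/ID={row_id}"
    triples.add((subject, "rdf:type", "R"))
    if a is not None:
        triples.add((subject, "R#A", a))

# Graph-side stand-in for (a): every object of an R#A triple.
graph_a = {o for (s, p, o) in triples if p == "R#A"}

# Graph-side stand-in for (c): subjects of type R that have no R#A triple.
typed = {s for (s, p, o) in triples if p == "rdf:type" and o == "R"}
with_a = {s for (s, p, o) in triples if p == "R#A"}
graph_c = typed - with_a

print(sql_a == graph_a, sql_c == graph_c)
```

On this sample both comparisons agree, which illustrates the claim for one instance but of course proves nothing in general.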

[1] http://www.w3.org/2001/sw/rdb2rdf/wiki/RDBNullValues

Cheers,
Alexander De Leon
[ http://www.alexdeleon.name ]

On 24/05/2011, at 14:36, Enrico Franconi wrote:

> 
> On 24 May 2011, at 13:29, Richard Cyganiak wrote:
> 
>> On 24 May 2011, at 12:00, Richard Cyganiak wrote:
>>> Unfortunately, we know that there cannot be a semantics preserving mapping from relational databases to RDF in the presence of NULLs. NULLs in SQL can indicate the absence of a value, and negation cannot be expressed in RDF.
>> 
>> Thinking more about it, this might not be true.
>> 
>> NULL in SQL means: “Either this has an unknown value, or it has no value.”
>> 
>> I'll try to restate that in logical terms.
>> 
>> A(?x, NULL) means: There exists some ?y such that A(?x,?y) or there exists no ?y such that A(?x,?y).
>> 
>> That says exactly nothing. It is trivially true.
>> 
>> So, not creating a triple for NULL is actually semantics-preserving. Or is it? This sounds too simple to be true, so I probably made some stupid mistake.
> 
> It is more complex than this. Even negation is not enough (naively speaking).
> In reality, SQL NULLs are only defined behaviourally by the SQL specs: the 3VL within the WHERE clause, the behaviour of set-based operators (union, intersection, etc.), the nested queries, etc.
> I have been working for a year on a model-theoretic formalisation of this, which turns out to be non-trivial (so far there is none capturing exactly SQL NULLs - all the theoretical works started with the silly assumption "SQL did it wrong, while we... etc etc").
> Indeed the right way to look at it is by the "absence or some value", but note that this is not a tautology, since the absence of a value changes the arity of that particular tuple within the relation.
> As soon as we have a draft we trust, I'll circulate it.
> cheers
> --e.

Received on Tuesday, 24 May 2011 13:25:21 UTC