Re: Q: ISSUE-41 bNode semantics from Enrico Franconi on 2011-05-18 (public-rdb2rdf-wg@w3.org from May 2011)

From: Enrico Franconi <franconi@inf.unibz.it>
Date: Wed, 18 May 2011 15:20:08 +0200
To: Richard Cyganiak <richard@cyganiak.de>
Cc: Ivan Herman <ivan@w3.org>, Pat Hayes <phayes@ihmc.us>, Michael Hausenblas <michael.hausenblas@deri.org>, W3C RDB2RDF <public-rdb2rdf-wg@w3.org>
Message-Id: <93E9B876-2DA6-44B9-A16E-326E8F0CD3A8@inf.unibz.it>

On 18 May 2011, at 13:28, Richard Cyganiak wrote:

> So unless someone (Ted? Enrico?) can propose a better alternative, I'm still in favour of simply not producing triples for NULLs.

Please let me note first that my arguments are not about "what a NULL value possibly does mean among various possibilities", but they are about "what a NULL value normatively means in the SQL standard".

A meaningful translation of an RDB into RDF should also let us understand how operations over the original data (e.g., SQL queries or updates) translate into operations, say in SPARQL, over the translated data. I am not interested in the reverse mapping, but of course I am interested in how to use the data correctly.

If the original RDB data does not contain nulls, and the direct mapping is employed, then it is sort of obvious how to translate the SQL operations into SPARQL operations: basically it goes through reification of the relational signature into an object model.
However, when NULL values are present, operations over the data (queries, updates) become less obvious.

Examples:

(a) projection over attributes containing NULL values should return the NULL values, which is different from returning nothing;

(b) a (self-)join fails for tuples with a NULL value in the join attribute;

By not translating NULL values, you fail (a).
By translating NULL values, you fail (b).
(c) is even more complex.
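
As an illustration of (a) and (b), here is a minimal sketch using Python's built-in sqlite3 module. The table "person" and its columns are hypothetical and not part of the original discussion; the sketch only shows the SQL behaviour that a translation would have to preserve.

```python
import sqlite3

# Hypothetical in-memory table: Bob's department is NULL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, dept TEXT)")
conn.executemany(
    "INSERT INTO person VALUES (?, ?, ?)",
    [(1, "Alice", "sales"), (2, "Bob", None)],
)

# (a) Projection: the NULL is returned as a value (None in Python),
# so Bob's row is not silently dropped.
projection = conn.execute("SELECT dept FROM person ORDER BY id").fetchall()

# (b) Self-join on dept: Bob's row does not join with itself,
# because NULL = NULL does not evaluate to true.
self_join = conn.execute(
    "SELECT a.id, b.id FROM person a JOIN person b ON a.dept = b.dept "
    "ORDER BY a.id, b.id"
).fetchall()

print(projection)  # [('sales',), (None,)]
print(self_join)   # [(1, 1)]
```

If no triple is produced for Bob's dept, the projection loses a row; if a triple with an ordinary value is produced, the self-join gains one.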

How does SQL solve the matter? By considering a NULL value as a constant, and then tweaking the query answering mechanism so that the join fails whenever this constant is found (see the "three-valued semantics").
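
The three-valued evaluation can be sketched roughly as follows, with Python's None standing in for the "unknown" truth value. The function names are hypothetical and this is only an approximation of the SQL rules, not a full account of them.

```python
from typing import Optional

def sql_eq(a, b) -> Optional[bool]:
    """Equality under three-valued logic: any NULL operand gives unknown."""
    if a is None or b is None:
        return None
    return a == b

def sql_and(p: Optional[bool], q: Optional[bool]) -> Optional[bool]:
    """Three-valued AND: false dominates, otherwise unknown propagates."""
    if p is False or q is False:
        return False
    if p is None or q is None:
        return None
    return True

def keeps_row(condition: Optional[bool]) -> bool:
    """A WHERE or JOIN condition keeps a row only when it is true."""
    return condition is True

# NULL = NULL is unknown, so the join condition rejects the pair.
null_pair_joins = keeps_row(sql_eq(None, None))
same_value_joins = keeps_row(sql_eq("sales", "sales"))
```

Here null_pair_joins is False and same_value_joins is True, which is exactly the behaviour in example (b).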

To mimic this in RDB2RDF, my suggestion would be to translate a NULL value as a special constant from a special datatype, and then we should provide precise directives on how a query language should deal with this. This is how SQL normatively defines the NULL values. Note that this may not be a trivial exercise, due to the complexity of the new SPARQL language, which I understand contains aggregations :-(
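
A rough sketch of this suggestion, in plain Python with triples as tuples. Everything named here is an assumption for illustration: the datatype IRI http://example.org/ns#null, the IRI scheme, and the "person" data are hypothetical and come from no RDB2RDF draft.

```python
from itertools import product

NULL_DATATYPE = "http://example.org/ns#null"  # hypothetical datatype IRI
NULL = ("NULL", NULL_DATATYPE)                # the single special constant

def row_to_triples(table, key, row):
    """Translate one row, emitting the special constant for NULL columns."""
    subject = f"http://example.org/{table}/{row[key]}"
    triples = []
    for column, value in row.items():
        predicate = f"http://example.org/{table}#{column}"
        obj = NULL if value is None else (str(value), "plain")
        triples.append((subject, predicate, obj))
    return triples

rows = [{"id": 1, "dept": "sales"}, {"id": 2, "dept": None}]
triples = [t for row in rows for t in row_to_triples("person", "id", row)]
DEPT = "http://example.org/person#dept"

def project(triples, predicate):
    """Projection keeps the special constant, as required by (a)."""
    return [o for (_, p, o) in triples if p == predicate]

def join(triples, predicate, null_aware):
    """Self-join on a predicate; the null-aware variant lets the join fail
    whenever the special constant is found, as required by (b)."""
    matches = [(s, o) for (s, p, o) in triples if p == predicate]
    return [
        (s1, s2)
        for (s1, o1), (s2, o2) in product(matches, matches)
        if o1 == o2 and not (null_aware and o1 == NULL)
    ]

projected = project(triples, DEPT)
naive = join(triples, DEPT, null_aware=False)
aware = join(triples, DEPT, null_aware=True)
```

The naive join pairs person 2 with itself through the special constant, while the null-aware join does not. That extra rule is the kind of directive the query language would need.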

cheers
--e.

Received on Wednesday, 18 May 2011 13:20:37 UTC