Re: Fear for explicit NULL values

On 13 Jun 2011, at 23:23, Eric Prud'hommeaux wrote:

> * Enrico Franconi <franconi@inf.unibz.it> [2011-06-13 22:05+0200]
>> I have the impression that people are considering the presence of explicit NULL values in the data and in the answers as "polluting". In RDBs NULLs are everywhere, in the data and in the answers, since day one. You don't have an option not to see them in the data or in the answer. They are just there, and they have a specific meaning and behaviour (which is the same in Oracle, M$-SQL-server, etc). Why, in mapping RDBs to RDF graphs, do you want to hide them as if they were bearing a chronic disease? And by doing that, why do you want to hamper the possibility of keeping in the RDF graph the same behaviour (and meaning) NULLs had in the original RDB?
> 
> RDBs fundamentally model missing information as NULLs in the field where that information would appear. RDF fundamentally models missing information as a lack of assertions. To do otherwise yields graphs which require explicit knowledge of e.g. rdb2rdf:NULL, which is unknown to the rest of the RDF world. They would ask questions over these results as generic RDF graphs and our rdb2rdf-specific answers would essentially be lying to them by saying that Sue was assigned to some company called rdb2rdf:NULL.

This is simply false.
If you are a good designer, and you want to represent *missing* information in RDBs, you do exactly the same as in RDF: the normal form would require that the attributes with missing information become independent relations (aka: properties), with a foreign key to the primary key of the original relation (aka: the rdf:type). So, nothing new or different in RDF wrt RDBs in this respect.
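A minimal sketch of that normal-form design, using Python's sqlite3 module. The table and column names, and the Bob/Acme row, are made up for illustration; only Sue comes from the example quoted above. The optional attribute lives in its own relation, so a person without an assignment simply has no row there, just as RDF would have no triple.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    -- Hypothetical schema: the optional attribute is an independent relation,
    -- keyed by a foreign key to person, with no nullable column anywhere.
    CREATE TABLE assigned_to (
        person_id INTEGER PRIMARY KEY REFERENCES person(id),
        company   TEXT NOT NULL
    );
    INSERT INTO person VALUES (1, 'Bob'), (2, 'Sue');
    INSERT INTO assigned_to VALUES (1, 'Acme');  -- Sue has no assignment row
""")

# Each row of assigned_to reads like one (subject, property, value) assertion.
facts = conn.execute("""
    SELECT p.name, 'assignedTo', a.company
    FROM assigned_to a JOIN person p ON p.id = a.person_id
    ORDER BY p.name
""").fetchall()
print(facts)
```

Sue contributes no assignment fact at all here, which is the "lack of assertion" reading of missing information.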
NULL values in RDBs model something different from just missing values. Indeed, they model the ambiguous case of missing information or unknown information.
You don't have NULL values in RDF, and that's the source of our discussion. You want to preserve the fact that the source RDB had a NULL value (as such) and not a missing value.
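To make the specific behaviour of a NULL concrete, here is a small sketch using Python's sqlite3 module. The employee table, Bob and Acme are assumptions for illustration only. Sue's row exists, but her company is NULL, so she satisfies neither the equality test nor its negation; that is the behaviour a mapping would lose if the NULL were silently dropped.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical table with a nullable column.
    CREATE TABLE employee (name TEXT PRIMARY KEY, company TEXT);
    INSERT INTO employee VALUES ('Bob', 'Acme'), ('Sue', NULL);
""")

def names(where):
    """Return the names of the employees matching a WHERE condition."""
    sql = f"SELECT name FROM employee WHERE {where} ORDER BY name"
    return [row[0] for row in conn.execute(sql)]

at_acme = names("company = 'Acme'")       # comparison is true only for Bob
not_at_acme = names("company <> 'Acme'")  # NULL <> 'Acme' is unknown, not true
is_null = names("company IS NULL")        # the NULL itself is still queryable

print(at_acme, not_at_acme, is_null)
```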

--e.

Received on Tuesday, 14 June 2011 10:51:06 UTC