Re: Q: ISSUE-41 bNode semantics

> There is a third way in the case the WG decides to explore a correct
> mapping with NULL values. I propose to translate a NULL value as a special
> constant from a special datatype, and to understand how SPARQL 1.0 queries
> should be modified in order to behave properly in presence of RDF data
> coming from a direct mapping of a RDB with NULL values. My guess is that it
> is enough to enrich the BGP part with a conjunct NOT-EQUAL(X,'NULL') [pardon
> my naive syntax here] for each joined (namely repeated in the BGP) variable
> X, so we remain in pure SPARQL 1.0.
>

I believe that I understand Enrico's point. I think this applies mostly to
ISSUE-42 (Direct Mapping & NULL values) rather than ISSUE-41 (R2RML & NULL
values), although it might make sense to facilitate this approach in R2RML.

One use case for the Direct Mapping is:

1) Create a Direct Mapping for a relational database to produce (either
materialized or virtual) triples that comply with a generated ontology.

2) Define an RDF-to-RDF transformation using rules, inferencing, etc. to
convert these Direct Mapping triples into domain-specific triples that comply
with a domain-specific ontology.

In order to perform the RDF-to-RDF transformation _without_ referencing the
original relational database, the Direct Mapping triples must preserve all of
the information from that database. This implies that the NULL values from
the relational database need to be represented in the Direct Mapping triples
(e.g. as Enrico described above). With those NULL values present in the
Direct Mapping triples, it is then possible to interpret them as appropriate
(in potentially schema-specific ways) for the target domain ontology.
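
As a rough illustration of that second step, here is a minimal Python sketch
of an RDF-to-RDF rewrite that decides per predicate what a NULL should become
in the target ontology. Everything named here is an assumption made for the
example: the "<NULL>" sentinel, the <ageUnknown> predicate, and the policy
table are hypothetical and not part of any Direct Mapping or R2RML draft.

```python
# Hypothetical sentinel standing in for a SQL NULL in the Direct Mapping output.
NULL = "<NULL>"


def to_domain_triples(direct_triples, null_policy):
    """Rewrite Direct Mapping triples into domain triples.

    null_policy maps a predicate to a function returning the replacement
    triples (possibly none) for a NULL object of that predicate.
    Predicates without a policy entry simply drop their NULL triples.
    """
    out = []
    for s, p, o in direct_triples:
        if o == NULL:
            handler = null_policy.get(p, lambda subj, pred: [])
            out.extend(handler(s, p))
        else:
            out.append((s, p, o))
    return out


direct = [
    ("<bob>", "<age>", NULL),
    ("<joe>", "<age>", "30"),
]

# Schema-specific choice (an assumption): a NULL age means "age unknown".
policy = {"<age>": lambda subj, pred: [(subj, "<ageUnknown>", "true")]}

domain = to_domain_triples(direct, policy)
```

The point of the sketch is only that this choice can be made after the fact,
per schema, because the NULL survived the Direct Mapping.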

Furthermore, with the NULL values present in the Direct Mapping triples it is
possible (albeit cumbersome) to write SPARQL queries that take into account
the SQL semantics of NULL. For example, a query can check for the RDF NULL
and handle NULLs as needed for either aggregation or straight querying.
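
To make the join case concrete, here is a small Python sketch (not SPARQL) of
the guard Enrico describes: a join on a shared value that skips the NULL
constant, since in SQL a comparison NULL = NULL is never true. The "<NULL>"
sentinel, the predicates, and the sample triples are all hypothetical.

```python
# Hypothetical sentinel standing in for a SQL NULL in the Direct Mapping output.
NULL = "<NULL>"


def join_on_object(triples, p1, p2, sql_null_semantics=True):
    """Pair subjects (s1, s2) where (s1 p1 x) and (s2 p2 x) share the object x.

    With sql_null_semantics=True the join variable is additionally required
    to differ from NULL, mirroring the NOT-EQUAL(X,'NULL') conjunct.
    """
    pairs = []
    for s1, pa, x in triples:
        if pa != p1:
            continue
        for s2, pb, y in triples:
            if pb != p2 or x != y:
                continue
            if sql_null_semantics and x == NULL:
                continue  # SQL would not match NULL with NULL
            pairs.append((s1, s2))
    return pairs


triples = [
    ("<emp1>", "<dept>", "<d10>"),
    ("<emp2>", "<dept>", NULL),
    ("<proj1>", "<ownerDept>", "<d10>"),
    ("<proj2>", "<ownerDept>", NULL),
]

naive = join_on_object(triples, "<dept>", "<ownerDept>", sql_null_semantics=False)
guarded = join_on_object(triples, "<dept>", "<ownerDept>")
```

Without the guard, <emp2> and <proj2> are joined through the shared NULL
constant; with it, only the <d10> pair remains, which is what the equivalent
SQL join would return.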

The NULL value needs to be included for cases like:

ID   NAME
100  Joe
200  Bob
300  Sue

ID   AGE
100  30
200  NULL

In this example the generated <bob> <age> <NULL> triple is different from
the lack of such a triple for Sue: Bob has a row whose AGE is NULL, whereas
Sue has no row in the second table at all.
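
The distinction can be sketched in Python using the two tables above. This is
only an illustration under stated assumptions: the "<NULL>" sentinel, the
name-based subjects such as <bob>, and the keep_nulls switch are hypothetical
and do not reflect the IRI scheme of any Direct Mapping draft.

```python
# Hypothetical sentinel standing in for a SQL NULL in the Direct Mapping output.
NULL = "<NULL>"

# The two tables from the example; None models a SQL NULL.
names = [(100, "Joe"), (200, "Bob"), (300, "Sue")]
ages = [(100, 30), (200, None)]

# Assumption for readability: subjects are derived from the NAME column.
subject = {row_id: f"<{name.lower()}>" for row_id, name in names}


def map_ages(rows, keep_nulls):
    """Produce <age> triples, optionally emitting a triple for NULL values."""
    triples = []
    for row_id, age in rows:
        if age is None:
            if keep_nulls:
                triples.append((subject[row_id], "<age>", NULL))
        else:
            triples.append((subject[row_id], "<age>", str(age)))
    return triples


def age_status(triples, s):
    """Report what the triples say about the age of subject s."""
    objs = [o for (subj, p, o) in triples if subj == s and p == "<age>"]
    if not objs:
        return "no row"
    return "NULL" if objs[0] == NULL else objs[0]


with_nulls = map_ages(ages, keep_nulls=True)
without_nulls = map_ages(ages, keep_nulls=False)
```

With keep_nulls=True, age_status distinguishes Bob ("NULL") from Sue
("no row"); with keep_nulls=False both come back as "no row" and the
difference between the two cases is lost.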

-David

Received on Thursday, 19 May 2011 12:10:49 UTC