Re: Proposed Resolution for Issue 42

On 31 May 2011, at 22:30, Richard Cyganiak wrote:

>> can I ask you to produce a real example involving the use of rdfs:domain?
> 
> Table T with columns col1,col2,col3:
> 
> <T#col1> rdfs:domain <T>.
> <T#col2> rdfs:domain <T>.
> <T#col3> rdfs:domain <T>.

Sorry for my silence, I needed to think :-)
The example above sounds convincing as far as information loss is concerned.
If you insist on going this route, then we need to be convinced that it really delivers what it promises.
Even before starting to consider "SQL completeness" (i.e., what Alexander was talking about in your last meeting), let's focus on the data model and the query algebra.
I assume that a SPARQL query over the RDB2RDF translation of an RDB dataset should return something which can still be consistently queried; in other words, I want an algebra. Even if you don't want an algebra, you want to be able to understand whether the answer contains NULL values or not. This means that you have to reconstruct the schema of the answer as well, if you want to understand which missing values are NULL values and which ones are just a consequence of the absence of the attribute. I guess that you can reconstruct this information with the CONSTRUCT operator, but then I ask: how can you expect that a user will ALWAYS understand and correctly use this stuff? Wouldn't it be better to hide all of this complexity?
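
To make the point about reconstructing the schema concrete, here is a minimal sketch in Python. It assumes (this is not spelled out in the thread) that under the rdfs:domain approach a NULL cell simply produces no triple, so the schema triples are the only way to tell a NULL from an attribute that does not exist. The row identifier T/row1, the cell values, and the helper names are hypothetical.

```python
RDFS_DOMAIN = "rdfs:domain"

# Hypothetical translated graph: the three schema triples from the quoted
# example, plus the data triples for a single row of table T.
graph = {
    ("T#col1", RDFS_DOMAIN, "T"),
    ("T#col2", RDFS_DOMAIN, "T"),
    ("T#col3", RDFS_DOMAIN, "T"),
    ("T/row1", "T#col1", "a"),
    ("T/row1", "T#col3", "c"),  # assumption: no T#col2 triple because the cell was NULL
}


def columns_of(graph, table):
    """Return the columns declared for `table` via rdfs:domain triples."""
    return {s for (s, p, o) in graph if p == RDFS_DOMAIN and o == table}


def classify(graph, table, row, column):
    """Say whether (row, column) holds a value, a NULL, or is not an attribute."""
    if any(s == row and p == column for (s, p, o) in graph):
        return "value"
    if column in columns_of(graph, table):
        return "null"  # declared in the schema, but no triple for this row
    return "no such attribute"


print(classify(graph, "T", "T/row1", "T#col1"))
print(classify(graph, "T", "T/row1", "T#col2"))
print(classify(graph, "T", "T/row1", "T#col4"))
```

Every consumer of a query answer would need a step like `classify`, which is the complexity the paragraph above argues should be hidden from the user.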
You see, in your case the presence of the schema information makes the RDF graph not directly meaningful without precise prescriptions. In my case, the presence of NULL values makes the RDF graph not directly meaningful without precise prescriptions. The difference is that in my case writing queries in SPARQL is very easy (just remember to let joins with NULL fail in the BGPs - if you have NULL values to start with), and you can write whatever (meaningful) SPARQL query you want, and you will get the right answer (i.e., the answer you would get in SQL).
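
The rule "let joins with NULL fail in the BGPs" can be sketched as follows. This is only an illustration under stated assumptions: NULL is represented here by a hypothetical marker value stored explicitly in the graph, the predicate and row names are invented, and the function evaluates just one fixed two-pattern BGP (`?e emp#dept ?d . ?p proj#dept ?d`) rather than SPARQL in general.

```python
NULL = "NULL"  # hypothetical marker for an explicit NULL value in the graph

# Hypothetical translation of two tables, each with a nullable dept column.
graph = [
    ("emp1", "emp#dept", "d1"),
    ("emp2", "emp#dept", NULL),
    ("proj1", "proj#dept", "d1"),
    ("proj2", "proj#dept", NULL),
]


def join_on_object(graph, pred_a, pred_b, null_aware=True):
    """Evaluate the BGP  ?x pred_a ?v . ?y pred_b ?v  and return (x, y) pairs.

    With null_aware=True, a binding of the join variable ?v to NULL makes
    the join fail, mirroring SQL where NULL = NULL is not true.
    """
    results = []
    for (s1, p1, o1) in graph:
        if p1 != pred_a:
            continue
        for (s2, p2, o2) in graph:
            if p2 != pred_b or o1 != o2:
                continue
            if null_aware and o1 == NULL:
                continue  # joins with NULL fail
            results.append((s1, s2))
    return sorted(results)


print(join_on_object(graph, "emp#dept", "proj#dept"))
print(join_on_object(graph, "emp#dept", "proj#dept", null_aware=False))
```

With the NULL rule switched on, only the pair joined on `d1` is returned, which matches what the corresponding SQL equi-join would give; with it switched off, the two NULL rows are also paired.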

--e.

Received on Wednesday, 1 June 2011 20:16:18 UTC