Re: Proposed Resolution for Issue 42

On 3 Jun 2011, at 15:26, Richard Cyganiak wrote:

> Hi Enrico,
> 
> On 3 Jun 2011, at 07:27, Enrico Franconi wrote:
>> many people don't care to "keep" the meaning of the information of the source relational data. Many people are just happy with a data structure to manipulate, without telling the users how to possibly reconstruct the original RDB meaning from the data structure.
> 
> Well, not quite. As we both know, it is not possible to keep the meaning of the information, because the logic of RDF cannot express that meaning.

As you noticed, I did put "keep" within quotes, since it is explained in the subsequent sentence, so please don't insist on that anymore. Did you ever tell me how to reconstruct the exact behaviour of NULLs in any way? What should users do when they realise that there are NULL values? How do they realise that there are actually NULL values? How can they spot the presence of NULL values in the answer? How do I get an answer with NULL values in it? All these questions are still unanswered. Once I get an answer, I can say that your way also "keeps" the meaning of the information of the source relational data.

> The question is not about preserving the meaning; the question is about preserving the ability to write every SQL query (including those that do weird shit with NULLs) into a SPARQL query that returns exactly the same result. You insist on being able to do that, and I don't see what that's useful for.

I am not even saying that anymore. My goals so far are more abstract and primitive than that: see above.

> Show me an implementer or user who cares about preserving the SQL semantics of NULLs. Seriously, show me just one.

And this is the point. It is THE fundamental difference between you and me. Since you claim to know the real world of information systems, the same question should be asked about relational databases. And why should we care? Because nowadays most RDBs contain plenty of NULL values, due to optimisations, denormalisations, data warehousing, bad design, incomplete information, etc. It seems that people do not care because they do not need to care: NULLs just come as part of the package, they behave the same in all systems, they just work in an interoperable way, and people don't really need to understand them. Who ever thinks about NULL values? But here they are, plenty of them.
But no, you now want the world of RDF-based information systems to lose this ability, which, by the way, would come for free. And once your proposed standard is in place, and complex new applications (web DWs?) come along, we will have to change the standard in a non-backward-compatible way; otherwise the life of the developers of these new sophisticated applications will be a mess.

> I have not seen the slightest shred of evidence that anyone except yourself cares about this. And as far as I can tell (and I apologise if I'm mistaken here) you don't care as an implementer or user, but because of a desire for theoretical purity.

Nobody believes that just arguing about the effects of a choice means "theoretical purity". 

>> Well, I will strongly oppose that.
> 
> And I will strongly oppose any proposal that puts theoretical purity before the concerns of users and implementers.

I would agree on that, if we were talking about theoretical purity, which we are not.

> I acknowledge that theoretical purity is valuable. However, in standardization work it cannot be the driving force.

I would agree on that, if we were talking about theoretical purity, which we are not.

> Letting a standard be driven by that, rather than by a focus on addressing use cases, solving user problems, and alleviating implementation concerns, is a sure way of dooming it to irrelevance. Recent technology history, including W3C's, is full of examples for that.

Indeed. That's why I am insisting on that: "alleviating implementation concerns". Your proposal is making life harder (or even impossible, unless you answer the questions at the beginning of this message).
Again, note: we are talking about having or not having the constant NULL in the graph if NULLs do appear in the source RDB. That's all. And I have an important argument to support my proposal, while from you I am just not getting any: only generic rants about me being a theoretician and not knowing the user base. Why you do NOT want to materialise NULLs is still a mystery to me. What the advantage of not having materialised NULLs would be is a mystery to me too. At least show me how hard life is going to be for implementors with respect to the questions above.

>> At least, I demand that people who do care should not suffer for the choices of the group.
>> After all, we are just discussing whether a NULL value should be absent or encoded in the translation. I don't understand why we want to make life difficult for the ones who do care.
> 
> Because we don't want to make life difficult for those who don't care, which means basically everyone.

Aha: so finding a constant NULL where there was actually one in the original RDB makes life hard for those who don't care? Please, don't tell me so. It is easy to filter them out, if they don't want to see them.

To constructively find a way out, I have a PROPOSAL for the normative part: a flag which does what you want or what I want, *together* with a note saying that your mapping would not be correct as far as NULL values are concerned, while my mapping would be correct by just adding the inequality with NULL values (in SPARQL, in a rule, or whatever you want).
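
To make the flag concrete, here is a rough sketch of what I have in mind. It is only an illustration, not a proposed normative algorithm: the names row_to_triples, materialise_nulls and the constant ex:NULL are all hypothetical, and triples are modelled as plain Python tuples.

```python
# Hypothetical sketch: none of these names come from the working draft.
NULL = "ex:NULL"  # assumed constant standing for a relational NULL


def row_to_triples(subject, row, materialise_nulls):
    """Translate one relational row (a dict) into (s, p, o) triples.

    With materialise_nulls=False a NULL column produces no triple;
    with materialise_nulls=True it produces a triple whose object is NULL.
    """
    triples = []
    for column, value in row.items():
        if value is None:
            if materialise_nulls:
                triples.append((subject, column, NULL))
            continue
        triples.append((subject, column, value))
    return triples


def without_nulls(triples):
    """Filter out materialised NULLs: a single inequality test per triple."""
    return [t for t in triples if t[2] != NULL]


row = {"name": "Alice", "phone": None}
with_nulls = row_to_triples("ex:person1", row, materialise_nulls=True)
no_nulls = row_to_triples("ex:person1", row, materialise_nulls=False)
```

In this sketch, filtering the materialised graph with without_nulls gives back exactly the graph produced with the flag switched off, which is the sense in which those who don't care can simply ignore the NULLs.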

>> If somebody really doesn't want to see the NULLs, it is always possible to filter them out in a very easy way.
> 
> If someone wants to see them, it's always possible to add them in. I believe that's a single SPARQL CONSTRUCT query.

You can say that once I have the answer to all my questions above.
--e.

Received on Friday, 3 June 2011 14:04:07 UTC