Re: Agenda for June 14 Telcon - Revision 1 from Michael Hausenblas on 2011-06-14 (public-rdb2rdf-wg@w3.org from June 2011)

From: Michael Hausenblas <michael.hausenblas@deri.org>
Date: Tue, 14 Jun 2011 12:51:27 +0100
To: Enrico Franconi <franconi@inf.unibz.it>
Cc: W3C RDB2RDF <public-rdb2rdf-wg@w3.org>
Message-Id: <97D75806-F703-4F6D-85B0-3D8884CA59DF@deri.org>
> The proposal says that the DM is not applicable to RDBs with NULL  
> values.

I haven't seen your proposal yet.

> Don't restart all the discussion again.

Please let's not go there. As a WG co-chair it is my responsibility to  
ensure progress. If you don't like that, you're more than welcome to  
take over my position, I'll happily resign.

Cheers,
 Michael
--
Dr. Michael Hausenblas, Research Fellow
LiDRC - Linked Data Research Centre
DERI - Digital Enterprise Research Institute
NUIG - National University of Ireland, Galway
Ireland, Europe
Tel. +353 91 495730
http://linkeddata.deri.ie/
http://sw-app.org/about.html

On 14 Jun 2011, at 12:44, Enrico Franconi wrote:

> NO.
> The proposal says that the DM is not applicable to RDBs with NULL  
> values.
> Don't restart all the discussion again.
>
> On 14 Jun 2011, at 13:37, Michael Hausenblas <michael.hausenblas@deri.org> wrote:
>
>>
>>> Fair enough. If you believe so, then the proposal should be the  
>>> one where we give up on NULL values, since it is the only one  
>>> where there is no technical disagreement in the WG :-)
>>
>> OK. So here is the proposal:
>>
>> [[
>> PROPOSAL: To resolve ISSUE-42, the Direct Mapping will include  
>> triples representing the relational schema and will omit triples  
>> for NULL values.
>> ]]
>>
>>
>> Cheers,
>>   Michael
>>
>> On 14 Jun 2011, at 12:24, Enrico Franconi wrote:
>>
>>> On 14 Jun 2011, at 13:17, Michael Hausenblas <michael.hausenblas@deri.org> wrote:
>>>
>>>>
>>>>> In the wiki I came up explicitly with 3 alternative concrete  
>>>>> wordings; please look at them.
>>>>
>>>>
>>>> Looked at them. I need one (1) not three (3).
>>>>
>>>>
>>>>> What I can not do is to solve the open technical problem for the  
>>>>> representation with missing NULLs, since it is hard and complex.
>>>>
>>>> That's also my understanding. Hence we can't normatively spec  
>>>> something where even the scientific part is not solved.
>>>
>>> Fair enough. If you believe so, then the proposal should be the  
>>> one where we give up on NULL values, since it is the only one  
>>> where there is no technical disagreement in the WG :-)
>>> I argued that also the proposal with materialised NULLs is  
>>> technically sound, but not everybody in the WG believes so.
>>> --e.
>>>
>>>
>>>>
>>>> Cheers,
>>>> Michael
>>>>
>>>> On 14 Jun 2011, at 12:15, Enrico Franconi wrote:
>>>>
>>>>> In the wiki I came up explicitly with 3 alternative concrete  
>>>>> wordings; please look at them.
>>>>> What I can not do is to solve the open technical problem for the  
>>>>> representation with missing NULLs, since it is hard and complex.  
>>>>> The proposers of this representation should come up with an  
>>>>> answer to this question, so to support their argument. Otherwise  
>>>>> only my proposals can stand.
>>>>>
>>>>> On 14 Jun 2011, at 13:07, Michael Hausenblas <michael.hausenblas@deri.org> wrote:
>>>>>
>>>>>>
>>>>>>> It is ages I'm asking to this WG how to rebuild the correct  
>>>>>>> answers with explicit NULLs from your representation
>>>>>>
>>>>>> This is, IMO, the core of the problem. You're asking rather  
>>>>>> than coming up with a concrete wording for the proposal.
>>>>>>
>>>>>> Please, for the sake of getting this issue closed and meeting  
>>>>>> the September deadline for LC: Enrico, can you draft a concrete  
>>>>>> wording such as:
>>>>>>
>>>>>>
>>>>>> [[
>>>>>> PROPOSAL: To resolve ISSUE-42, ...
>>>>>> ]]
>>>>>>
>>>>>>
>>>>>> that we can discuss and hopefully resolve today?
>>>>>>
>>>>>> If we fail to get this done today I'm inclined to change the
>>>>>> overall timeline because we have a lot more issues to
>>>>>> resolve and simply cannot afford to discuss one single
>>>>>> issue (no matter how important it is) till the cows come home.
>>>>>>
>>>>>> This is not a scientific beauty contest. We're writing a spec,
>>>>>> for heaven's sake.
>>>>>>
>>>>>> Cheers,
>>>>>> Michael
>>>>>>
>>>>>> On 14 Jun 2011, at 11:44, Enrico Franconi wrote:
>>>>>>
>>>>>>> On 13 Jun 2011, at 23:16, Eric Prud'hommeaux wrote:
>>>>>>>
>>>>>>>> There is a fundamental difference between SPARQL and SQL  
>>>>>>>> users in that SQL users either prohibit a query from  
>>>>>>>> answering with NULLs:
>>>>>>>> SELECT name, company          ┌────────────────┐
>>>>>>>> FROM Conctacts                │ name │ company │
>>>>>>>> WHERE name="Sue"              ├──────┼─────────┤
>>>>>>>> AND company IS NOT NULL       └──────┴─────────┘
>>>>>>>> or they write in some application code to skip over the
>>>>>>>> NULLs, or, pretty commonly, the UI paints an empty string and
>>>>>>>> the interface user has to guess whether it was a NULL or a
>>>>>>>> company named "". The intent of the query in this example was
>>>>>>>> clearly to get the names of the companies which Sue
>>>>>>>> represents, for which neither NULL nor r2rml:NULL nor "" are
>>>>>>>> acceptable answers.
>>>>>>>
>>>>>>> I claim that you can filter out NULLs, exactly like you would
>>>>>>> do in SQL. On what grounds do you claim that applications
>>>>>>> built on top of RDF data are different from applications built
>>>>>>> on top of an RDB wrt the usage of NULLs? I don't see any evidence
>>>>>>> that there is such a radical difference to justify your
>>>>>>> non-standard way of dealing with standard NULLs.
>>>>>>>
>>>>>>>> At any rate, I was just arguing that given a tension between  
>>>>>>>> putting burden on the query author to incorporate  
>>>>>>>> <code>FILTER (?company != r2rml:NULL)</code> into the above  
>>>>>>>> query, vs. requiring the person who wants to see the NULL to  
>>>>>>>> know the schema:
>>>>>>>>                                                    ┌────────────────┐
>>>>>>>> SELECT *                                           │  who │ company │
>>>>>>>> WHERE { ?who <Conctacts#name> "Sue"                ├──────┼─────────┤
>>>>>>>> OPTIONAL { ?who <Conctacts#company> ?company } }   │  Sue │ UNBOUND │
>>>>>>>>                                                    └──────┴─────────┘
>>>>>>>> , I *think* the rest of the WG is in favor of the latter
>>>>>>>> (hence the claim of rough consensus).
>>>>>>>
>>>>>>> No, this doesn't work, since you would confuse the answer with  
>>>>>>> a NULL value with the answer with a non existing value. So,  
>>>>>>> the above query doesn't do the job you are declaring. It is  
>>>>>>> ages I'm asking to this WG how to rebuild the correct answers  
>>>>>>> with explicit NULLs from your representation (even with the  
>>>>>>> schema). To no avail.
>>>>>>> So, please tell me explicitly how do you get the right answer  
>>>>>>> in the above case, with all the details (how the schema is  
>>>>>>> used, how do you distinguish the missing value with the NULL  
>>>>>>> value, how this can be applied mechanically to general  
>>>>>>> queries, etc).
>>>>>>>
>>>>>>>>> That's why I am saying "This mapping for NULL values is  
>>>>>>>>> arbitrary since the WG has left unexplored its relationship  
>>>>>>>>> with the original meaning and behaviour of NULL values in  
>>>>>>>>> the source RDB."
>>>>>>>
>>>>>>> I can repeat that :-)
>>>>>>>
>>>>>>>>> What I am asking you since ages is to go through my three  
>>>>>>>>> examples and see how your proposal would actually encode the  
>>>>>>>>> answers, and show how this would lead to a generic recipe.
>>>>>>>
>>>>>>> This request still stands.
>>>>>>>
>>>>>>>>> My argument is that this will most likely be possible, but  
>>>>>>>>> that it will be overly complex since it will necessarily  
>>>>>>>>> require the ability to recognise whether a missing value is  
>>>>>>>>> a NULL or not (also in the answer set!).
>>>>>>>
>>>>>>> Let's see your answer to my question in bold above.
>>>>>>>
>>>>>>>>> Clearly, by having explicit NULL values this problem is
>>>>>>>>> avoided. Moreover, you can easily switch to the absent-NULL
>>>>>>>>> representation by just filtering all the tuples with NULL
>>>>>>>>> values in one simple shot.
>>>>>>>>
>>>>>>>> In <http://www.w3.org/2001/sw/rdb2rdf/wiki/RDBNullValues#Comments_and_Proposal_by_Enrico>,
>>>>>>>> you asked how to discriminate between the direct graphs of
>>>>>>>> ┌┤R├────────┐ and ┌┤R'├┐
>>>>>>>> │ ID │    A │     │ ID │
>>>>>>>> ├────┼──────┤     ├────┤
>>>>>>>> │  1 │ NULL │     │  1 │
>>>>>>>> └────┴──────┘     └────┘
>>>>>>>> , but we do that by knowing the schema so the question  
>>>>>>>> doesn't help us learn what is a reasonable mapping.
>>>>>>>
>>>>>>> This is too vague: "we do that by knowing the schema". As I  
>>>>>>> said above, please tell how do you proceed explicitly.
>>>>>>>
>>>>>>>> I instead propose that you ask questions of the
>>>>>>>> ┤Conctacts├ database above and show how, even knowing the
>>>>>>>> schema, the direct graph doesn't give you realistic access
>>>>>>>> to information. Remember, this isn't a database interchange
>>>>>>>> language, but instead a way to give RDF users a useful view
>>>>>>>> of relational data.
>>>>>>>
>>>>>>> I don't understand this point :-(
>>>>>>>
>>>>>>> cheers
>>>>>>> --e.
>>>>>>>
>>>>>>
>>>>
>>
Received on Tuesday, 14 June 2011 11:51:59 UTC
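
The R versus R' question raised in the thread can be illustrated with a minimal Python sketch. It is not the Direct Mapping as specified by the WG: the IRI shapes, the `direct_graph` and `schema_triples` helpers, and the `<hasColumn>` predicate are all assumptions made for illustration. It shows that once NULL cells are omitted, the data triples of the two tables coincide, and that only separately emitted schema triples tell them apart. Both tables are deliberately given the same name so that only the NULL column differs.

```python
from typing import Optional

Row = dict[str, Optional[object]]
Triple = tuple[str, str, object]


def direct_graph(table: str, key: str, rows: list[Row]) -> set[Triple]:
    """Emit one triple per non-NULL cell; NULL (None) cells produce no triple."""
    triples: set[Triple] = set()
    for row in rows:
        # Hypothetical subject IRI built from the table name and key value.
        subject = f"<{table}/{key}={row[key]}>"
        for column, value in row.items():
            if value is None:
                continue  # the "omit triples for NULL values" choice
            triples.add((subject, f"<{table}#{column}>", value))
    return triples


def schema_triples(table: str, columns: list[str]) -> set[Triple]:
    """Describe the table's columns with a made-up <hasColumn> predicate."""
    return {(f"<{table}>", "<hasColumn>", f"<{table}#{c}>") for c in columns}


# R has columns (ID, A) with A NULL; R' has only the column ID.
r_rows: list[Row] = [{"ID": 1, "A": None}]
r_prime_rows: list[Row] = [{"ID": 1}]

data_r = direct_graph("R", "ID", r_rows)
data_r_prime = direct_graph("R", "ID", r_prime_rows)

# The data triples alone cannot distinguish the two tables.
same_data = data_r == data_r_prime

# Adding schema triples makes the two graphs differ.
full_r = data_r | schema_triples("R", ["ID", "A"])
full_r_prime = data_r_prime | schema_triples("R", ["ID"])
differs_with_schema = full_r != full_r_prime
```

Whether such schema triples are enough to rebuild SQL-style answers with explicit NULLs for general queries is the point left open in the discussion above.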