- From: Michael Hausenblas <michael.hausenblas@deri.org>
- Date: Tue, 14 Jun 2011 12:37:31 +0100
- To: Enrico Franconi <franconi@inf.unibz.it>
- Cc: W3C RDB2RDF <public-rdb2rdf-wg@w3.org>
> Fair enough. If you believe so, then the proposal should be the one
> where we give up on NULL values, since it is the only one where
> there is no technical disagreement in the WG :-)
OK. So here is the proposal:
[[
PROPOSAL: To resolve ISSUE-42, the Direct Mapping will include triples
representing the relational schema and will omit triples for NULL
values.
]]
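
A minimal sketch of what this wording could mean for the R and R' tables discussed in the quoted thread below. It is not the WG's normative algorithm: the IRI shapes, the `schema:column` predicate, and the helper names `direct_triples` and `lookup` are all assumptions made up for illustration.

```python
# Illustrative only: the IRI shapes and the "schema:column" predicate are
# hypothetical names, not anything defined by the Direct Mapping draft.

def direct_triples(table, columns, rows, key):
    """Return (data_triples, schema_triples) for one table.

    NULLs (None) produce no data triple; every declared column
    produces one schema triple.
    """
    schema = {(f"<{table}>", "schema:column", f"<{table}#{c}>") for c in columns}
    data = set()
    for row in rows:
        subject = f"<{table}/{key}={row[key]}>"
        for c in columns:
            if row[c] is not None:  # omit triples for NULL values
                data.add((subject, f"<{table}#{c}>", row[c]))
    return data, schema


def lookup(data, schema, table, subject, column):
    """Classify one cell as a value, a NULL, or a column absent from the schema."""
    predicate = f"<{table}#{column}>"
    for s, p, o in data:
        if s == subject and p == predicate:
            return ("value", o)
    if (f"<{table}>", "schema:column", predicate) in schema:
        return ("null", None)  # column is declared, but the triple was omitted
    return ("no-such-column", None)


# R has columns ID and A, with A NULL; R' has only the column ID.
r_data, r_schema = direct_triples("R", ["ID", "A"], [{"ID": 1, "A": None}], "ID")
rp_data, rp_schema = direct_triples("R'", ["ID"], [{"ID": 1}], "ID")

print(lookup(r_data, r_schema, "R", "<R/ID=1>", "A"))      # ('null', None)
print(lookup(rp_data, rp_schema, "R'", "<R'/ID=1>", "A"))  # ('no-such-column', None)
```

This only covers a single-cell lookup against one table. It does not address the question raised in the thread of how to rebuild answers with explicit NULLs mechanically for general queries.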
Cheers,
Michael
--
Dr. Michael Hausenblas, Research Fellow
LiDRC - Linked Data Research Centre
DERI - Digital Enterprise Research Institute
NUIG - National University of Ireland, Galway
Ireland, Europe
Tel. +353 91 495730
http://linkeddata.deri.ie/
http://sw-app.org/about.html
On 14 Jun 2011, at 12:24, Enrico Franconi wrote:
> On 14 Jun 2011, at 13:17, Michael Hausenblas <michael.hausenblas@deri.org> wrote:
>
>>
>>> In the wiki I came up explicitly with 3 alternative concrete
>>> wordings; please look at them.
>>
>>
>> Looked at them. I need one (1) not three (3).
>>
>>
>>> What I can not do is to solve the open technical problem for the
>>> representation with missing NULLs, since it is hard and complex.
>>
>> That's also my understanding. Hence we can't normatively spec
>> something where even the scientific part is not solved.
>
> Fair enough. If you believe so, then the proposal should be the one
> where we give up on NULL values, since it is the only one where
> there is no technical disagreement in the WG :-)
> I argued that the proposal with materialised NULLs is also
> technically sound, but not everybody in the WG believes so.
> --e.
>
>
>>
>> Cheers,
>> Michael
>>
>> On 14 Jun 2011, at 12:15, Enrico Franconi wrote:
>>
>>> In the wiki I came up explicitly with 3 alternative concrete
>>> wordings; please look at them.
>>> What I can not do is to solve the open technical problem for the
>>> representation with missing NULLs, since it is hard and complex.
>>> The proposers of this representation should come up with an answer
>>> to this question, so as to support their argument. Otherwise only
>>> my proposals can stand.
>>>
>>> On 14 Jun 2011, at 13:07, Michael Hausenblas <michael.hausenblas@deri.org> wrote:
>>>
>>>>
>>>>> I have been asking this WG for ages how to rebuild the correct
>>>>> answers with explicit NULLs from your representation
>>>>
>>>> This is, IMO, the core of the problem. You're asking rather than
>>>> coming up with a concrete wording for the proposal.
>>>>
>>>> Please, for the sake of getting this issue closed and meeting the
>>>> September deadline for LC: Enrico, can you draft a concrete
>>>> wording such as:
>>>>
>>>>
>>>> [[
>>>> PROPOSAL: To resolve ISSUE-42, ...
>>>> ]]
>>>>
>>>>
>>>> that we can discuss and hopefully resolve today?
>>>>
>>>> If we fail to get this done today I'm inclined to change the
>>>> overall timeline because we have a lot more issues to resolve
>>>> and simply cannot afford to discuss one single issue (no
>>>> matter how important it is) till the cows come home.
>>>>
>>>> This is not a scientific beauty contest. We're writing a spec,
>>>> for heaven's sake.
>>>>
>>>> Cheers,
>>>> Michael
>>>>
>>>> On 14 Jun 2011, at 11:44, Enrico Franconi wrote:
>>>>
>>>>> On 13 Jun 2011, at 23:16, Eric Prud'hommeaux wrote:
>>>>>
>>>>>> There is a fundamental difference between SPARQL and SQL users
>>>>>> in that SQL users either prohibit a query from answering with
>>>>>> NULLs:
>>>>>> SELECT name, company
>>>>>>   FROM Conctacts
>>>>>>  WHERE name="Sue"
>>>>>>    AND company IS NOT NULL
>>>>>>
>>>>>> ┌──────┬─────────┐
>>>>>> │ name │ company │
>>>>>> ├──────┼─────────┤
>>>>>> └──────┴─────────┘
>>>>>> or they write in some application code to skip over the NULLs,
>>>>>> or, pretty commonly, the UI paints an empty string and the
>>>>>> interface user has to guess whether it was a NULL or a
>>>>>> company named "". The intent of the query in this example was
>>>>>> clearly to get the names of the companies which Sue represents,
>>>>>> for which neither NULL nor r2rml:NULL nor "" is an acceptable
>>>>>> answer.
>>>>>
>>>>> I claim that you can filter out NULLs, exactly like you would do
>>>>> in SQL. On what grounds do you claim that applications built on
>>>>> top of RDF data are different from applications built on top of
>>>>> an RDB wrt the usage of NULLs? I don't see any evidence that
>>>>> there is such a radical difference to justify your non-standard
>>>>> way of dealing with standard NULLs.
>>>>>
>>>>>> At any rate, I was just arguing that given a tension between
>>>>>> putting the burden on the query author to incorporate
>>>>>> <code>FILTER (?company != r2rml:NULL)</code> into the above
>>>>>> query, vs.
>>>>>> requiring the person who wants to see the NULL to know the
>>>>>> schema:
>>>>>>
>>>>>> SELECT *
>>>>>>  WHERE { ?who <Conctacts#name> "Sue"
>>>>>>          OPTIONAL { ?who <Conctacts#company> ?company } }
>>>>>>
>>>>>> ┌──────┬─────────┐
>>>>>> │ who  │ company │
>>>>>> ├──────┼─────────┤
>>>>>> │ Sue  │ UNBOUND │
>>>>>> └──────┴─────────┘
>>>>>> , I *think* the rest of the WG is in favor of the latter
>>>>>> (hence the claim of rough consensus).
>>>>>
>>>>> No, this doesn't work, since you would conflate an answer that
>>>>> has a NULL value with an answer that has a non-existent value.
>>>>> So, the above query doesn't do the job you claim it does. I have
>>>>> been asking this WG for ages how to rebuild the correct answers
>>>>> with explicit NULLs from your representation (even with the
>>>>> schema). To no avail.
>>>>> So, please tell me explicitly how you get the right answer in
>>>>> the above case, with all the details (how the schema is used,
>>>>> how you distinguish the missing value from the NULL value,
>>>>> how this can be applied mechanically to general queries, etc).
>>>>>
>>>>>>> That's why I am saying "This mapping for NULL values is
>>>>>>> arbitrary since the WG has left unexplored its relationship
>>>>>>> with the original meaning and behaviour of NULL values in the
>>>>>>> source RDB."
>>>>>
>>>>> I can repeat that :-)
>>>>>
>>>>>>> What I have been asking you for ages is to go through my three
>>>>>>> examples and see how your proposal would actually encode the
>>>>>>> answers, and show how this would lead to a generic recipe.
>>>>>
>>>>> This request still stands.
>>>>>
>>>>>>> My argument is that this will most likely be possible, but
>>>>>>> that it will be overly complex since it will necessarily
>>>>>>> require the ability to recognise whether a missing value is a
>>>>>>> NULL or not (also in the answer set!).
>>>>>
>>>>> Let's see your answer to my question in bold above.
>>>>>
>>>>>>> Clearly, by having explicit NULL values this problem is
>>>>>>> avoided. Moreover, you can easily switch to the absent-NULL
>>>>>>> representation by just filtering out all the tuples with NULL
>>>>>>> values in one simple shot.
>>>>>>
>>>>>> In <http://www.w3.org/2001/sw/rdb2rdf/wiki/RDBNullValues#Comments_and_Proposal_by_Enrico>,
>>>>>> you asked how to discriminate between the direct graphs of
>>>>>> ┌┤R├────────┐ and ┌┤R'├┐
>>>>>> │ ID │ A │ │ ID │
>>>>>> ├────┼──────┤ ├────┤
>>>>>> │ 1 │ NULL │ │ 1 │
>>>>>> └────┴──────┘ └────┘
>>>>>> , but we do that by knowing the schema so the question doesn't
>>>>>> help us learn what is a reasonable mapping.
>>>>>
>>>>> This is too vague: "we do that by knowing the schema". As I said
>>>>> above, please spell out explicitly how you proceed.
>>>>>
>>>>>> I instead propose that you ask questions of the ┤Conctacts├
>>>>>> database above and show how, even knowing the schema, the
>>>>>> direct graph doesn't give you realistic access to information.
>>>>>> Remember, this isn't a database interchange language, but
>>>>>> instead a way to give RDF users a useful view of relational
>>>>>> data.
>>>>>
>>>>> I don't understand this point :-(
>>>>>
>>>>> cheers
>>>>> --e.
>>>>>
>>>>
>>
Received on Tuesday, 14 June 2011 11:38:05 UTC