Re: Q: ISSUE-41 bNode semantics

> So unless someone (Ted? Enrico?) can propose a better alternative,  
> I'm still in favour of simply not producing triples for NULLs.

+1
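
For illustration only, here is a minimal Python sketch of what "not producing triples for NULLs" (option (a) in the thread quoted below) could look like for Richard's Person example. The function name direct_map_row and its parameters are hypothetical and not taken from the Direct Mapping draft. The sketch follows the thread's example in emitting no triple for the key column, and it does no literal escaping or datatype handling.

```python
from typing import Iterator, Mapping, Optional


def direct_map_row(
    table: str, key: str, row: Mapping[str, Optional[object]]
) -> Iterator[str]:
    """Yield Turtle-like triples for one row, skipping NULL (None) columns.

    Hypothetical helper for illustration; not part of any RDB2RDF spec.
    """
    subject = f"<{table}/{key}={row[key]}>"
    yield f"{subject} a <{table}> ."
    for column, value in row.items():
        if column == key:
            # Assumption: the key only appears in the subject IRI,
            # mirroring the two-triple example in the thread.
            continue
        if value is None:
            # Option (a): a NULL column produces no triple at all.
            continue
        yield f'{subject} <{table}#{column}> "{value}" .'


# The single row <1, Alice, NULL> from the Person table in the thread.
triples = list(
    direct_map_row(
        "Person", "ID", {"ID": 1, "name": "Alice", "date_of_birth": None}
    )
)
for triple in triples:
    print(triple)
```

Run on the row <1, Alice, NULL>, this yields only the type triple and the name triple, as in Richard's example. Nothing at all is asserted about date_of_birth, so no unintended range or equality inferences can follow from it.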

Cheers,
	Michael
--
Dr. Michael Hausenblas, Research Fellow
LiDRC - Linked Data Research Centre
DERI - Digital Enterprise Research Institute
NUIG - National University of Ireland, Galway
Ireland, Europe
Tel. +353 91 495730
http://linkeddata.deri.ie/
http://sw-app.org/about.html

On 18 May 2011, at 12:28, Richard Cyganiak wrote:

> On 18 May 2011, at 05:49, Ivan Herman wrote:
>> I have the *feeling*, based on all this, that the RDF mapping  
>> should not try to force the RDF semantics on this case which seems  
>> to have its own SQL meaning. Ie, my feeling is that options (c) or  
>> (d) below are the right ones, with a slight preference for the  
>> rdb2rdf:NULL.
>
> Thinking more about (c) and (d), I don't think I like them much.
>
> Consider an employee database. The CEO doesn't have a manager and  
> isn't assigned to a specific department:
>
> <Person/1> a <Person> .
> <Person/1> <Person.name> "Alice" .
> <Person/1> <Person.role> "CEO" .
> <Person/1> <Person.manager> rdb2rdf:NULL .
> <Person/1> <Person.department> rdb2rdf:NULL .
>
> Now if someone applies a range declaration to the <Person.manager>  
> and <Person.department> properties (which seems like a reasonable  
> thing to do, regardless of whether that's part of the direct  
> mapping), then we'd conclude that:
>
> rdb2rdf:NULL a <Person> .
> rdb2rdf:NULL a <Department> .
>
> And I thought these classes were disjoint ...
>
> It gets worse. Let's say the employee database also records the date  
> of birth, but this information is not known for all employees:
>
> <Person/1> <Person.DOB> "1969-11-03"^^xsd:date .
> <Person/2> <Person.DOB> rdb2rdf:NULL .
> <Person/3> <Person.DOB> rdb2rdf:NULL .
>
> Now, it really looks to me like this indicates that Bob (<Person/2>)  
> and Charlie (<Person/3>) were *born on the same date*! Which is  
> clearly not the intended semantics.
>
> So I think (c) and (d) are out.
>
> Pat pointed out that option (b), the use of blank nodes, is  
> appropriate in cases like the second, but not in the first case,  
> because Alice really doesn't have a manager or department. This  
> convinces me that (b) is also out.
>
> Which to me leaves option (a), which multiple WG members expressed a  
> preference for to begin with.
>
> So unless someone (Ted? Enrico?) can propose a better alternative,  
> I'm still in favour of simply not producing triples for NULLs.
>
> Best,
> Richard
>
>
>
>>
>> Thanks Richard for that additional explanation, b.t.w., like Pat, I  
>> was not sure what that null value means either...
>>
>> Ivan
>>
>>
>>
>> On May 18, 2011, at 24:01 , Pat Hayes wrote:
>>
>>> Michael & Richard, thanks for this context, which is very useful,  
>>> as I really had no idea what a null value is. I am still not quite  
>>> sure. It indicates that a (real, ie non-null)  value is not  
>>> present, OK. But what does this mean? It could mean simply that  
>>> data is unavailable, or it could mean that for this particular  
>>> individual (denoted by this row) there is no such value. In the  
>>> example case, clearly the former seems correct, but I am sure that  
>>> there are examples where the latter is the appropriate  
>>> interpretation (for example, employees of a company with a column  
>>> for accumulated pension savings, with nulls for employees not in  
>>> the pension plan.) And unfortunately, the answer to your question  
>>> about bnodes is, they would be appropriate in the first case but  
>>> not in the second case. This is because the bnode amounts to an  
>>> existential assertion: to say
>>>
>>> <Person/ID=1> <Person#date_of_birth> [] .
>>>
>>> is to say that the date of birth *exists*, i.e. there really is a  
>>> value, even though the particular value is not recorded. Perhaps  
>>> someone who knows more about the use of null values can say  
>>> whether such an existential reading is always appropriate.
>>>
>>> There is no natural way to say that a property *has no value* in  
>>> RDF. The best one could do is probably to invoke RDFS or OWL class  
>>> reasoning, give the property a range, and assert that the  
>>> value is outside the range class.
>>>
>>> Hope this helps.
>>>
>>> Pat
>>>
>>>
>>> On May 17, 2011, at 1:56 PM, Richard Cyganiak wrote:
>>>
>>>> Pat, Michael,
>>>>
>>>> I'll add a bit of context.
>>>>
>>>> Given a Person table with columns <ID, name, date_of_birth> and a  
>>>> single row <1, Alice, NULL>, the Direct Mapping would currently  
>>>> produce:
>>>>
>>>> <Person/ID=1> a <Person> .
>>>> <Person/ID=1> <Person#name> "Alice" .
>>>>
>>>> The question we are considering is if we should produce an  
>>>> additional triple. Answers that suggest themselves are:
>>>>
>>>> a) No.
>>>>
>>>> b) <Person/ID=1> <Person#date_of_birth> [] .
>>>>
>>>> c) <Person/ID=1> <Person#date_of_birth> rdb2rdf:NULL .
>>>>
>>>> d) <Person/ID=1> <Person#date_of_birth> "NULL"^^rdb2rdf:NULL .
>>>>
>>>> The specific sub-question that we couldn't agree on during the  
>>>> call was whether b) would be consistent with the semantics of  
>>>> blank nodes in RDF.
>>>>
>>>> FWIW, below are some quotes from the SQL spec (specifically, from  
>>>> Part 1 of the 2006 draft of SQL-2008 that I sent around before.)
>>>>
>>>> Best,
>>>> Richard
>>>>
>>>>
>>>>
>>>> DEFINITIONS
>>>>
>>>> null value: A special value that is used to indicate the absence  
>>>> of any data value. (p5)
>>>>
>>>>
>>>> 4.4.2	The null value
>>>>
>>>> Every data type includes a special value, called the null value,  
>>>> sometimes denoted by the keyword NULL. This value differs from  
>>>> other values in the following respects:
>>>>
>>>> Since the null value is in every data type, the data type of the  
>>>> null value implied by the keyword NULL cannot be inferred; hence  
>>>> NULL can be used to denote the null value only in certain  
>>>> contexts, rather than everywhere that a literal is permitted.
>>>>
>>>> Although the null value is neither equal to any other value nor  
>>>> not equal to any other value — it is unknown whether or not it is  
>>>> equal to any given value — in some contexts, multiple null values  
>>>> are treated together; for example, the <group by clause> treats  
>>>> all null values together. (p15)
>>>>
>>>>
>>>>
>>>> On 17 May 2011, at 19:13, Michael Hausenblas wrote:
>>>>
>>>>>
>>>>> Pat,
>>>>>
>>>>> In today's telecon we had a discussion regarding ISSUE-41 [1]  
>>>>> and would appreciate some brief advice from your side concerning  
>>>>> the following question:
>>>>> [[
>>>>> Is a blank node an accurate representation of a NULL value from  
>>>>> a relational database?
>>>>> ]]
>>>>>
>>>>> Note that this relates to the Direct Mapping (in R2RML one can  
>>>>> override the behaviour). We have identified options for dealing  
>>>>> with the situation (producing no triple or introducing a bNode  
>>>>> representing the NULL value) and would like to hear your opinion  
>>>>> on the matter.
>>>>>
>>>>> Tracker, this is my ACTION-131.
>>>>>
>>>>>
>>>>> Cheers,
>>>>> 	Michael
>>>>>
>>>>> [1] http://www.w3.org/2001/sw/rdb2rdf/track/issues/41
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>> ------------------------------------------------------------
>>> IHMC                                     (850)434 8903 or (650)494 3973
>>> 40 South Alcaniz St.           (850)202 4416   office
>>> Pensacola                            (850)202 4440   fax
>>> FL 32502                              (850)291 0667   mobile
>>> phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes
>>>
>>>
>>>
>>>
>>>
>>>
>>
>>
>> ----
>> Ivan Herman, W3C Semantic Web Activity Lead
>> Home: http://www.w3.org/People/Ivan/
>> mobile: +31-641044153
>> PGP Key: http://www.ivan-herman.net/pgpkey.html
>> FOAF: http://www.ivan-herman.net/foaf.rdf
>>
>>
>>
>>
>>
>>
>

Received on Wednesday, 18 May 2011 11:55:51 UTC