Re: Q: ISSUE-41 bNode semantics

On 18 May 2011, at 05:49, Ivan Herman wrote:
> I have the *feeling*, based on all this, that the RDF mapping should not try to force the RDF semantics on this case which seems to have its own SQL meaning. Ie, my feeling is that options (c) or (d) below are the right ones, with a slight preference for the rdb2rdf:NULL.

Thinking more about (c) and (d), I don't think I like them much.

Consider an employee database. The CEO doesn't have a manager and isn't assigned to a specific department:

<Person/1> a <Person> .
<Person/1> <Person.name> "Alice" .
<Person/1> <Person.role> "CEO" .
<Person/1> <Person.manager> rdb2rdf:NULL .
<Person/1> <Person.department> rdb2rdf:NULL .

Now if someone applies a range declaration to the <Person.manager> and <Person.department> properties (which seems like a reasonable thing to do, regardless of whether that's part of the direct mapping), then we'd conclude that:

rdb2rdf:NULL a <Person> .
rdb2rdf:NULL a <Department> .

And I thought these classes were disjoint ...
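
To make the inference concrete, here is a minimal sketch of the RDFS range rule applied to the triples above. It uses plain Python tuples rather than a real RDF library. The range declarations themselves are an assumption for illustration (manager ranges over <Person>, department over <Department>), and infer_range_types is a hypothetical helper name.

```python
# Triples as (subject, predicate, object) tuples of plain strings.
NULL = "rdb2rdf:NULL"

triples = {
    ("<Person/1>", "<Person.manager>", NULL),
    ("<Person/1>", "<Person.department>", NULL),
}

# Assumed range declarations; these are not produced by the direct mapping.
ranges = {
    "<Person.manager>": "<Person>",
    "<Person.department>": "<Department>",
}

def infer_range_types(triples, ranges):
    """RDFS range rule: (s p o) and (p rdfs:range C) entail (o rdf:type C)."""
    return {(o, "a", ranges[p]) for (s, p, o) in triples if p in ranges}

inferred = infer_range_types(triples, ranges)
for triple in sorted(inferred):
    print(*triple, ".")
```

Because rdb2rdf:NULL is a single shared node, every range declaration on any nullable column ends up typing that same node.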

It gets worse. Let's say the employee database also records the date of birth, but this information is not known for all employees:

<Person/1> <Person.DOB> "1969-11-03"^^xsd:date .
<Person/2> <Person.DOB> rdb2rdf:NULL .
<Person/3> <Person.DOB> rdb2rdf:NULL .

Now, it really looks to me like this indicates that Bob and Charlie (<Person/2> and <Person/3>) were *born on the same date*! Which is clearly not the intended semantics.
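
A rough sketch of why this happens: any consumer that groups or joins on the object of <Person.DOB> will treat rdb2rdf:NULL like any other value. The code below is plain Python standing in for such a query; same_birthday_pairs is a hypothetical helper, not part of any mapping proposal.

```python
from collections import defaultdict

triples = [
    ("<Person/1>", "<Person.DOB>", '"1969-11-03"^^xsd:date'),
    ("<Person/2>", "<Person.DOB>", "rdb2rdf:NULL"),
    ("<Person/3>", "<Person.DOB>", "rdb2rdf:NULL"),
]

def same_birthday_pairs(triples):
    """Return pairs of subjects whose <Person.DOB> objects are identical."""
    by_date = defaultdict(list)
    for s, p, o in triples:
        if p == "<Person.DOB>":
            by_date[o].append(s)
    pairs = []
    for people in by_date.values():
        people = sorted(people)
        # Every unordered pair of people sharing this object value.
        pairs.extend((a, b) for i, a in enumerate(people) for b in people[i + 1:])
    return sorted(pairs)

pairs = same_birthday_pairs(triples)
print(pairs)
```

The only pair returned is <Person/2> and <Person/3>, matched solely because both point at the same NULL node.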

So I think (c) and (d) are out.

Pat pointed out that option (b), the use of blank nodes, is appropriate in cases like the second, but not in the first case, because Alice really doesn't have a manager or department. This convinces me that (b) is also out.

Which to me leaves option (a), which multiple WG members expressed a preference for to begin with.

So unless someone (Ted? Enrico?) can propose a better alternative, I'm still in favour of simply not producing triples for NULLs.
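
For illustration, option (a) could be sketched roughly as follows. This is a simplified stand-in, not the Direct Mapping algorithm: row_to_triples is a hypothetical helper, the IRI shapes follow the examples in this message, and every non-NULL value is rendered as a plain literal (a real mapping would handle datatypes and foreign keys differently).

```python
def row_to_triples(table, pk, row):
    """Map one row to triples, emitting nothing for NULL (None) columns."""
    subject = f"<{table}/{row[pk]}>"
    triples = [(subject, "a", f"<{table}>")]
    for column, value in row.items():
        if column == pk or value is None:
            continue  # option (a): a NULL produces no triple at all
        triples.append((subject, f"<{table}.{column}>", f'"{value}"'))
    return triples

# The CEO row from the example: no manager, no department.
alice = {"ID": 1, "name": "Alice", "role": "CEO", "manager": None, "department": None}
alice_triples = row_to_triples("Person", "ID", alice)
for triple in alice_triples:
    print(*triple, ".")
```

With this approach the graph says nothing about Alice's manager or department, so neither of the problems above can arise.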

Best,
Richard



> 
> Thanks Richard for that additional explanation, b.t.w., like Pat, I was not sure what that null value means either...
> 
> Ivan
> 
> 
> 
> On May 18, 2011, at 24:01 , Pat Hayes wrote:
> 
>> Michael & Richard, thanks for this context, which is very useful, as I really had no idea what a null value is. I am still not quite sure. It indicates that a (real, ie non-null)  value is not present, OK. But what does this mean? It could mean simply that data is unavailable, or it could mean that for this particular individual (denoted by this row) there is no such value. In the example case, clearly the former seems correct, but I am sure that there are examples where the latter is the appropriate interpretation (for example, employees of a company with a column for accumulated pension savings, with nulls for employees not in the pension plan.) And unfortunately, the answer to your question about bnodes is, they would be appropriate in the first case but not in the second case. This is because the bnode amounts to an existential assertion: to say
>> 
>> <Person/ID=1> <Person#date_of_birth> [] .
>> 
>> is to say that the date of birth *exists*, i.e. there really is a value, even though the particular value is not recorded. Perhaps someone who knows more about the use of null values can say whether such an existential reading is always appropriate.
>> 
>> There is no natural way to say that a property *has no value* in RDF. The best one could do is probably to invoke RDFS or OWL class reasoning, give the property a range, and assert that the value is outside the range class.
>> 
>> Hope this helps.
>> 
>> Pat
>> 
>> 
>> On May 17, 2011, at 1:56 PM, Richard Cyganiak wrote:
>> 
>>> Pat, Michael,
>>> 
>>> I'll add a bit of context.
>>> 
>>> Given a Person table with columns <ID, name, date_of_birth> and a single row <1, Alice, NULL>, the Direct Mapping would currently produce:
>>> 
>>> <Person/ID=1> a <Person> .
>>> <Person/ID=1> <Person#name> "Alice" .
>>> 
>>> The question we are considering is if we should produce an additional triple. Answers that suggest themselves are:
>>> 
>>> a) No.
>>> 
>>> b) <Person/ID=1> <Person#date_of_birth> [] .
>>> 
>>> c) <Person/ID=1> <Person#date_of_birth> rdb2rdf:NULL .
>>> 
>>> d) <Person/ID=1> <Person#date_of_birth> "NULL"^^rdb2rdf:NULL .
>>> 
>>> The specific sub-question that we couldn't agree on during the call was whether b) would be consistent with the semantics of blank nodes in RDF.
>>> 
>>> FWIW, below are some quotes from the SQL spec (specifically, from Part 1 of the 2006 draft of SQL-2008 that I sent around before.)
>>> 
>>> Best,
>>> Richard
>>> 
>>> 
>>> 
>>> DEFINITIONS
>>> 
>>> null value: A special value that is used to indicate the absence of any data value. (p5)
>>> 
>>> 
>>> 4.4.2	The null value
>>> 
>>> Every data type includes a special value, called the null value, sometimes denoted by the keyword NULL. This value differs from other values in the following respects:
>>> 
>>> Since the null value is in every data type, the data type of the null value implied by the keyword NULL cannot be inferred; hence NULL can be used to denote the null value only in certain contexts, rather than everywhere that a literal is permitted.
>>> 
>>> Although the null value is neither equal to any other value nor not equal to any other value — it is unknown whether or not it is equal to any given value — in some contexts, multiple null values are treated together; for example, the <group by clause> treats all null values together. (p15)
>>> 
>>> 
>>> 
>>> On 17 May 2011, at 19:13, Michael Hausenblas wrote:
>>> 
>>>> 
>>>> Pat,
>>>> 
>>>> In today's telecon we had a discussion regarding ISSUE-41 [1] and would appreciate some brief advice from your side concerning the following question:
>>>> [[
>>>> Is a blank node an accurate representation of a NULL value from a relational database?
>>>> ]]
>>>> 
>>>> Note that this relates to the Direct Mapping (in R2RML one can overwrite the behaviour). We have identified options for dealing with the situation (producing no triple or introducing a bNode representing the NULL value) and would like to hear your opinion on the matter.
>>>> 
>>>> Tracker, this is my ACTION-131.
>>>> 
>>>> 
>>>> Cheers,
>>>> 	Michael
>>>> 
>>>> [1] http://www.w3.org/2001/sw/rdb2rdf/track/issues/41
>>>> 
>>>> --
>>>> Dr. Michael Hausenblas, Research Fellow
>>>> LiDRC - Linked Data Research Centre
>>>> DERI - Digital Enterprise Research Institute
>>>> NUIG - National University of Ireland, Galway
>>>> Ireland, Europe
>>>> Tel. +353 91 495730
>>>> http://linkeddata.deri.ie/
>>>> http://sw-app.org/about.html
>>>> 
>>>> 
>>> 
>>> 
>> 
>> ------------------------------------------------------------
>> IHMC                                     (850)434 8903 or (650)494 3973   
>> 40 South Alcaniz St.           (850)202 4416   office
>> Pensacola                            (850)202 4440   fax
>> FL 32502                              (850)291 0667   mobile
>> phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes
>> 
>> 
>> 
>> 
>> 
>> 
> 
> 
> ----
> Ivan Herman, W3C Semantic Web Activity Lead
> Home: http://www.w3.org/People/Ivan/
> mobile: +31-641044153
> PGP Key: http://www.ivan-herman.net/pgpkey.html
> FOAF: http://www.ivan-herman.net/foaf.rdf
> 
> 
> 
> 
> 
> 

Received on Wednesday, 18 May 2011 11:28:32 UTC