- From: Michael Hausenblas <michael.hausenblas@deri.org>
- Date: Wed, 18 May 2011 12:55:22 +0100
- To: Richard Cyganiak <richard@cyganiak.de>
- Cc: W3C RDB2RDF <public-rdb2rdf-wg@w3.org>
> So unless someone (Ted? Enrico?) can propose a better alternative,
> I'm still in favour of simply not producing triples for NULLs.

+1

Cheers,
Michael
--
Dr. Michael Hausenblas, Research Fellow
LiDRC - Linked Data Research Centre
DERI - Digital Enterprise Research Institute
NUIG - National University of Ireland, Galway
Ireland, Europe
Tel. +353 91 495730
http://linkeddata.deri.ie/
http://sw-app.org/about.html

On 18 May 2011, at 12:28, Richard Cyganiak wrote:

> On 18 May 2011, at 05:49, Ivan Herman wrote:
>> I have the *feeling*, based on all this, that the RDF mapping
>> should not try to force the RDF semantics on this case which seems
>> to have its own SQL meaning. Ie, my feeling is that options (c) or
>> (d) below are the right ones, with a slight preference for the
>> rdb2rdf:NULL.
>
> Thinking more about (c) and (d), I don't think I like them much.
>
> Consider an employee database. The CEO doesn't have a manager and
> isn't assigned to a specific department:
>
> <Person/1> a <Person> .
> <Person/1> <Person.name> "Alice" .
> <Person/1> <Person.role> "CEO" .
> <Person/1> <Person.manager> rdb2rdf:NULL .
> <Person/1> <Person.department> rdb2rdf:NULL .
>
> Now if someone applies a range declaration to the <Person.manager>
> and <Person.department> properties (which seems like a reasonable
> thing to do, regardless of whether that's part of the direct
> mapping), then we'd conclude that:
>
> rdb2rdf:NULL a <Person> .
> rdb2rdf:NULL a <Department> .
>
> And I thought these classes were disjoint ...
>
> It gets worse. Let's say the employee database also records the date
> of birth, but this information is not known for all employees:
>
> <Person/1> <Person.DOB> "1969-11-03"^^xsd:date .
> <Person/2> <Person.DOB> rdb2rdf:NULL .
> <Person/3> <Person.DOB> rdb2rdf:NULL .
>
> Now, it really looks to me like this indicates that Bob and Charlie
> were *born on the same date*! Which is clearly not the intended
> semantics.
>
> So I think (c) and (d) are out.
>
> Pat pointed out that option (b), the use of blank nodes, is
> appropriate in cases like the second, but not in the first case,
> because Alice really doesn't have a manager or department. This
> convinces me that (b) is also out.
>
> Which to me leaves option (a), which multiple WG members expressed a
> preference for to begin with.
>
> So unless someone (Ted? Enrico?) can propose a better alternative,
> I'm still in favour of simply not producing triples for NULLs.
>
> Best,
> Richard
>
>>
>> Thanks Richard for that additional explanation, b.t.w., like Pat, I
>> was not sure what that null value means either...
>>
>> Ivan
>>
>> On May 18, 2011, at 24:01 , Pat Hayes wrote:
>>
>>> Michael & Richard, thanks for this context, which is very useful,
>>> as I really had no idea what a null value is. I am still not quite
>>> sure. It indicates that a (real, ie non-null) value is not
>>> present, OK. But what does this mean? It could mean simply that
>>> data is unavailable, or it could mean that for this particular
>>> individual (denoted by this row) there is no such value. In the
>>> example case, clearly the former seems correct, but I am sure that
>>> there are examples where the latter is the appropriate
>>> interpretation (for example, employees of a company with a column
>>> for accumulated pension savings, with nulls for employees not in
>>> the pension plan.) And unfortunately, the answer to your question
>>> about bnodes is, they would be appropriate in the first case but
>>> not in the second case. This is because the bnode amounts to an
>>> existential assertion: to say
>>>
>>> <Person/ID=1> <Person#date_of_birth> [] .
>>>
>>> is to say that the date of birth *exists*, i.e. there really is a
>>> value, even though the particular value is not recorded. Perhaps
>>> someone who knows more about the use of null values can say
>>> whether such an existential reading is always appropriate.
>>>
>>> There is no natural way to say that a property *has no value* in
>>> RDF. The best one could do is probably to invoke RDFS or OWL class
>>> reasoning, give the property a range, and assert that the
>>> value is outside the range class.
>>>
>>> Hope this helps.
>>>
>>> Pat
>>>
>>>
>>> On May 17, 2011, at 1:56 PM, Richard Cyganiak wrote:
>>>
>>>> Pat, Michael,
>>>>
>>>> I'll add a bit of context.
>>>>
>>>> Given a Person table with columns <ID, name, date_of_birth> and a
>>>> single row <1, Alice, NULL>, the Direct Mapping would currently
>>>> produce:
>>>>
>>>> <Person/ID=1> a <Person> .
>>>> <Person/ID=1> <Person#name> "Alice" .
>>>>
>>>> The question we are considering is if we should produce an
>>>> additional triple. Answers that suggest themselves are:
>>>>
>>>> a) No.
>>>>
>>>> b) <Person/ID=1> <Person#date_of_birth> [] .
>>>>
>>>> c) <Person/ID=1> <Person#date_of_birth> rdb2rdf:NULL .
>>>>
>>>> d) <Person/ID=1> <Person#date_of_birth> "NULL"^^rdb2rdf:NULL .
>>>>
>>>> The specific sub-question that we couldn't agree on during the
>>>> call was whether b) would be consistent with the semantics of
>>>> blank nodes in RDF.
>>>>
>>>> FWIW, below are some quotes from the SQL spec (specifically, from
>>>> Part 1 of the 2006 draft of SQL-2008 that I sent around before.)
>>>>
>>>> Best,
>>>> Richard
>>>>
>>>>
>>>> DEFINITIONS
>>>>
>>>> null value: A special value that is used to indicate the absence
>>>> of any data value. (p5)
>>>>
>>>>
>>>> 4.4.2 The null value
>>>>
>>>> Every data type includes a special value, called the null value,
>>>> sometimes denoted by the keyword NULL. This value differs from
>>>> other values in the following respects:
>>>>
>>>> Since the null value is in every data type, the data type of the
>>>> null value implied by the keyword NULL cannot be inferred; hence
>>>> NULL can be used to denote the null value only in certain
>>>> contexts, rather than everywhere that a literal is permitted.
>>>>
>>>> Although the null value is neither equal to any other value nor
>>>> not equal to any other value - it is unknown whether or not it is
>>>> equal to any given value - in some contexts, multiple null values
>>>> are treated together; for example, the <group by clause> treats
>>>> all null values together. (p15)
>>>>
>>>>
>>>> On 17 May 2011, at 19:13, Michael Hausenblas wrote:
>>>>
>>>>>
>>>>> Pat,
>>>>>
>>>>> In today's telecon we had a discussion regarding ISSUE-41 [1]
>>>>> and would appreciate a short advice from your side concerning
>>>>> the following question:
>>>>> [[
>>>>> Is a blank node an accurate representation of a NULL value from
>>>>> a relational database?
>>>>> ]]
>>>>>
>>>>> Note that this relates to the Direct Mapping (in R2RML one can
>>>>> overwrite the behaviour). We have identified options for dealing
>>>>> with the situation (producing no triple or introducing a bNode
>>>>> representing the NULL value) and would like to hear your opinion
>>>>> on the matter.
>>>>>
>>>>> Tracker, this is my ACTION-131.
>>>>>
>>>>>
>>>>> Cheers,
>>>>> Michael
>>>>>
>>>>> [1] http://www.w3.org/2001/sw/rdb2rdf/track/issues/41
>>>>>
>>>>> --
>>>>> Dr. Michael Hausenblas, Research Fellow
>>>>> LiDRC - Linked Data Research Centre
>>>>> DERI - Digital Enterprise Research Institute
>>>>> NUIG - National University of Ireland, Galway
>>>>> Ireland, Europe
>>>>> Tel. +353 91 495730
>>>>> http://linkeddata.deri.ie/
>>>>> http://sw-app.org/about.html
>>>>>
>>>>
>>>
>>> ------------------------------------------------------------
>>> IHMC                     (850)434 8903 or (650)494 3973
>>> 40 South Alcaniz St.     (850)202 4416   office
>>> Pensacola                (850)202 4440   fax
>>> FL 32502                 (850)291 0667   mobile
>>> phayesAT-SIGNihmc.us     http://www.ihmc.us/users/phayes
>>>
>>
>> ----
>> Ivan Herman, W3C Semantic Web Activity Lead
>> Home: http://www.w3.org/People/Ivan/
>> mobile: +31-641044153
>> PGP Key: http://www.ivan-herman.net/pgpkey.html
>> FOAF: http://www.ivan-herman.net/foaf.rdf
>>
Received on Wednesday, 18 May 2011 11:55:51 UTC
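
The difference between option (a) and option (c) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Direct Mapping as specified: the helper names `direct_map` and `shared_birth_dates`, the `NULL_NODE` constant, and the three-row sample table are all made up for this example, and the IRI shapes simply imitate the `<Person/ID=1>` style used in the thread. The key column itself is not emitted as a triple, matching Richard's two-triple example.

```python
import sqlite3

# Hypothetical sentinel from options (c)/(d) in the thread; not a standard term.
NULL_NODE = "rdb2rdf:NULL"


def direct_map(conn, table, key, null_node=None):
    """Yield (subject, predicate, object) string triples for each row.

    null_node=None    -> NULL cells produce no triple (option a).
    null_node=<node>  -> NULL cells point at that shared node (option c).
    """
    cur = conn.execute(f'SELECT * FROM "{table}" ORDER BY "{key}"')
    columns = [d[0] for d in cur.description]
    for row in cur:
        record = dict(zip(columns, row))
        subject = f"<{table}/{key}={record[key]}>"
        yield (subject, "a", f"<{table}>")
        for column, value in record.items():
            if column == key:
                continue  # the key only appears inside the subject IRI here
            if value is None:  # sqlite3 returns SQL NULL as Python None
                if null_node is None:
                    continue  # option (a): say nothing about this cell
                obj = null_node
            else:
                obj = f'"{value}"'
            yield (subject, f"<{table}#{column}>", obj)


def shared_birth_dates(triples):
    """Group subjects by date_of_birth object; keep objects with 2+ subjects."""
    by_value = {}
    for s, p, o in triples:
        if p == "<Person#date_of_birth>":
            by_value.setdefault(o, []).append(s)
    return {o: subjects for o, subjects in by_value.items() if len(subjects) > 1}


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Person (ID INTEGER PRIMARY KEY, name TEXT, date_of_birth TEXT)"
)
conn.executemany(
    "INSERT INTO Person VALUES (?, ?, ?)",
    [(1, "Alice", "1969-11-03"), (2, "Bob", None), (3, "Charlie", None)],
)

option_a = list(direct_map(conn, "Person", "ID"))
option_c = list(direct_map(conn, "Person", "ID", null_node=NULL_NODE))

print(shared_birth_dates(option_a))
print(shared_birth_dates(option_c))
```

Under option (a) no two people share a date-of-birth object, whereas under option (c) Bob and Charlie both point at the same `rdb2rdf:NULL` node, which is the "born on the same date" reading Richard objects to.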