Re: representing null in semantic frameworks from Frank Manola on 2007-10-20 (semantic-web@w3.org from October 2007)

From: Frank Manola <fmanola@acm.org>
Date: Sat, 20 Oct 2007 15:26:48 -0400
To: Garret Wilson <garret@globalmentor.com>
Cc: Story Henry <henry.story@bblfish.net>, Semantic Web <semantic-web@w3.org>
Message-Id: <57232581-B84C-43DA-A343-D921551DCA7C@acm.org>
Garret--

A few comments:


On Oct 20, 2007, at 12:40 PM, Garret Wilson wrote:

>
> Right. rdf:nil is an instance of rdf:List that is used to say  
> something like, "the next list of this linked list is really no  
> list at all" (i.e. L rdf:rest rdf:nil; see  
> <http://www.w3.org/TR/rdf-schema/#ch_nil>).
>
> So in other words RDF created a special null value that is only  
> valid for use with rdf:List. Does anyone know of any meeting  
> minutes or other documents that unveil why the WG didn't create a  
> more general null that you could use for anything?

Not exactly.  rdf:nil doesn't represent "no list at all", it  
represents the empty list (see RDF Semantics).  I'm not trying to  
play with words here.  The empty set is a basic idea in set theory,  
and is a genuine set (not the absence of a set), and this is a  
similar idea applied to lists.  To put it another way, rdf:nil, when  
used as the value of rdf:rest, doesn't say "the rest of this list  
isn't really a list", but rather "the rest of this list is empty" (or  
"there aren't any more members in this list").  It seems to me this  
makes more sense, since it represents a legitimate value of the type  
(lists in this case), rather than adding an "extra-type" value to  
every type (i.e., a general null would mean that, for example, if a  
property is defined as having type integer, then it really takes all  
the integers, plus null).  More below.
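
A minimal Python sketch of the distinction above, not anything taken from the RDF specs: the names Nil, Cons, NIL, members and length are hypothetical, chosen only for illustration. The point it tries to show is that the empty list is an ordinary value of the list type, so code that walks a list never needs a special "is this really a list?" check.

```python
from dataclasses import dataclass
from typing import Iterator, Union


@dataclass(frozen=True)
class Nil:
    """The empty list: a genuine list with no members (the role rdf:nil plays)."""


@dataclass(frozen=True)
class Cons:
    """A non-empty list: a first member plus the rest of the list."""
    first: object
    rest: "RdfList"  # always a list, possibly the empty one


RdfList = Union[Nil, Cons]
NIL = Nil()  # hypothetical stand-in for rdf:nil


def members(lst: RdfList) -> Iterator[object]:
    """Yield the members in order; the empty list simply yields nothing."""
    while isinstance(lst, Cons):
        yield lst.first
        lst = lst.rest


def length(lst: RdfList) -> int:
    """Count the members of a list; well defined for the empty list too."""
    return sum(1 for _ in members(lst))


# A two-member list whose final "rest" is the empty list.
scores = Cons(3, Cons(5, NIL))
```

Here length(NIL) is simply 0, and the rest of the last cell is still a list. A general null would instead have to be admitted as an extra value of every type, including the integers stored in first.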

>
> Thanks, Henry.
>
> Garret
>
> Story Henry wrote:
>> There is something close. rdf:Lists terminate with a null I think.
>>
>> Henry
>>
>> On 20 Oct 2007, at 18:15, Garret Wilson wrote:
>>
>>>
>>> As RDF evolved, was there any discussion on adding an rdf:null  
>>> resource---that is, a resource that represents no resource at all?

I know what you're getting at, but a resource that represents no  
resource sounds a little odd.  I mean, it *is* a resource, right?  It  
really needs to mean something other than "no resource".  In the  
relational model, the nulls (in some versions there was more than  
one) typically meant something along the lines of "value unknown" or  
"value inapplicable" (there are many possible variants).  Further on  
this below.

>>>
>>> One expected response: "My child, you're thinking like a  
>>> programmer again---what you really want to do is assert the  
>>> absence of any assertions regarding a particular subject and  
>>> predicate, or you want to assume a closed world and just don't  
>>> assert anything at all", or something like that---and I  
>>> appreciate this point of view to some extent.
>>>
>>> But as a practical matter, let's say we have a list of baseball  
>>> game scores. Wouldn't it be convenient for the resource at index  
>>> 3 to be null to indicate that there was no score that week  
>>> because there was a tornado that canceled the game?
>>>
>>> I'm not necessarily looking for a big online discussion. Just a  
>>> brief pointer to any reading on this subject would help. I'm sure  
>>> there must have been some discussion of null over the development  
>>> history of RDF.


Well, as a practical matter in language design, let's work out our  
use cases a little more carefully :-)  Clearly, a simple null as the  
value of, say, ex:score isn't going to represent "there was no score  
that week because there was a tornado that canceled the game", right?   
That's an awful lot of meaning to stuff into one little null!  So  
first off we're thinking in terms of a separate property or  
properties that describe, say, *why* the score is what it is.  After  
all, if the game were postponed due to the next inning starting  
beyond curfew (happens in Boston anyway) the score might be 5-5 but  
not final (the game would be resumed later), so we'd want to indicate  
that somehow.  On the other hand, considering the game status after  
1/2 inning, the score might be visiting team zero, home team "no  
score" (on the broadcasts they say "coming to bat").  This sort of  
sounds like a null, though not a general one; rather, it is one  
specialized for the type (this could also be handled with some kind  
of "game status" information).  All the proposed uses of nulls that  
I'm familiar with have similar complexities that come out when you  
look at them more carefully.  My general preference would be to deal  
explicitly with the potential specializations, rather than lumping a  
lot of semantics into a general null.  Again, the question of what  
exactly this null would *mean* has to be answered.
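
One way to picture "dealing explicitly with the specializations" is the following Python sketch. It is only an illustration under assumed names (Final, Suspended, Cancelled, describe are all hypothetical and not from any RDF vocabulary): each situation gets its own kind of record carrying exactly the information that applies to it, instead of a single null standing in for all of them.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Final:
    """A completed game with a final score."""
    visitors: int
    home: int


@dataclass(frozen=True)
class Suspended:
    """A game stopped part way (e.g. at curfew); the score is not final."""
    visitors: int
    home: int


@dataclass(frozen=True)
class Cancelled:
    """A game that was never played; there is a reason, but no score."""
    reason: str


GameResult = Union[Final, Suspended, Cancelled]


def describe(result: GameResult) -> str:
    """Spell out what each kind of result means."""
    if isinstance(result, Final):
        return f"final, {result.visitors}-{result.home}"
    if isinstance(result, Suspended):
        return f"suspended at {result.visitors}-{result.home}, to be resumed"
    if isinstance(result, Cancelled):
        return f"cancelled ({result.reason}), no score"
    raise TypeError(f"not a game result: {result!r}")


# The week-3 entry says why there is no score, rather than being a bare null.
season = [Final(4, 2), Final(1, 7), Cancelled("tornado"), Suspended(5, 5)]
```

With this shape, describe(season[2]) yields "cancelled (tornado), no score", and the suspended 5-5 game is distinguishable from a final 5-5 score without any extra convention about what a null means.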

I don't specifically recall a lot of discussion about this in RDF  
Core (someone else might have a better recollection, and perhaps  
there were earlier discussions in developing the 1999 specs).   
However, allow me to point you to the endless discussions (and the  
associated complexity) caused by allowing nulls in the relational  
data model (of which RDF can be considered a specialization).  Simply  
Google "relational null" and have a grand time!  I've had ample  
experience with this issue in that context, and think it's a good  
idea to avoid it here.
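
For a small taste of that complexity, here is a sketch using Python's standard sqlite3 module. The table and column names (game, week, score) are made up for the example, and it shows only SQL's usual three-valued comparison behavior, not any particular proposal discussed in this thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE game (week INTEGER, score INTEGER)")
# Week 3 has no score, recorded as SQL NULL (None on the Python side).
conn.executemany(
    "INSERT INTO game VALUES (?, ?)",
    [(1, 5), (2, 3), (3, None)],
)


def scalar(sql: str):
    """Run a query and return the single value it produces."""
    return conn.execute(sql).fetchone()[0]


# Comparing NULL with NULL is neither true nor false; the result is NULL.
null_equals_null = scalar("SELECT NULL = NULL")

# A row with a NULL score matches neither "score = 5" nor "score <> 5".
is_five = scalar("SELECT COUNT(*) FROM game WHERE score = 5")
not_five = scalar("SELECT COUNT(*) FROM game WHERE score <> 5")
total = scalar("SELECT COUNT(*) FROM game")
```

The two counts cover only two of the three rows: the week with the NULL score drops out of both the "equal" and the "not equal" query, which is the kind of surprise that fuels those discussions.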

--Frank


>>>
>>> Thanks,
>>>
>>> Garret
>>
>
Received on Saturday, 20 October 2007 19:27:07 UTC