Re: Representing NULL in RDF from Sven R. Kunze on 2013-06-05 (public-lod@w3.org from June 2013)

From: Sven R. Kunze <sven.kunze@informatik.tu-chemnitz.de>
Date: Wed, 05 Jun 2013 15:42:22 +0200
To: public-lod@w3.org
Message-ID: <20130605154222.Horde.haVytJe82EenH03Tx934Jw9@mail.tu-chemnitz.de>
Hi Jan,

here are some ideas I would like to share with you:

> I was doing some comparison of relational databases and Linked Data  
> and ran into the problem of representing an equivalent of database  
> NULL in RDF.
Interesting starting point, as relational databases actually do not
support your usage scenarios out of the box. A lot of DB designers
end up re-inventing the wheel here. So maybe it's good to discuss
a default way of handling that issue in RDF.

> I'm aware of the open world assumption in RDF, but NULL or a missing  
> value can have several interpretations, for example:
As far as I know, there is no such thing as a NULL value in RDF or in  
the datatypes of XML Schema.
rdf:nil is used exclusively for ending rdf:Lists.
A missing value, as Pat and others already pointed out, carries the  
meaning of a missing piece of information. However, it might be  
sensible to introduce new concepts in RDF or related vocabularies to  
represent the domain-agnostic use-cases below.


1)
> - value not applicable (the attribute does not exist or make sense  
> in the context)
Maybe there is a slight misconception here about whether it is the
value or the property that is not applicable. I for one believe we are
talking about the relationship between the current instance (subject
:s) and a property :p. (This is at least one <<context>> I could think
of; maybe there are more.)

A One could model this piece of information like this:
:s rdfs:inapplicableProperty :p.
(Given that RDFS incorporates inapplicableProperty.)

B Another way of modelling would be
:s :p rdfs:inapplicable.
(Given that RDFS incorporates inapplicable)

I would prefer variant A, as the relationship is between the
current subject and a schema element (the property) and not between
the subject and a non-existent value.
A schema could even define rdfs:inapplicableProperty statements for
constant instances in order to prevent or detect misuse.
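
A minimal sketch of how a consumer might read variant A, assuming
triples are held as plain Python tuples. rdfs:inapplicableProperty is
the hypothetical term from above (it is not part of RDFS), and :q, :t
and the lookup helper are made up for illustration:

```python
# Triples as plain (subject, predicate, object) tuples; no RDF library needed.
# rdfs:inapplicableProperty is the hypothetical term proposed above, not real RDFS.
INAPPLICABLE = "rdfs:inapplicableProperty"

graph = {
    (":s", ":q", "some value"),
    (":s", INAPPLICABLE, ":p"),   # variant A: :p does not apply to :s
    (":t", ":q", "other value"),  # :t says nothing at all about :p
}

def lookup(graph, subject, prop):
    """Return ("value", objects), ("inapplicable", []) or ("missing", [])."""
    values = sorted(o for s, p, o in graph if s == subject and p == prop)
    if values:
        return ("value", values)
    if (subject, INAPPLICABLE, prop) in graph:
        return ("inapplicable", [])
    return ("missing", [])

print(lookup(graph, ":s", ":p"))
print(lookup(graph, ":t", ":p"))
```

With variant A the ordinary lookup for :p stays untouched; the extra
statement is only consulted when no value is found.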


2)
> - value uknown (it should be there but the source doesn't know it)
Actually that piece of information could be written down in an RDF
Schema graph like this:

#schema
:A a rdfs:Class.
:p a rdf:Property; a rdfs:RequiredProperty; rdfs:domain :A.

#instance
:x a :A; :p :y. # << :x carries the required property
:z a :A. # << :z does not carry required property

The point here is that instances cannot "decide" whether or not they
have to carry a property. The fact that :z should carry a :p
property but doesn't consists of two distinct pieces of information:
     - :z should carry :p <<< schema information
     - :z does not carry :p <<< instance information
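
A rough sketch of how the two pieces could be combined to find the
"unknown" values, again with triples as plain tuples.
rdfs:RequiredProperty is the hypothetical class assumed above, and the
check treats rdfs:domain as a closed-world constraint, which is an
assumption of this sketch and not how RDFS inference normally reads it:

```python
REQUIRED = "rdfs:RequiredProperty"  # hypothetical class, as assumed above

schema = {
    (":A", "rdf:type", "rdfs:Class"),
    (":p", "rdf:type", "rdf:Property"),
    (":p", "rdf:type", REQUIRED),
    (":p", "rdfs:domain", ":A"),
}
instances = {
    (":x", "rdf:type", ":A"),
    (":x", ":p", ":y"),
    (":z", "rdf:type", ":A"),
}

def missing_required(schema, instances):
    """List (instance, property) pairs where a required property is absent."""
    required = {s for s, p, o in schema if p == "rdf:type" and o == REQUIRED}
    domains = {(s, o) for s, p, o in schema
               if p == "rdfs:domain" and s in required}
    missing = set()
    for prop, cls in domains:
        members = {s for s, p, o in instances if p == "rdf:type" and o == cls}
        carriers = {s for s, p, o in instances if p == prop}
        missing |= {(m, prop) for m in members - carriers}
    return sorted(missing)

print(missing_required(schema, instances))
```

Nothing has to be added to the instance data for this: the schema
says what is required, and the absence is read off the instances.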


3)
> - value doesn't exist (e.g. year of death for a person alive)
I am not quite sure whether this variant is a distinct one or whether
it falls into 1) or 2).
Maybe that use case is inappropriate for explaining what you really
mean by "doesn't exist".
I tend to say that such person instances could carry
rdfs:inapplicableProperty statements for the properties in question.


4)
> - value is witheld (access not allowed)
Interesting case, I'd say. I think here we should talk about access
granularity. First, I'd like to see a real usage scenario. Second, I
have a security concern: is it really necessary to tell an untrusted
requester whether we don't have that information or whether we do
not want to give it to him? Third, taking that into consideration,
what could a trustworthy requester do with such information?

Besides such considerations, I think in this case we should rethink
the way we deliver data. It is not really the subject :s which is in a
relationship :p to a value that signifies "being-withheld", so:
:x :p rdfs:noAccess, rdfs:withheld, ...
doesn't seem appropriate to me.

It is the requester that influences the delivered data, not the subject:
@prefix sec: <...> .  # <<< security namespace

[]  a sec:PropertyWithheld;
        sec:requester <....>;
#    sec:instance :s;  << optional, as generally all subjects with that
#                         particular property can be withheld
        sec:property :p.
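
A sketch of the delivery side of this idea, assuming the server keeps
a simple (requester, property) policy. The sec: terms are the
hypothetical vocabulary from above; the deliver function, the announce
switch and the requester names are made up for illustration:

```python
WITHHELD = "sec:PropertyWithheld"  # hypothetical vocabulary from the sketch above

data = {(":s", ":p", "secret"), (":s", ":q", "public")}
# (requester, property) pairs that must not be delivered
policy = {("requester1", ":p")}

def deliver(data, policy, requester, announce=False):
    """Filter the data per requester; optionally state what was withheld."""
    hidden = {prop for req, prop in policy if req == requester}
    visible = {t for t in data if t[1] not in hidden}
    notes = set()
    if announce:
        # One blank node per withheld property, attached to the requester
        # and the property, never to the subject :s itself.
        for i, prop in enumerate(sorted(hidden)):
            node = f"_:w{i}"
            notes |= {
                (node, "rdf:type", WITHHELD),
                (node, "sec:requester", requester),
                (node, "sec:property", prop),
            }
    return visible | notes

print(sorted(deliver(data, policy, "requester1", announce=True)))
```

Leaving announce switched off corresponds to not telling an untrusted
requester whether anything was withheld at all.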

----------------------------------------------------------------

What about composite values like in
:foo :aProp [a :nullableValue; rdf:value "value"] ;
            :bProp [a :nullableValue; :reason :notAvailable ].

First of all, I do not really know why we have to merge schema  
information into instance data.
Second, there is a second level of graph traversal, which I would
prefer to avoid as it clutters up queries and the triple store.
Third, most of your given examples are schema design issues (there
might be more examples) that can be solved without introducing a
clumsy composite value.
Fourth, composite values disturb the data design when there are
"normal" values which can be expressed via "normal" literals, URIs,
bnodes. That is, the access queries differ from property to property,
which should be avoided as it complicates things a lot.
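
To illustrate the second and fourth points, a small sketch with
triples as plain tuples. The :name property and the helper functions
are made up for illustration; :aProp, :bProp and :nullableValue are
taken from the composite example above:

```python
graph = {
    (":foo", ":name", "Foo"),                 # plain literal
    (":foo", ":aProp", "_:b1"),               # composite value
    ("_:b1", "rdf:type", ":nullableValue"),
    ("_:b1", "rdf:value", "value"),
    (":foo", ":bProp", "_:b2"),               # composite value without rdf:value
    ("_:b2", "rdf:type", ":nullableValue"),
    ("_:b2", ":reason", ":notAvailable"),
}

def objects(graph, subject, prop):
    return sorted(o for s, p, o in graph if s == subject and p == prop)

def plain_value(graph, subject, prop):
    """One hop: the object is the value."""
    found = objects(graph, subject, prop)
    return found[0] if found else None

def composite_value(graph, subject, prop):
    """Two hops: follow the intermediate node, then read rdf:value."""
    for node in objects(graph, subject, prop):
        found = objects(graph, node, "rdf:value")
        if found:
            return found[0]
    return None

print(plain_value(graph, ":foo", ":name"))
print(composite_value(graph, ":foo", ":aProp"))
```

A client has to know per property which of the two readers to use;
applying the plain one to :aProp only yields the intermediate node.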

Cheers,
Sven
Received on Friday, 7 June 2013 06:41:03 UTC