Re: RDF Semantics, non-lean RDF graphs, and redundancy of content from Ossi Nykänen on 2003-12-08 (www-rdf-comments@w3.org from October to December 2003)

From: Ossi Nykänen <onykane@butler.cc.tut.fi>
Date: Mon, 8 Dec 2003 12:30:46 +0200 (EET)
To: pat hayes <phayes@ihmc.us>
Cc: www-rdf-comments@w3.org
Message-ID: <Pine.GSO.4.58.0312081223280.16318@butler.cc.tut.fi>

You are right again. And I don't think that blank nodes (as such) are a
difficult concept.

However, I am talking about two things and haven't been presenting them
in a very orderly fashion. I'll try to do better in this letter.

The first is the Semantics point of view (which is ok, I'm asking all
these questions since I really do value your comments). The second issue
is using blank nodes in modelling.

I'm afraid I wasn't putting sufficient effort into my terminology, so let
us first consider a concrete example (to calibrate what we were talking
about and close the first case), and then come back to the modelling
issue. (Perhaps not an issue in the W3C way.)

Consider I:

V  = { rdf:type, ex:pat }
IR = LV union {pat27, brian16, is-a, cHuman}
IP = {is-a}
IEXT = { <is-a, { <pat27,cHuman> } > }
IS = { <rdf:type,is-a>, <ex:pat,pat27> }
IL = {}

...and G1:

G1: {
  ex:pat rdf:type ex:human .
  _:x rdf:type ex:
}

Now, to evaluate the truth value of I(G1), we have to establish a mapping
from the blank nodes to IR.

Potential mappings come from the set

A' = { <_:x,pat27>, <_:x,brian16>, <_:x,is-a>, <_:x,cHuman>, ... }

Since there exists <_:x,pat27> in A', we may conclude that I(G1) is true.

This also explains how it is possible to say that I assigns no denotation
to _:x. (However, I effectively does select IR, and thus induces A', but
that's irrelevant.)

So there is nothing wrong in the (notion of the) formal semantics.
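
The evaluation above can be sketched in a few lines of Python. This is only
an illustration, not a reference implementation of RDF Semantics. Two things
in it are assumptions that do not appear in the example as written: IS is
extended so that ex:human denotes cHuman, and the object of the second
triple of G1 is taken to be ex:human. LV is left out because G1 contains no
literals.

```python
from itertools import product

# Interpretation I from the message (LV omitted: G1 has no literals).
IR = {"pat27", "brian16", "is-a", "cHuman"}
IP = {"is-a"}
IEXT = {"is-a": {("pat27", "cHuman")}}
IS = {
    "rdf:type": "is-a",
    "ex:pat": "pat27",
    "ex:human": "cHuman",  # ASSUMPTION: not listed in V or IS in the message
}

# G1. The object of the second triple is an ASSUMPTION made for this sketch.
G1 = [
    ("ex:pat", "rdf:type", "ex:human"),
    ("_:x", "rdf:type", "ex:human"),
]


def is_blank(term):
    return term.startswith("_:")


def denote(term, A):
    """Resource denoted by a term under I extended by mapping A, or None."""
    return A.get(term) if is_blank(term) else IS.get(term)


def triple_true(triple, A):
    s, p, o = (denote(t, A) for t in triple)
    if s is None or p is None or o is None:
        return False  # a term without a denotation makes the triple false
    return p in IP and (s, o) in IEXT.get(p, set())


def satisfying_mappings(graph):
    """All mappings A from the graph's blank nodes to IR that make it true."""
    blanks = sorted({t for triple in graph for t in triple if is_blank(t)})
    found = []
    for values in product(sorted(IR), repeat=len(blanks)):
        A = dict(zip(blanks, values))
        if all(triple_true(t, A) for t in graph):
            found.append(A)
    return found


witnesses = satisfying_mappings(G1)
print(witnesses)  # [{'_:x': 'pat27'}]
```

I(G1) is true exactly when the list of witnesses is non-empty. I itself never
assigns anything to _:x; the candidate mappings are generated from IR alone.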

---

Then to the second part, about blank nodes in modelling:

I still wonder whether, if blank nodes are not given explicit types
(in the rdf:type sense) in applications (see e.g. the Primer, figure 6
http://www.w3.org/TR/rdf-primer/#figure6), SW agents might have problems
choosing between valid inferences (when RDF graphs get bigger, merged from
different sources). This inference obviously must take place on top of RDF
Semantics.

Do you know if anyone has studied this? Or does it really matter whether I
use blank nodes or invent a unique URIref for each node that could be
modelled also as a blank node?

Note: In principle, an RDF authoring tool could easily do this for me and
interpretations could just establish "bigger IS sets". Since in a formal
interpretation, a blank node is (existentially) matched against the things
in IR, this seems to have little effect (it shifts the problem into
defining the other sets and mappings in I).

Do you think it is better to model e.g. persons with blank nodes (and do
the identifying indirectly by defining a sufficient number of properties)
than to adopt naming conventions like exstaff:85740 (parallel and redundant,
in practice)? The RDF Primer seems to think this way(?)

However, it seems that the blank node approach (where "things are
identified by their attributes") might lead to fragile applications since
it effectively assumes that the data (identifying attributes) is 100%
correct (a mistyped or changed property will break the denotation and
prevent integrating data).

In addition, suppose I'm asserting the following:

<#pat> <#child> [ <#age> "4" ] , [ <#age> "3" ].

How would you meaningfully model this in RDF Schema or OWL (write a schema
or an ontology to dictate its use)? And, if one wishes to write neutral
structures in RDF, why not use containers or collections? (Are they too
cumbersome for this ;-)

On the other hand, coming up with unique names (URIrefs) for e.g. persons
allows combining partial data from multiple sources: invalid entailments
(e.g. due to dated properties) "do not drop out the important subjects in
conclusions". (Obviously it is not safe to assume that RDF graphs are
always "up to date".)

This discussion around the second issue is a bit vague (I'm working on
some stuff in order to make it a bit more concrete) and doesn't have to
affect RDF Semantics as such. However, in my opinion, it affects the way
e.g. blank nodes are used in modelling (and thus eventually comes back to
formal semantics as well).

Best regards,

--Ossi
Received on Monday, 8 December 2003 05:30:50 UTC