Re: RDF Semantics, non-lean RDF graphs, and redundancy of content from pat hayes on 2003-12-08 (www-rdf-comments@w3.org from October to December 2003)

From: pat hayes <phayes@ihmc.us>
Date: Mon, 8 Dec 2003 09:51:30 -0600
To: Ossi Nykänen <onykane@butler.cc.tut.fi>
Cc: www-rdf-comments@w3.org
Message-Id: <p06001f0ebbfa488fd715@[10.1.31.1]>
><snip>
>Then to the second part, about blank nodes in modelling:
>
>I still wonder that if blank nodes are not given explicit types
>(in the rdf:type sense) in applications (see e.g. the Primer, figure 6
>http://www.w3.org/TR/rdf-primer/#figure6), SW agents might have problems
>choosing between valid inferences (when RDF graphs get bigger, merged from
>different sources). This inference obviously must take place on top of RDF
>Semantics.

Can you elaborate? I don't quite see why an application would ever 
find itself in the position of needing to *choose between* valid 
inferences.

>Do you know if anyone has studied this? Or does it really matter whether I
>use blank nodes or invent a unique URIref for each node that could be
>modelled also as a blank node?

As to that last question, this is discussed in the remark on 
'skolemization' in the appendix to the semantics document. To a first 
approximation, speaking purely semantically: no, it does not matter. 
Provided you can be certain that no other assertions are made using 
the 'new' URIref, then you can invent and use a URIref to be the name 
of the anonymous entity indicated by the blank node. It's rather like 
the convention we use in natural language where you might be telling 
a story: "Someone - call him "Joe" - was walking down the street,..."
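
As a rough illustration of the naming step described above, here is a 
minimal Python sketch. It assumes triples are plain tuples of strings, 
that blank nodes are the strings starting with "_:", and that the base 
URI http://example.org/skolem/ is free for inventing names; these 
conventions and the function name skolemize are hypothetical, not part 
of any RDF specification or library.

```python
from itertools import count

def is_blank(term):
    # Assumed convention for this sketch: blank nodes look like "_:label".
    return isinstance(term, str) and term.startswith("_:")

def skolemize(graph, base="http://example.org/skolem/"):
    """Return a copy of `graph` with each blank node replaced by a fresh URIref."""
    counter = count(1)
    names = {}  # blank node label -> invented URIref

    def rename(term):
        if not is_blank(term):
            return term
        if term not in names:
            names[term] = f"{base}{next(counter)}"
        return names[term]

    return {(rename(s), p, rename(o)) for s, p, o in graph}

# "Someone (call him Joe) is a child of pat and is 4 years old."
graph = {
    ("ex:pat", "ex:child", "_:b1"),
    ("_:b1", "ex:age", "4"),
}
skolemized = skolemize(graph)
```

Every occurrence of the same blank node receives the same invented name, 
so the shape of the graph is preserved; only the anonymity is lost.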

The difference comes up when you consider what other people can do. 
Suppose you choose to invent URIrefs and use them in your ontology, 
and publish it. Now I can use your URIrefs in *my* ontology, and can 
make assertions about these things, assertions which you might not 
think appropriate. By using a URIref you have given your 'existential 
entities' a public name, so have in effect introduced them into the 
common ground of the Semantic Web, where anyone can make assertions 
about them.  Using blank nodes is a kind of insurance against this, 
if you prefer to keep your entities genuinely anonymous.
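
The difference can be sketched in Python under some simplifying 
assumptions: triples are tuples of strings, blank nodes are the strings 
starting with "_:", and merging two graphs renames blank nodes apart so 
that no blank node is shared between sources. The names merge, 
nodes_with_both, ex:joe and ex:likes are made up for the illustration.

```python
def is_blank(term):
    # Assumed convention for this sketch: blank nodes look like "_:label".
    return isinstance(term, str) and term.startswith("_:")

def merge(*graphs):
    """Merge graphs, renaming blank nodes so that no two graphs share one."""
    merged = set()
    for index, graph in enumerate(graphs):
        def rename(term, index=index):
            return f"_:g{index}_{term[2:]}" if is_blank(term) else term
        merged |= {(rename(s), p, rename(o)) for s, p, o in graph}
    return merged

def nodes_with_both(graph, p1, p2):
    """Subjects that carry both property p1 and property p2."""
    has_p1 = {s for s, p, _ in graph if p == p1}
    has_p2 = {s for s, p, _ in graph if p == p2}
    return has_p1 & has_p2

# My graph, once with a public name and once with a blank node.
mine_named = {("ex:pat", "ex:child", "ex:joe"), ("ex:joe", "ex:age", "4")}
mine_blank = {("ex:pat", "ex:child", "_:x"), ("_:x", "ex:age", "4")}

# Someone else's assertions, attempting to talk about "the same" entity.
theirs_named = {("ex:joe", "ex:likes", "ex:spinach")}
theirs_blank = {("_:x", "ex:likes", "ex:spinach")}

named = nodes_with_both(merge(mine_named, theirs_named), "ex:age", "ex:likes")
blank = nodes_with_both(merge(mine_blank, theirs_blank), "ex:age", "ex:likes")
```

With the URIref, the third-party assertion lands on my entity (named 
contains ex:joe). With blank nodes, the two "_:x" labels are kept apart 
on merging, so blank is empty: nobody else's graph can attach a 
statement to my anonymous node.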

>Note: In principle, an RDF authoring tool could easily do this for me and
>interpretations could just establish "bigger IS sets".

Right, it could perform Skolemization.  Note, however, that this is 
not, strictly speaking, a valid step.
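
One way to see why the step is not valid is to compare the two graphs 
under simple entailment: the Skolemized graph entails the original, but 
the original does not entail the Skolemized one, because the invented 
name is a new claim. The following brute-force Python check is only a 
sketch for tiny graphs. It assumes triples are tuples of strings and 
blank nodes start with "_:", and it tests whether some instance of one 
graph is a subgraph of the other; the function name simply_entails and 
the URIref ex:sk1 are hypothetical.

```python
from itertools import product

def is_blank(term):
    # Assumed convention for this sketch: blank nodes look like "_:label".
    return isinstance(term, str) and term.startswith("_:")

def simply_entails(g, e):
    """True if some mapping of e's blank nodes onto terms of g
    turns e into a subgraph of g (brute force, small graphs only)."""
    blanks = sorted({t for s, _, o in e for t in (s, o) if is_blank(t)})
    terms = sorted({t for s, _, o in g for t in (s, o)})
    for choice in product(terms, repeat=len(blanks)):
        mapping = dict(zip(blanks, choice))
        instance = {(mapping.get(s, s), p, mapping.get(o, o)) for s, p, o in e}
        if instance <= g:
            return True
    return False

original = {("ex:pat", "ex:child", "_:b1"), ("_:b1", "ex:age", "4")}
skolemized = {("ex:pat", "ex:child", "ex:sk1"), ("ex:sk1", "ex:age", "4")}

forward = simply_entails(skolemized, original)   # True
backward = simply_entails(original, skolemized)  # False
```

The original graph only says that something exists which is a child of 
pat and has age 4; it says nothing about an entity called ex:sk1, so 
going from the original to the Skolemized graph adds information.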

>  Since in a formal
>interpretation, a blank node is (existentially) matched against the things
>in IR, this seems to have little effect (it shifts the problem into
>defining the other sets and mappings in I).
>
>Do you think it is better to model e.g. persons with blank nodes (and do
>the identifying indirectly by defining a sufficient amount of properties)
>than making naming conventions like exstaff:85740 (parallel and redundant,
>in practice)? The RDF Primer seems to think this way(?)

I have no idea how to determine what is better. I tend to think in 
blank-node terms, myself, I have to confess. I think of a URIref as a 
blank node with a name attached. But this is only my own way of 
thinking, I hasten to add.

>However, it seems that the blank node approach (where "things are
>identified by their attributes") might provide fragile applications since
>it effectively assumes that the data (identifying attributes) is 100%
>correct

Well, we all make that assumption. A mistyped URIref might do all 
kinds of damage.

>(a mistyped or changed property will break the denotation and
>prohibit integrating data).
>
>In addition, suppose I'm asserting the following:
>
><#pat> <#child> [ <#age> "4" ] , [ <#age> "3" ].

? What does that mean? That the age is either 3 or 4? There is no way 
to exactly model this in RDFS, though an ALT container would probably 
do as a pragmatic device. Conforming RDF engines are not obliged to 
follow the intended semantics of ALT, however.  (It is quite tricky 
to implement; disjunctions raise the complexity level of inference to 
new heights.)

In OWL you could use a oneOf construction to define the class { 3,4 } 
and say that age was in that class, perhaps using rdfs:range.

>How would you meaningfully model this in RDF Schema or OWL (write a schema
>or an ontology to dictate its use)? And, if one wishes to write neutral
>structures in RDF, why not use containers or collections? (Are they too
>cumbersome for this ;-)

They are certainly cumbersome, indeed. I'm not sure what you mean by 'neutral'.

>On the other hand, coming up with unique names (URIrefs) for e.g. persons,

Persons, sure. But suppose you want to say that all people have 10 
toes, 8 fingers and 2 thumbs. Do you want to have to come up with a 
name for every human digit?

>allows combining partial data from multiple sources: non-valid entailments
>(e.g. due to dated properties)  "do not drop out the important subjects in
>conclusions". (Obviously it is not safe to assume that RDF graphs are
>always "up to date".)

Being up-to-date is a whole other matter that cuts across this issue 
of URIrefs vs blank nodes. We deliberately postponed this issue to 
the future as being beyond the WG charter. Mixing existentials and 
time is known to be tricky, since things can cease to exist and come 
into existence. URIrefs have a similar basic problem, however.

>This discussion around the second issue is a bit vague (I'm working on
>some stuff in order to make it a bit more concrete)

Look forward to reading it.

Pat Hayes

>  and doesn't have to
>affect RDF Semantics as such. However, in my opinion, it affects the way
>e.g. blank nodes are used in modelling (and thus eventually comes back to
>formal semantics as well).
>
>Best regards,
>
>--Ossi


-- 
---------------------------------------------------------------------
IHMC	(850)434 8903 or (650)494 3973   home
40 South Alcaniz St.	(850)202 4416   office
Pensacola			(850)202 4440   fax
FL 32501			(850)291 0667    cell
phayes@ihmc.us       http://www.ihmc.us/users/phayes
Received on Monday, 8 December 2003 10:51:34 UTC