Re: problems with concise bounded descriptions from Peter F. Patel-Schneider on 2004-10-01 (www-rdf-interest@w3.org from October 2004)

From: Peter F. Patel-Schneider <pfps@research.bell-labs.com>
Date: Fri, 01 Oct 2004 08:49:44 -0400 (EDT)
To: Patrick.Stickler@nokia.com
Cc: www-rdf-interest@w3.org
Message-Id: <20041001.084944.81685008.pfps@research.bell-labs.com>
From: <Patrick.Stickler@nokia.com>
Subject: RE: problems with concise bounded descriptions
Date: Fri, 1 Oct 2004 14:18:41 +0300

> 
> 
> > -----Original Message-----
> > From: www-rdf-interest-request@w3.org
> > [mailto:www-rdf-interest-request@w3.org]On Behalf Of ext Peter F.
> > Patel-Schneider
> > Sent: 01 October, 2004 02:40
> > To: www-rdf-interest@w3.org
> > Subject: problems with concise bounded descriptions

[...]


> > The notion of Concise Bounded Descriptions (CBD) in this note 
> > has a number
> > of problems.
> 
> No doubt. I'm always keen to fix problems.
> 
> > The initial description of a CBD is severely underspecified.  
> > According to
> > the note, ``A [CBD] of a resource is a body of knowledge about that
> > resource which does not include any explicit knowledge about any other
> > resource which can be obtained separately from the same source.''
> > 
> > Problem 1:  Which source?
> 
> The source of statements (graph) from which the CBD is being extracted.

OK, but why is this not stated here?  

> > Problem 2:  What is ``explicit'' knowledge?
> 
> As in "information which is explicit and formally defined", i.e.
> not in the mind of some human or expressed in a manner, such
> as using natural language, which requires a human to interpret
> to figure out, and may give rise to ambiguity, but rather 
> "expressed in a machine understandable language" such as can
> be used by automated semantic web agents.

OK, this is a good definition of explicit, but why is this not stated?

Also, this clarification of ``explicit'' exposes further problems.

Consider the RDF graph consisting of a single statement

	ex:a ex:r ex:b .

Why is this statement not ``explicit knowledge about [another] resource
which can be obtained separately from the same source''?  There is no
reason to postulate that the source cannot answer questions like
``return all the triples with ex:b as the object''. 
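
To make this concrete, here is a minimal sketch of such a query.  It assumes
the source is just an in-memory set of (subject, predicate, object) string
triples; the function name triples_with_object is hypothetical and does not
come from the CBD note.

```python
# The single-statement graph from the example above, as string triples.
graph = {("ex:a", "ex:r", "ex:b")}

def triples_with_object(graph, obj):
    """Return every triple in the graph whose object is `obj`."""
    return {t for t in graph if t[2] == obj}

# A separate query about ex:b returns the very statement that would
# also appear in the description of ex:a.
about_b = triples_with_object(graph, "ex:b")
```

Under this assumption the statement is obtainable separately, by a query
that is about ex:b rather than ex:a.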

> > Problem 3:  What is ``obtain separately''?
> 
> By separate request/query.

OK, but why is this not stated here?

> > Problem 4:  A function that always returns nothing satisfies this
> > description, as it certainly does not include any knowledge 
> > (explicit or
> > not) that can be obtained (separately or not) from the same 
> > source (or indeed
> > any source at all).
> 
> I'm sorry. I don't see your point here.
> 
> If the agent receiving the request knows nothing about the
> resource in question, then returning an empty graph is a 
> pretty clear indication of that.

The point is that according to your rules, as expressed in your initial
description of a CBD, my process is in every way better than yours.
First, my process actually satisfies your initial description of a CBD
whereas yours does not.  Second, my process is optimal in that it returns
the minimum CBD, whereas you don't give any criteria for determining the
optimality of your process.
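
A sketch of the degenerate process, for illustration only.  Both function
names are hypothetical, and mentions_other_resource encodes one deliberately
crude reading of ``knowledge about another resource'' (any triple with a
subject or object other than the requested resource); the note itself does
not define this.

```python
def empty_description(graph, resource):
    """The degenerate process: describe every resource with the empty graph."""
    return set()

def mentions_other_resource(description, resource):
    """Assumed, crude test: does any triple have a subject or object
    other than the requested resource?"""
    return any(s != resource or o != resource for (s, p, o) in description)

graph = {("ex:a", "ex:r", "ex:b")}
result = empty_description(graph, "ex:a")
```

The empty result passes the check vacuously, while the one-triple graph
itself, offered as a description of ex:a, does not.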


> > The definition of CBD in terms of a procedure on RDF graphs also has
> > serious problems.
> 
> If it does, then for sure, those should be addressed.
> 
> > Problem 5:  Given a node in an RDF graph, there is no general way of
> > determining which nodes in the graph are co-denotational with 
> > that node.
> 
> Hmmmm.. isn't that what owl:sameAs is for?  

Not at all.  owl:sameAs tells you that two nodes in a graph are
co-denotational.  However the absence of owl:sameAs does not tell you that
two nodes in a graph are not co-denotational.

Consider the following RDF graph:

	ex:a ex:r ex:b .

Are ex:a and ex:b co-denotational?  
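
The only honest answer is ``unknown''.  A small sketch of that three-valued
answer follows, assuming a graph of string triples and looking only at
directly asserted owl:sameAs and owl:differentFrom statements (no inference).
The function name co_denotational is hypothetical.

```python
def co_denotational(graph, a, b):
    """Return True, False, or None (unknown) for whether a and b denote
    the same resource, using only directly asserted statements."""
    if a == b:
        return True
    for s, p, o in graph:
        if {s, o} == {a, b}:
            if p == "owl:sameAs":
                return True
            if p == "owl:differentFrom":
                return False
    # Nothing asserted either way: the graph does not settle the question.
    return None

graph = {("ex:a", "ex:r", "ex:b")}
answer = co_denotational(graph, "ex:a", "ex:b")
```

For the graph above the result is None, not False.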

> There is no reason why an agent cannot employ inference when
> responding to a request for a CBD -- and in fact, this is 
> precisely what the Nokia Semantic Web Server does, though
> not full OWL inference (yet).

Irrelevant, see above.

> The graph from which a CBD is extracted, can be one which
> contains inferred statements, and need not correspond to the
> graph corresponding to a particular triples store. 

Irrelevant, see above.

> > Consider, for example, the RDF graph:
> > 	_:a ex:b _:c .
> > 	_:d ex:e _:f .
> > What is the CBD of _:a in this graph?
> 
>   _:a ex:b _:c .
> 
> Exactly where is the confusion? How would you expect it to be
> different?
>
> I see no statements that would suggest that any of the
> nodes in the above example graph are co-denotational
> with any other.
> 
> Am I missing something?

Yes, indeed, you certainly are.  There is, of course, nothing in the graph
to say that any of the nodes in the graph are co-denotational, but there is
also nothing in the graph to say that they are not.  Given this
uncertainty, why should the CBD of _:a *not* include
 	_:d ex:e _:f .

> > Problem 6:  This definition does not satisfy the initial 
> > description of a
> > CBD.  Consider, for example, the RDF graph:
> > 	ex:a ex:b ex:c .
> > 	ex:r rdf:type rdf:Statement .
> > 	ex:r rdf:subject ex:a .
> > 	ex:r rdf:predicate ex:b .
> > 	ex:r rdf:object ex:c .
> > the CBD of ex:a in this graph is the graph itself, but it 
> > includes explicit
> > information about ex:r, a potentially different resource.
> 
> OK. I can see a minor issue with some of the wording in that
> it states that no other named resources will be described in
> the CBD (presuming that reification resources are anonymous,
> rather than named) but as the core of the algorithm is quite
> clear about reification resources, this is a minor editorial
> issue (that I will address).

Fine. That may solve this particular example.  What about the issues
uncovered by your clarification of ``explicit'' above?

> > Problem 7:  This definition does not provide enough information to
> > distinguish the node from other distinguishable nodes in the graph.
> > Consider, for example, the RDF graph: 
> > 	ex:r rdf:type owl:InverseFunctionalProperty .
> > 	_:a ex:r _:b .
> > 	_:b ex:r _:a .
> > 	_:a ex:s "NODE A" .
> > 	_:b ex:s "NODE B" .
> > Then the CBD of _:a in this graph is
> > 	_:x1 ex:r _:x2 .
> > 	_:x2 ex:r _:x1 .
> > which is the same as the CBD of _:b in this graph but _:a and _:b are
> > distinguishable in the graph and thus should have different CBDs.
> 
> Well, firstly, while it is possible to obtain a CBD of 
> an anonymous node, the focus of a CBD is a resource
> denoted by a URIref (not that I see that it's significant
> to this, or any of these, examples).

I don't see any reason to restrict CBDs to URIrefs.  In fact, the greatest
need I see is for some way of returning the information a source knows
about an anonymous node in a graph, as in 
      Return the information about the instances of foaf:Person.
This will need to return information about any blank nodes that have
rdf:type foaf:Person.

> I.e. the beginning point for the extraction of the
> CBD by the responding agent, possessing the knowledge
> in question, is a graph node, but the request from
> one agent to another can only be made in terms of
> a URIref.

This would result in a drastic decrease in utility.

> This probably could be made clearer in the CBD spec.
> 
> Secondly, The CBD of _:a, in the context of the agent
> receiving/responding to the request would be
> 
>   	_:a ex:r _:b .
>   	_:b ex:r _:a .
>  	_:a ex:s "NODE A" .
>  	_:b ex:s "NODE B" .
>
> which, yes, entails

		Entails?  Where did this come from?
> 
>   	_:x1 ex:r _:x2 .
>   	_:x2 ex:r _:x1 .
>  	_:x1 ex:s "NODE A" .
>  	_:x2 ex:s "NODE B" .

Hmm. In some sense we are both wrong here.  The CBD of _:a is either

   	_:x1 ex:r _:x2 .
   	_:x2 ex:r _:x1 .
  	_:x1 ex:s "NODE A" .

(if you take ``include only'' to trump ``include all'') or

   	_:x1 ex:r _:x2 .
   	_:x2 ex:r _:x1 .

(if you don't).  I took the former reading, which is probably not as
reasonable as the latter.

However, there is no way that 

  	_:x2 ex:s "NODE B" .

is in the CBD of _:a, because _:b is the subject of a statement ``where the
predicate is an owl:InverseFunctionalProperty''.
 
> neither of which is going to be particularly useful
> to any requesting agent since (a) it couldn't
> ask for the CBD of a resource denoted (solely)
> by an anonymous node in some other agent's
> knowledge base (since anonymous node labels are
> system, and hence agent, local) and (b) even
> if the agent was somehow able to make the request
> (which it couldn't) it wouldn't know which anonymous
> node(s) in the response actually denote the resource
> of interest.

See above.  The CBD of a blank node is probably of more utility than the
CBD of a URIref node.

However, a slight expansion of my example gives rise to the same problem
without taking the CBD of a blank node.

Consider the following graph:

	ex:z ex:p _:a .
 	ex:r rdf:type owl:InverseFunctionalProperty .
 	_:a ex:r _:b .
 	_:b ex:r _:a .
 	_:a ex:s "NODE A" .
 	_:b ex:s "NODE B" .

The CBD of ex:z in this graph is

	ex:z ex:p _:x1 .
 	_:x1 ex:r _:x2 .
 	_:x2 ex:r _:x1 .

which does not provide sufficient information to extract all the
information the server knows about ex:z.  
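
The result can be reproduced with a rough sketch of the extraction procedure
as it is read in this message.  The rule coded here is an assumption
reconstructed from the examples in this thread, not the normative text of the
CBD note: take all statements whose subject is the starting node, follow
blank-node objects recursively, and for a blank node that is the subject of
a statement whose predicate is an owl:InverseFunctionalProperty, keep only
those statements.  Reification handling is omitted, blank node labels are not
renamed, and all names are hypothetical.

```python
IFP = "owl:InverseFunctionalProperty"

def is_blank(term):
    return term.startswith("_:")

def cbd(graph, start):
    """Sketch of CBD extraction under the assumed rule described above."""
    ifps = {s for (s, p, o) in graph if p == "rdf:type" and o == IFP}
    result, seen, todo = set(), set(), [start]
    while todo:
        node = todo.pop()
        if node in seen:
            continue
        seen.add(node)
        outgoing = [t for t in graph if t[0] == node]
        identifying = [t for t in outgoing if t[1] in ifps]
        # Assumed rule: a blank node reached from elsewhere that has
        # IFP statements is described by those statements only.
        if node != start and identifying:
            outgoing = identifying
        for t in outgoing:
            result.add(t)
            if is_blank(t[2]):
                todo.append(t[2])
    return result

graph = {
    ("ex:z", "ex:p", "_:a"),
    ("ex:r", "rdf:type", IFP),
    ("_:a", "ex:r", "_:b"),
    ("_:b", "ex:r", "_:a"),
    ("_:a", "ex:s", '"NODE A"'),
    ("_:b", "ex:s", '"NODE B"'),
}
cbd_z = cbd(graph, "ex:z")
```

Under this assumed rule cbd_z holds three triples, and neither ex:s
statement is among them, so the labels that tell the two blank nodes apart
are lost.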

> Patrick

Peter F. Patel-Schneider
Bell Labs Research
Received on Friday, 1 October 2004 12:43:56 UTC