RE: problems with concise bounded descriptions

> -----Original Message-----
> From: ext Peter F. Patel-Schneider 
> [mailto:pfps@research.bell-labs.com]
> Sent: 01 October, 2004 15:50
> To: Stickler Patrick (Nokia-TP-MSW/Tampere)
> Cc: www-rdf-interest@w3.org
> Subject: Re: problems with concise bounded descriptions
> 
> 
> From: <Patrick.Stickler@nokia.com>
> Subject: RE: problems with concise bounded descriptions
> Date: Fri, 1 Oct 2004 14:18:41 +0300
> 
> > 
> > 
> > > -----Original Message-----
> > > From: www-rdf-interest-request@w3.org
> > > [mailto:www-rdf-interest-request@w3.org]On Behalf Of ext Peter F.
> > > Patel-Schneider
> > > Sent: 01 October, 2004 02:40
> > > To: www-rdf-interest@w3.org
> > > Subject: problems with concise bounded descriptions
> 
> [...]
> 
> 
> > > The notion of Concise Bounded Descriptions (CBD) in this note 
> > > has a number
> > > of problems.
> > 
> > No doubt. I'm always keen to fix problems.
> > 
> > > The initial description of a CBD is severely underspecified.  
> > > According to
> > > the note, ``A [CBD] of a resource is a body of knowledge 
> about that
> > > resource which does not include any explicit knowledge 
> about any other
> > > resource which can be obtained separately from the same source.''
> > > 
> > > Problem 1:  Which source?
> > 
> > The source of statements (graph) from which the CBD is 
> being extracted.
> 
> OK, but why is this not stated here?  

[[
This document defines a concise bounded description of a resource in terms of an 
RDF graph, as an optimal unit of specific knowledge about that resource to be 
utilized by, and/or interchanged between, semantic web agents.
]]

[[
A concise bounded description can be defined in terms of an RDF graph as follows:

Given a node in an RDF graph...
]]


I'm not quite sure how I could be more explicit...

If you would like to offer better/clearer wording, I'd be happy
to have your suggestions.
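
For illustration only, here is a minimal sketch of extracting a
description from a graph, starting at a given node. It assumes a
simplified rule: take every statement whose subject is the starting
node, then recurse into blank-node objects. It deliberately leaves out
reification and any co-denotation or inference handling, and the names
(Triple, is_blank, simple_cbd) are hypothetical, not taken from the spec
or from any deployed code.

```python
from typing import NamedTuple, Set


class Triple(NamedTuple):
    s: str  # subject
    p: str  # predicate
    o: str  # object


def is_blank(term: str) -> bool:
    # Assumption: blank nodes are written with the "_:" prefix.
    return term.startswith("_:")


def simple_cbd(graph: Set[Triple], start: str) -> Set[Triple]:
    """Collect statements about `start`, following blank-node objects."""
    result: Set[Triple] = set()
    visited: Set[str] = set()
    pending = [start]
    while pending:
        node = pending.pop()
        if node in visited:
            continue
        visited.add(node)
        for t in graph:
            if t.s == node:
                result.add(t)
                if is_blank(t.o):
                    pending.append(t.o)
    return result


graph = {
    Triple("_:a", "ex:b", "_:c"),
    Triple("_:d", "ex:e", "_:f"),
}
cbd = simple_cbd(graph, "_:a")
print(sorted(cbd))
```

Under this simplified rule, a node the graph says nothing about yields
an empty set, and statements that are not reachable from the starting
node are never pulled in.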

> > > Problem 2:  What is ``explicit'' knowledge?
> > 
> > As in "information which is explicit and formally defined", i.e.
> > not in the mind of some human or expressed in a manner, such
> > as using natural language, which requires a human to interpret
> > to figure out, and may give rise to ambiguity, but rather 
> > "expressed in a machine understandable language" such as can
> > be used by automated semantic web agents.
> 
> OK, this is a good definition of explicit, but why is this not stated?


[[
... semantic web agents (at present at least) are not able to deal as 
well with the broad range of possible representations which might be 
associated with a resource; and in nearly all cases, are unable to make 
any use of such representations, as they are typically intended for human 
rather than machine consumption. Semantic web agents, not being anywhere 
near as intelligent as most humans, require information which is explicit 
and formally defined. In short, semantic web agents need concise, bounded 
descriptions of resources, expressed in a machine understandable language, 
rather than seemingly arbitrary representations which to agents are usually 
semantically opaque. 
]]

Again, I'm not sure why that doesn't work for
you, but again, feel free to suggest alternate
and/or additional wording.


> Also, this clarification of ``explicit'' exposes further problems.
> 
> Consider the RDF graph consisting of a single statement
> 
> 	ex:a ex:r ex:b .
> 
> Why is this statement not ``explicit knowledge about [another] resource
> which can be obtained separately from the same source''?  There is no
> reason not to postulate that the source can answer questions like
> ``return all the triples with ex:b as the object''. 

CBD makes no such presumption. The same source (agent) may respond
to all kinds of requests expressed in all kinds of manners, including
more general queries asking e.g. 

   ``return all the triples with ex:b as the object''

I don't see how that has any direct significance to the CBD spec.

> > > Problem 3:  What is ``obtain separately''?
> > 
> > By separate request/query.
> 
> OK, but why is this not stated here?

Because I wrote that part of the spec on a Thursday, and I *never*, 
absolutely *never* write the word "request" on a Thursday (and
when Friday rolled around, I forgot to go back and fix it... ;-)
 
> > > Problem 4:  A function that always returns nothing satisfies this
> > > description, as it certainly does not include any knowledge 
> > > (explicit or
> > > not) that can be obtained (separately or not) from the same 
> > > source (or indeed
> > > any source at all).
> > 
> > I'm sorry. I don't see your point here.
> > 
> > If the agent receiving the request knows nothing about the
> > resource in question, then returning an empty graph is a 
> > pretty clear indication of that.
> 
> The point is that according to your rules, as expressed in 
> your initial
> description of a CBD, my process is in every way better than yours.

And what process is that???

> First, my process actually satisfies your initial description of a CBD
> whereas yours does not.  Second, my process is optimal in that 
> it returns
> the minimum CBD, whereas you don't give any criteria for 
> determining the
> optimality of your process.

I have working, deployed code that demonstrates its utility.

> > > The definition of CBD in terms of a procedure on RDF 
> graphs also has
> > > serious problems.
> > 
> > If it does, then for sure, those should be addressed.
> > 
> > > Problem 5:  Given a node in an RDF graph, there is no 
> general way of
> > > determining which nodes in the graph are co-denotational with 
> > > that node.
> > 
> > Hmmmm.. isn't that what owl:sameAs is for?  
> 
> Not at all.  owl:sameAs tells you that two nodes in a graph are
> co-denotational.  However the absence of owl:sameAs does not 
> tell you that
> two nodes in a graph are not co-denotational.

And black is black and white is white. 

> Consider the following RDF graph:
> 
> 	ex:a ex:r ex:b .
> 
> Are ex:a and ex:b co-denotational?  

How could I know? How could an agent in possession of
the above knowledge know? How is this the least bit
an issue for the CBD definition?

If the agent *somehow* knows that ex:a and ex:b are
co-denotational, then any inquiries about either ex:a
or ex:b should return comparable CBDs. But if it doesn't
know, it doesn't know. Period. How can it tell you
something it doesn't know?

Quick, tell me, what color are my socks?! What? You
don't know? Well then you must be broken, or stupid
or something...

And yes, I agree, for me to suggest, or even think such
a thing is quite unreasonable, unfounded, and I should
be ashamed. Sorry ;-)


> > There is no reason why an agent cannot employ inference when
> > responding to a request for a CBD -- and in fact, this is 
> > precisely what the Nokia Semantic Web Server does, though
> > not full OWL inference (yet).
> 
> Irrelevant, see above.

???

> > The graph from which a CBD is extracted, can be one which
> > contains inferred statements, and need not correspond to the
> > graph corresponding to a particular triples store. 
> 
> Irrelevant, see above.

???
 
> > > Consider, for example, the RDF graph:
> > > 	_:a ex:b _:c .
> > > 	_:d ex:e _:f .
> > > What is the CBD of _:a in this graph?
> > 
> >   _:a ex:b _:c .
> > 
> > Exactly where is the confusion? How would you expect it to be
> > different?
> >
> > I see no statements that would suggest that any of the
> > nodes in the above example graph are co-denotational
> > with any other.
> > 
> > Am I missing something?
> 
> Yes, indeed, you certainly are.  There is, of course, nothing 
> in the graph
> to say that any of the nodes in the graph are 
> co-denotational, but there is
> also nothing in the graph to say that they are not.  Given this
> uncertainty, why should the CBD of _:a *not* include
>  	_:d ex:e _:f .

Whatever you're smoking, can I please have some?  

> > > Problem 6:  This definition does not satisfy the initial 
> > > description of a
> > > CBD.  Consider, for example, the RDF graph:
> > > 	ex:a ex:b ex:c .
> > > 	ex:r rdf:type rdf:Statement .
> > > 	ex:r rdf:subject ex:a .
> > > 	ex:r rdf:predicate ex:b .
> > > 	ex:r rdf:object ex:c .
> > > the CBD of ex:a in this graph is the graph itself, but it 
> > > includes explicit
> > > information about ex:r, a potentially different resource.
> > 
> > OK. I can see a minor issue with some of the wording in that
> > it states that no other named resources will be described in
> > the CBD (presuming that reification resources are anonymous,
> > rather than named) but as the core of the algorithm is quite
> > clear about reification resources, this is a minor editorial
> > issue (that I will address).
> 
> Fine. That may solve this particular example.  What about the issues
> uncovered by your clarification of ``explicit'' above?

What issues ;-)

> > > Problem 7:  This definition does not provide enough information to
> > > distinguish the node from other distinguishable nodes in 
> the graph.
> > > Consider, for example, the RDF graph: 
> > > 	ex:r rdf:type owl:InverseFunctionalProperty .
> > > 	_:a ex:r _:b .
> > > 	_:b ex:r _:a .
> > > 	_:a ex:s "NODE A" .
> > > 	_:b ex:s "NODE B" .
> > > Then the CBD of _:a in this graph is
> > > 	_:x1 ex:r _:x2 .
> > > 	_:x2 ex:r _:x1 .
> > > which is the same as the CBD of _:b in this graph but _:a 
> and _:b are
> > > distinguishable in the graph and thus should have different CBDs.
> > 
> > Well, firstly, while it is possible to obtain a CBD of 
> > an anonymous node, the focus of a CBD is a resource
> > denoted by a URIref (not that I see that it's significant
> > to this, or any of these, examples).
> 
> I don't see any reason to restrict CBDs to URIrefs.  

They aren't.

But interchange of CBDs is (unless you resort to inter-agent
sharing of anonymous node labels, which is OK, but certainly
not a "pure" RDF solution).

> In fact, 
> the greatest
> need I see is for some way of returning the information a source knows
> about an anonymous node in a graph, as in 
>       Return the information about the instances of foaf:Person.
> This will need to return information about any blank nodes that have
> rdf:type foaf:Person.

I've considered adding to the URIQA interface (not the
definition of a CBD) the ability to ask for a description
of a resource by providing (a) the URI of an inverse functional
property and (b) a property value, but that's an issue of the
query interface and *not* the definition of CBDs. Once
the responding agent has identified a node in a graph (even
via a fully general DAWG query!) then it can produce a
CBD of the resource denoted by that node. In the case of
anonymous nodes, it would have to employ some non-RDF
means of indicating which anonymous node denotes the
particular resource in question -- but again, that's a
query interface issue, not an issue with the CBD definition.
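
To illustrate the kind of lookup described above, here is a minimal
sketch of finding a node by an inverse functional property and a value.
It is not the URIQA interface: the function name find_node_by_ifp, the
tuple representation of triples, and the sample data are all
hypothetical, chosen only to show that the lookup is a separate step
that happens before any description is extracted.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

RDF_TYPE = "rdf:type"
OWL_IFP = "owl:InverseFunctionalProperty"


def find_node_by_ifp(graph: Set[Triple], prop: str, value: str) -> List[str]:
    """Return the subjects that have `value` for the property `prop`.

    Assumption: the graph itself declares `prop` to be inverse
    functional; otherwise the lookup is refused.
    """
    if (prop, RDF_TYPE, OWL_IFP) not in graph:
        raise ValueError(f"{prop} is not declared inverse functional")
    return sorted({s for (s, p, o) in graph if p == prop and o == value})


# Illustrative data only.
graph = {
    ("foaf:mbox", RDF_TYPE, OWL_IFP),
    ("_:p1", RDF_TYPE, "foaf:Person"),
    ("_:p1", "foaf:mbox", "mailto:alice@example.org"),
    ("_:p2", RDF_TYPE, "foaf:Person"),
    ("_:p2", "foaf:mbox", "mailto:bob@example.org"),
}
nodes = find_node_by_ifp(graph, "foaf:mbox", "mailto:alice@example.org")
print(nodes)
```

The responding agent could then extract a description starting from
whichever node the lookup returns; the anonymous node label itself stays
local to that agent.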

> > I.e. the beginning point for the extraction of the
> > CBD by the responding agent, possessing the knowledge
> > in question, is a graph node, but the request from
> > one agent to another can only be made in terms of
> > a URIref.
> 
> This would result in a drastic decrease in utility.

My real-world experience has been quite different.

And I'm happy putting my code where my mouth is, and I
do that daily, and the code is available as open source,
and so...  well... I guess that's enough said...

CBDs are not intended to be the only means of querying,
discovering, interchanging, and publishing knowledge.

More general query solutions are necessary, and to that
end, CBDs are not themselves a query solution, but merely
a component of many possible query solutions.

> > This probably could be made clearer in the CBD spec.
> > 
> > Secondly, the CBD of _:a, in the context of the agent
> > receiving/responding to the request would be
> > 
> >   	_:a ex:r _:b .
> >   	_:b ex:r _:a .
> >  	_:a ex:s "NODE A" .
> >  	_:b ex:s "NODE B" .
> >
> > which, yes, entails
> 
> 		Entails?  Where did this come from?
> > 
> >   	_:x1 ex:r _:x2 .
> >   	_:x2 ex:r _:x1 .
> >  	_:x1 ex:s "NODE A" .
> >  	_:x2 ex:s "NODE B" .
> 
> Hmm. In some sense we are both wrong here.  The CBD of _:a is either
> 
>    	_:x1 ex:r _:x2 .
>    	_:x2 ex:r _:x1 .
>   	_:x1 ex:s "NODE A" .
> 
> (if you take ``include only'' to trump ``include all'') or
> 
>    	_:x1 ex:r _:x2 .
>    	_:x2 ex:r _:x1 .
> 
> (if you don't).  I took the former reading, which is probably not as
> reasonable as the latter.
> 
> However, there is no way that 
> 
>   	_:x2 ex:s "NODE B" .
> 
> is in the CBD of _:a, because _:b is the subject of a 
> statement ``where the
> predicate is an owl:InverseFunctionalProperty''.

Huh?! Well, if you're going to just pull statements
out of your, er, ahem, hat ;-) then what's the point
of examples...?!

In the input example graph, *nowhere* is there any statement

  ex:r rdf:type owl:InverseFunctionalProperty .

Or have I gone blind?

If you meant to put it there, fair enough, but since
it isn't, it's a bit hard to debate an alternative
outcome to what I arrived at...

> > neither of which is going to be particularly useful
> > to any requesting agent since (a) it couldn't
> > ask for the CBD of a resource denoted (solely)
> > by an anonymous node in some other agent's
> > knowledge base (since anonymous node labels are
> > system, and hence agent, local) and (b) even
> > if the agent was somehow able to make the request
> > (which it couldn't) it wouldn't know which anonymous
> > node(s) in the response actually denote the resource
> > of interest.
> 
> See above.  The CBD of a blank node is probably of more 
> utility than the
> CBD of a URIref node.

And you can have either -- though it's damn hard (albeit
not impossible) to interchange the CBD of a blank node
between arbitrary SW agents.

> However, a slight expansion of my example gives rise to the 
> same problem
> without taking the CBD of a blank node.
> 
> Consider the following graph:
> 
> 	ex:z ex:p _:a .
>  	ex:r rdf:type owl:InverseFunctionalProperty .

Ahh... *there* it is... ;-)

>  	_:a ex:r _:b .
>  	_:b ex:r _:a .
>  	_:a ex:s "NODE A" .
>  	_:b ex:s "NODE B" .
> 
> The CBD of ex:z in this graph is
> 
> 	ex:z ex:p _:x1 .
>  	_:x1 ex:r _:x2 .
>  	_:x2 ex:r _:x1 .

Correct.
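
For illustration, here is a minimal sketch of one possible reading of
the inverse functional property rule debated in this thread: when the
traversal reaches a blank node that is the subject of statements whose
predicate is declared an owl:InverseFunctionalProperty, only those
identifying statements are included for that node, while the starting
node keeps all of its statements. This rule is an assumption
reconstructed from the exchange, not the wording of the spec, and the
name cbd_with_ifp is hypothetical.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

RDF_TYPE = "rdf:type"
OWL_IFP = "owl:InverseFunctionalProperty"


def cbd_with_ifp(graph: Set[Triple], start: str) -> Set[Triple]:
    # Properties that the graph declares to be inverse functional.
    ifps = {s for (s, p, o) in graph if p == RDF_TYPE and o == OWL_IFP}
    result: Set[Triple] = set()
    visited: Set[str] = set()
    pending = [(start, True)]  # (node, is this the starting node?)
    while pending:
        node, is_start = pending.pop()
        if node in visited:
            continue
        visited.add(node)
        stmts = [t for t in graph if t[0] == node]
        if not is_start:
            # Assumed rule: identifying statements replace the full set.
            identifying = [t for t in stmts if t[1] in ifps]
            if identifying:
                stmts = identifying
        for t in stmts:
            result.add(t)
            if t[2].startswith("_:"):  # follow blank-node objects
                pending.append((t[2], False))
    return result


graph = {
    ("ex:z", "ex:p", "_:a"),
    ("ex:r", RDF_TYPE, OWL_IFP),
    ("_:a", "ex:r", "_:b"),
    ("_:b", "ex:r", "_:a"),
    ("_:a", "ex:s", '"NODE A"'),
    ("_:b", "ex:s", '"NODE B"'),
}
cbd_z = cbd_with_ifp(graph, "ex:z")
print(sorted(cbd_z))
```

Under this reading, the result for ex:z contains the ex:p statement and
the two ex:r statements, with neither of the ex:s literals, which is the
same shape as the three-statement graph quoted above (the blank node
labels are kept rather than renamed).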

Patrick

 

Received on Friday, 1 October 2004 13:25:13 UTC