RE: problems with concise bounded descriptions from Patrick.Stickler@nokia.com on 2004-10-01 (www-rdf-interest@w3.org from October 2004)

From: <Patrick.Stickler@nokia.com>
Date: Fri, 1 Oct 2004 09:28:31 +0300
To: <eric@w3.org>, <pfps@research.bell-labs.com>
Cc: <www-rdf-interest@w3.org>
Message-ID: <1E4A0AC134884349A21955574A90A7A50A1D93@trebe051.ntc.nokia.com>
> -----Original Message-----
> From: www-rdf-interest-request@w3.org
> [mailto:www-rdf-interest-request@w3.org]On Behalf Of ext Eric
> Prud'hommeaux
> Sent: 01 October, 2004 05:18
> To: Peter F. Patel-Schneider
> Cc: www-rdf-interest@w3.org
> Subject: Re: problems with concise bounded descriptions
> 
> 
> Not being the author, I will address your points to the best of my
> understanding. 

I'm working on a posting to address the issues raised in your
comments to the submission.

I'm also working on my own response to Peter's comments, and will
at the same time digest what you've written here. But first, a brief
comment about the following:

> The main problem *I* see with CBDs is that they favor a particular
> expression of data, i.e. arcs-out rather than arcs-in. This could bias
> developers as they may wish to make sure that their data is
> expressible in a CBD even at some cost to clarity. 

Firstly, CBDs are resource-centric, and meant to be the most concise,
smallest body of knowledge about a particular named resource that
an agent can obtain per a single request, based on the URI denoting
that resource.

Secondly, CBDs are not intended as a replacement for, or an alternative
to, a more general query solution.

Thirdly, CBDs are not intended to be the only possible form of response
to a question "tell me about this thing".

Fourthly, CBDs identify a subset of a graph, and I honestly can't
imagine how that would constrain or influence how a given developer
would express knowledge about resources, since even if CBDs are
provided by some service, there will likely be other means of access
to that information. So I'd need to see some pretty explicit and
motivating use cases before I'm convinced that your main problem with
CBDs is a real issue.

Finally, while there can be some application areas where "arcs-in"
information is useful/necessary, in many applications, it can result
in a huge number of statements in a graph. As an extreme case, consider
a request for a CBD for rdfs:Resource where inference is enabled...
Again, CBDs are not intended to replace a general query facility, or
some other form of "resource view" which would accommodate retrieval
of "arcs-in" knowledge.
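
To make the size difference concrete, here is a minimal sketch, not
taken from the submission. It assumes triples are plain Python tuples
and uses made-up resource names (ex:item0 ... ex:item999); the helper
names arcs_out and arcs_in are hypothetical as well.

```python
# Triples are (subject, predicate, object) tuples; all names are illustrative.

def arcs_out(graph, node):
    """Statements in which `node` is the subject."""
    return [t for t in graph if t[0] == node]

def arcs_in(graph, node):
    """Statements in which `node` is the object."""
    return [t for t in graph if t[2] == node]

# Hypothetical data: with inference enabled, every resource is typed
# as rdfs:Resource, so the class collects one inbound arc per resource.
graph = [(f"ex:item{i}", "rdf:type", "rdfs:Resource") for i in range(1000)]
graph.append(("rdfs:Resource", "rdfs:label", "Resource"))

print(len(arcs_out(graph, "rdfs:Resource")))  # one outbound statement
print(len(arcs_in(graph, "rdfs:Resource")))   # one inbound statement per resource
```

The arcs-out view stays small no matter how many resources exist,
while the arcs-in view grows with the size of the graph.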

> while there are some applications 
> I think the recipe
> also needs some text to deal with cyclic graphs of bNodes, but that's
> a minor point.

Agreed. And thanks for pointing that issue out. I also agree
that it's a minor issue, one that can be fixed with a single check
in the algorithm to avoid infinite loops.

(The present implementation is expressed as inference rules, not
as a linear set of steps, so this problem does not arise there,
which is why it was overlooked.)

Cheers,

Patrick

> On Thu, Sep 30, 2004 at 07:39:32PM -0400, Peter F. Patel-Schneider wrote:
> > 
> > In the DAWG message archive I came across a reference to a W3C member
> > submission from Nokia on Concise Bounded Descriptions
> > http://www.w3.org/Submission/CBD/.
> > 
> > The notion of Concise Bounded Descriptions (CBD) in this note has a
> > number of problems.
> > 
> > The initial description of a CBD is severely underspecified.  According
> > to the note, ``A [CBD] of a resource is a body of knowledge about that
> > resource which does not include any explicit knowledge about any other
> > resource which can be obtained separately from the same source.''
> > 
> > Problem 1:  Which source?
> 
> The query service.
> 
> > Problem 2:  What is ``explicit'' knowledge?
> 
> I'm not sure I would have chosen ``explicit'', but I believe this is
> the set of arcs-out from a resource which is reached in a CBD
> traversal. All arcs-out from R1 are included in the CBD. If that graph
> involves R2 (and R2 isn't a literal or bNode), the client can ask
> about R2 in a separate request. Thus, arcs-out from R2 are not
> included in R1's CBD.
> 
> Perhaps ``minutiae'' would be better?
> 
> > Problem 3:  What is ``obtain separately''?
> 
> Subsequent query.
> 
> > Problem 4:  A function that always returns nothing satisfies this
> > description, as it certainly does not include any knowledge (explicit
> > or not) that can be obtained (separately or not) from the same source
> > (or indeed any source at all).
> 
> Yes, but it is not compliant with the recipe in the
> specification. Perhaps the description could be amended to make it
> clearer, but I wouldn't expect it to stand on its own as the
> definition.
> 
> > The definition of CBD in terms of a procedure on RDF graphs also has
> > serious problems.
> > 
> > Problem 5:  Given a node in an RDF graph, there is no general way of
> > determining which nodes in the graph are co-denotational with that
> > node.  Consider, for example, the RDF graph:
> > 	_:a ex:b _:c .
> > 	_:d ex:e _:f .
> > What is the CBD of _:a in this graph?
> 
> Being a pragmatist (for which I receive the occasional slap), I would
> say we are responding with a CBD of what we *do* know about _:a, and
> thus return only the first arc. If we later learn that _:a and _:d
> are the same node, and the client queries again, they get more arcs, but
> nothing contradictory.
> 
> > Problem 6:  This definition does not satisfy the initial description
> > of a CBD.  Consider, for example, the RDF graph:
> > 	ex:a ex:b ex:c .
> > 	ex:r rdf:type rdf:Statement .
> > 	ex:r rdf:subject ex:a .
> > 	ex:r rdf:predicate ex:b .
> > 	ex:r rdf:object ex:c .
> > the CBD of ex:a in this graph is the graph itself, but it includes
> > explicit information about ex:r, a potentially different resource.
> 
> I haven't really explored CBDs of reifications. Patrick, do you have
> any fun use cases for this? Regardless, Peter, do you have any
> suggested words for Patrick to include the reification arcs in the
> initial description?
> 
> > Problem 7:  This definition does not provide enough information to
> > distinguish the node from other distinguishable nodes in the graph.
> > Consider, for example, the RDF graph: 
> > 	ex:r rdf:type owl:InverseFunctionalProperty .
> > 	_:a ex:r _:b .
> > 	_:b ex:r _:a .
> > 	_:a ex:s "NODE A" .
> > 	_:b ex:s "NODE B" .
> > Then the CBD of _:a in this graph is
> > 	_:x1 ex:r _:x2 .
> > 	_:x2 ex:r _:x1 .
> > which is the same as the CBD of _:b in this graph, but _:a and _:b are
> > distinguishable in the graph and thus should have different CBDs.
> 
> Yeah, but nothing else solves that either. They're ambiguous to the
> server and they're ambiguous to the client. The only additional info
> that the server has is that there exists in the domain of discourse
> another bNode. I don't think it's worth telling the client about it.
> 
> > (Definition: Two blank nodes, n1 and n2, are indistinguishable in a
> > graph G if G with n1 mapped to n2 and n2 mapped to n1 is graph-equal to
> > G (i.e., the sets of triples are exactly the same).  Any node is
> > indistinguishable from itself.  Two literal nodes are indistinguishable
> > if they mean the same literal value.  All other pairs of nodes are
> > distinguishable.)
> -- 
> -eric
> 
> office: +81.466.49.1170 W3C, Keio Research Institute at SFC,
>                         Shonan Fujisawa Campus, Keio University,
>                         5322 Endo, Fujisawa, Kanagawa 252-8520
>                         JAPAN
>         +1.617.258.5741 NE43-344, MIT, Cambridge, MA 02144 USA
> cell:   +1.857.222.5741 (does not work in Asia)
> 
> (eric@w3.org)
> Feel free to forward this message to any list for any purpose other than
> email address distribution.
>
Received on Friday, 1 October 2004 06:29:25 UTC