- From: Eric Prud'hommeaux <eric@w3.org>
- Date: Thu, 30 Sep 2004 22:17:32 -0400
- To: "Peter F. Patel-Schneider" <pfps@research.bell-labs.com>
- Cc: www-rdf-interest@w3.org
- Message-ID: <20041001021732.GB19027@w3.org>
Not being the author, I will address your points to the best of my understanding. But I would also like to point out that something like a CBD would allow information servers to respond to queries without any specific understanding of the queried object. Clients would be able to expect a certain pattern from such queries. The world would be a little bit more communicative and predictable.

Client C1 wants to know about a resource R1. Server S1 has some graph that involves that R1. S1 will respond with whatever it wants. If it knows the type, it will typically respond with an application-specific graph, for instance, a graph that's particularly suited to describing a foaf:Person. If it doesn't know how to or care to tailor the response, it can send its notion of a generically helpful graph.

The Annotea server responds with a subject query, that is, all the arcs coming from the queried node. Arcs out seemed to be more useful than arcs in, and we never had a compelling reason to do both. In this sense, the Annotea response is a cheaper but less helpful form of a CBD. It worked for our purposes, but a CBD would be more helpful in a bNode-laden graph.

A client programmer that expects either an application-specific response or a CBD can more effectively use the returned data. Without a convention, different services will respond with their own slant on what's helpful. An ontologist may choose to respond with any type arcs coming from R1, plus a cloud of ontology surrounding those types. This information may be helpful for the Protégé user but not the foaf crawler. Another app may choose to respond with arcs-in and arcs-out. Given no convention, the client programmer must deal with all likely responses. With a convention, he/she may expect something at least as rich as a CBD, and maybe more, if the server has a special understanding of R1.
This puts the burden on the Protégé user to ask a special query to get the ontology cloud, but at least the clients know what to reasonably expect and code for.

The main problem *I* see with CBDs is that they favor a particular expression of data, i.e. arcs-out rather than arcs-in. This could bias developers as they may wish to make sure that their data is expressible in a CBD even at some cost to clarity. I think the recipe also needs some text to deal with cyclic graphs of bNodes, but that's a minor point.

On Thu, Sep 30, 2004 at 07:39:32PM -0400, Peter F. Patel-Schneider wrote:
>
> In the DAWG message archive I came across a reference to a W3C member
> submission from Nokia on Concise Bounded Descriptions
> http://www.w3.org/Submission/CBD/.
>
> The notion of Concise Bounded Descriptions (CBD) in this note has a number
> of problems.
>
> The initial description of a CBD is severely underspecified. According to
> the note, ``A [CBD] of a resource is a body of knowledge about that
> resource which does not include any explicit knowledge about any other
> resource which can be obtained separately from the same source.''
>
> Problem 1: Which source?

The query service.

> Problem 2: What is ``explicit'' knowledge?

I'm not sure I would have chosen ``explicit'', but I believe this is the set of arcs-out from a resource which is reached in a CBD traversal. All arcs-out from R1 are included in the CBD. If that graph involves R2 (and R2 isn't a literal or bNode), the client can ask about R2 in a separate request. Thus, arcs-out from R2 are not included in R1's CBD. Perhaps ``minutiae'' would be better?

> Problem 3: What is ``obtain separately''?

Subsequent query.

> Problem 4: A function that always returns nothing satisfies this
> description, as it certainly does not include any knowledge (explicit or
> not) that can be obtained (separately or not) from the same source (or indeed
> any source at all).

Yes, but it is not compliant with the recipe in the specification.
Perhaps the description could be amended to make it more clear, but I wouldn't expect it to stand on its own as the definition.

> The definition of CBD in terms of a procedure on RDF graphs also has
> serious problems.
>
> Problem 5: Given a node in an RDF graph, there is no general way of
> determining which nodes in the graph are co-denotational with that node.
> Consider, for example, the RDF graph:
>    _:a ex:b _:c .
>    _:d ex:e _:f .
> What is the CBD of _:a in this graph?

Being a pragmatist (for which I receive the occasional slap), I would say we are responding with a CBD of what we *do* know about _:a, and thusly return only the first arc. If we later learn that _:a and _:d are the same node, and the client queries again, they get more arcs, but nothing contradictory.

> Problem 6: This definition does not satisfy the initial description of a
> CBD. Consider, for example, the RDF graph:
>    ex:a ex:b ex:c .
>    ex:r rdf:type rdf:Statement .
>    ex:r rdf:subject ex:a .
>    ex:r rdf:predicate ex:b .
>    ex:r rdf:object ex:c .
> the CBD of ex:a in this graph is the graph itself, but it includes explicit
> information about ex:r, a potentially different resource.

I haven't really explored CBDs of reifications. Patrick, do you have any fun use cases for this? Regardless, Peter, do you have any suggested words for Patrick to include the reification arcs in the initial description?

> Problem 7: This definition does not provide enough information to
> distinguish the node from other distinguishable nodes in the graph.
> Consider, for example, the RDF graph:
>    ex:r rdf:type owl:InverseFunctionalProperty .
>    _:a ex:r _:b .
>    _:b ex:r _:a .
>    _:a ex:s "NODE A" .
>    _:b ex:s "NODE B" .
> Then the CBD of _:a in this graph is
>    _:x1 ex:r _:x2 .
>    _:x2 ex:r _:x1 .
> which is the same as the CBD of _:b in this graph but _:a and _:b are
> distinguishable in the graph and thus should have different CBDs.

Yeah, but nothing else solves that either.
They're ambiguous to the server and they're ambiguous to the client. The only additional info that the server has is that there exists in the domain of discourse another bNode. I don't think it's worth telling the client about it.

> (Definition: Two blank nodes, n1 and n2, are indistinguishable in a graph G
> if G with n1 mapped to n2 and n2 mapped to n1 is graph-equal to G (i.e.,
> the sets of triples are exactly the same). Any node is indistinguishable
> from itself. Two literal nodes are indistinguishable if they mean the same
> literal value. All other pairs of nodes are distinguishable.)

-- 
-eric

office: +81.466.49.1170 W3C, Keio Research Institute at SFC,
                        Shonan Fujisawa Campus, Keio University,
                        5322 Endo, Fujisawa, Kanagawa 252-8520
                        JAPAN
        +1.617.258.5741 NE43-344, MIT, Cambridge, MA 02144 USA
cell:   +1.857.222.5741 (does not work in Asia)

(eric@w3.org)
Feel free to forward this message to any list for any purpose other than
email address distribution.
Received on Friday, 1 October 2004 02:17:32 UTC
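
The difference between the Annotea-style subject query and a CBD, as discussed in the message above, can be sketched in a few lines of Python. This is only a rough illustration under stated assumptions, not the recipe from the Nokia submission: the tuple-based triple representation, the helper names (is_bnode, subject_query, cbd) and the sample graph are all made up for the example, and the reification step is left out entirely. A visited set is one possible way to keep the traversal from looping on cyclic graphs of bNodes.

```python
from typing import Set, Tuple

# Assumption: a triple is a (subject, predicate, object) tuple of strings,
# bNodes are written "_:label", and literals carry their own quotes.
Triple = Tuple[str, str, str]


def is_bnode(term: str) -> bool:
    """Hypothetical helper: treat any term starting with '_:' as a bNode."""
    return term.startswith("_:")


def subject_query(graph: Set[Triple], node: str) -> Set[Triple]:
    """All arcs coming out of `node` (the Annotea-style response)."""
    return {t for t in graph if t[0] == node}


def cbd(graph: Set[Triple], node: str) -> Set[Triple]:
    """Arcs out of `node`, followed recursively through bNode objects only.

    Objects that are URIs are not expanded: the client can ask about
    them in a subsequent query. Reifications are not handled here.
    """
    result: Set[Triple] = set()
    visited: Set[str] = set()
    pending = [node]
    while pending:
        current = pending.pop()
        if current in visited:
            continue  # already expanded; guards against bNode cycles
        visited.add(current)
        for triple in subject_query(graph, current):
            result.add(triple)
            if is_bnode(triple[2]):
                pending.append(triple[2])
    return result


# Hypothetical sample graph with a cycle between two bNodes.
GRAPH: Set[Triple] = {
    ("ex:R1", "ex:knows", "ex:R2"),
    ("ex:R1", "ex:address", "_:addr"),
    ("_:addr", "ex:city", '"Fujisawa"'),
    ("_:addr", "ex:region", "_:reg"),
    ("_:reg", "ex:contains", "_:addr"),
    ("ex:R2", "ex:name", '"R2"'),
}

if __name__ == "__main__":
    print(sorted(subject_query(GRAPH, "ex:R1")))
    print(sorted(cbd(GRAPH, "ex:R1")))
```

In this sample graph the subject query for ex:R1 returns only the two arcs leaving ex:R1, while the CBD also pulls in the arcs hanging off the two bNodes. The arc leaving ex:R2 appears in neither, since a client can request it separately.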