- From: Joshua Tauberer <tauberer@for.net>
- Date: Mon, 21 Nov 2005 11:39:25 -0500
- To: Reto Bachmann-Gmür <reto@gmuer.ch>
- CC: Andreas Andreakis <andreas.andreakis@gmx.de>, semantic-web at W3C <semantic-web@w3c.org>

Reto Bachmann-Gmür wrote: > sound reasonable, still quite insecure about terminology, would this be > the same to say "... g rdf-entails g1 and g1 rdf-entails g"? Yeah, I think so. > I have been looking at the concept of "rdf molecules" [1] which seems to > be very close to MSG. Looks very similar. I'll have to read it more carefully later. > The idea sounds good, it seems to reduce the problem to > subgraph-isomorphism (check if b is a subgraph of (g-b)) for which I > guess an algorithm similar to the one described by Jeremy Carroll for > graph-isomorphism [2] could be used. From a quick skim, it looks that that's only applicable for (graph-graph) isomorphism rather than entailment (in this case whether there is a subgraph isomorphic to another graph). You could use any SPARQL implementation, really. The blank nodes are the variables. > However I don't see how your algorithm would successfully be applied to > g: > [ > foaf:knows [ foaf:name "Jo]; > foaf:knows [ foaf:name "Jo] > ]. Right, I was wrong to only allow b to be a MSG. So it's more complicated. Here's a second try: Find all subgraphs b of g such that (g-b) entails b, which is to say that there is a mapping M from blank nodes *only* in b to nodes in (g-b) such that (g-b) contains M(b). The difference is that if b contains a blank node that is also in (g-b), then it's effectively a constant when comparing b to (g-b). Hopefully that's right. b doesn't have to be a MSG, but some other constraints fall out of that. b still has to have all of the statements about some blank node so that M does some mapping. And every statement in b must contain a blank node only in b. (A statement can't be in both b and (g-b), so M has to do some mapping for each statement so that M(b) might possibly be a subset of (g-b).) Before there were at most N b's to look at (at most one MSG per blank node, some sharing MSGs), and now there are possibly 2^N (each b contains all of the statements, or not all, of each blank node). Also the only b's worth considering are connected graphs (every node is reachable from every other node), since otherwise b is just the union of two smaller graphs which could be considered on their own. I think that this means that all of the b's worth considering are subgraphs of MSGs, and further only need to be entailed by the MSG containing b (and not all of (g-b)), which narrows down things a lot. So a new algorithm would be to find all MSGs b, make b lean by checking all 2^N relevant subgraphs (except the N here is just the blank nodes in b; or something more efficient), and then checking if this new b is entailed by (g-b). If this is right. -- - Joshua Tauberer http://taubz.for.net ** Nothing Unreal Exists **

Received on Monday, 21 November 2005 16:40:35 UTC