- From: Yuzhong Qu <yzqu@seu.edu.cn>
- Date: Wed, 23 Nov 2005 10:42:23 +0800
- To: "Joshua Tauberer" <tauberer@for.net>, =?UTF-8?Q?Reto_Bachmann-Gm=C3=BCr? = <reto@gmuer.ch>
- Cc: "semantic-web at W3C" <semantic-web@w3c.org>
That's a good idea. It reminds me of the following: (Sen5) For triples with blank node, they are divided into several groups according to the equivalence closure of a relation, which is defined as follows: if two triples have a same blank node as their ends (the subject or object of the RDF statement), we call the two triples are related with blank node. See a poster at WWW2003 [1] for more details. BTW, what's the formal definition of MSG? where can I find the complete definition of MSG? Yuzhong Qu [1] http://www2003.org/cdrom/papers/poster/p297/p297-qu.html Reto Bachmann-Gmür wrote: > How do I check the 2^N subgraphs when making b lean? So just to go back a step (to the beginning, really)- When checking if, for g=a+b, a entails b, there must be a mapping M from blank nodes only in b (not in a) to nodes such that M 'applied' to b is a subset of a. Let's call those blank nodes variables. Further, no statement in b can have no variables in it. (If there was such a statement s in b, M(b) would still contain s. a and b don't overlap, so s couldn't be in a, so M(b) couldn't be a subset of a.) From this, we can characterize every b by its variables. If n is a variable, then every statement mentioning n is in b. And b contains only those statements in g that mention a variable. Rather than choosing b, we can choose a set of variables and from that generate b. So rather than considering all 2^N subgraphs of g where N is the number of statements (each statement can be in or out of the subgraph), one can just consider the 2^N possible choices of variables where N is the number of blank nodes (each blank node can be in or out of the choice of variables). Then for each of the 2^N permutations of blank nodes, choose the subgraph b that has just the statements that mention a variable. I think that answers your question, but I'll continue rambling a proof about how MSGs get involved: We can make a further restriction on b that it can't be split into two parts b1 and b2 where each variable is in either b1 or b2 but not both. (This is like a MSG where you can't cut off a piece that contains all of the instances of some variables, without taking the whole MSG.) If b can be split that way, then a entails b just when a entails b1 and b2 separately, so we don't need to check b if we check b1 and b2. This gives us the result that if b is a MSG (in which case every blank node in b is a variable and does not appear outside of b), then no superset of b needs to be checked. So the largest chunks that need to be checked are MSGs. We still (so far) need to look at subgraphs of MSGs, which means taking an MSG and looking at its 2^N subgraphs where N is the number of blank nodes within that MSG. But we still only want to look at 'unsplittable' subgraphs. If node x is connected to node z only through node y, and if we take the variables to be x and z but not y, then b can be split into a part containing x and a part containing z, so this was a bad choice of variables. There might be a way to quickly find the unsplittable parts of a MSG using a connectivity map or something... but that needs more thinking. -- - Joshua Tauberer http://taubz.for.net ** Nothing Unreal Exists **
Received on Wednesday, 23 November 2005 02:42:44 UTC