# Re: blank nodes out the wazoo

From: pat hayes <phayes@ihmc.us>
Date: Fri, 13 Jun 2003 14:20:53 -0500
Message-Id: <p0521060abb0f89259095@[10.0.100.24]>
To: Jeremy Carroll <jjc@hplb.hpl.hp.com>

```
>NP
>
>Take two directed graphs G and H
>
>Construct an RDF Graph R as follows.
>
>Let the nodes of R = V(G) U V(H) U { g, h, x }  (all distinct blank nodes)
>
>Let the triples of R all have predicate rdf:value (which we will omit)
>
>Let the triples of R be E(G) U E(H) U { <g, g'> | g' in V(G) }
>                      U { <h, h'> | h' in V(H) } U { <x, g>, <x, h> }
>
>then if R is irredundant we have that H is not a subgraph of G.

But not necessarily the reverse. For example let G be
<a b> <b c> <c a>
and H be
<d e> <e f> <d f>
then the instance a=d and b=e gives a redundancy (instance which is a
proper sub-RDFgraph) but H and G are not subgraphs of each other.

>The directed subgraph problem is NP complete, which forms the
>interesting 90% of a proof of NP complexity of RDF graph irredunancy.
>
>Your approach neglects the long distance interactions involved in
>blank node structures.

Those seem irrelevant since we can detect instantiation locally.

One does not need to get into the subgraph problem. The only question
you have to ask is, is this triple redundant? The way to find that
out is to see if it can be instantiated into another triple in the
graph, which can take at most one check per other triple. If it can,
then delete it, remember the instance mapping, and start again from
the top. If it can't, try the next triple. You can detect
irredundancy in n|2 checks of one triple instantiating another.

The graph gets smaller each time, so generating an irredundant
subgraph is going to be polynomial at worst (not sure what degree, to
be honest, but I think cubic, maybe n|3 logN or something like that.)

It doesn't matter if you delete triple T1 because it instantiates to
T2, then later also delete T2, since if T2 instantiates to T3 then so
would T1 have done: instance is transitive.

Pat

>Jeremy
>
>
>pat hayes wrote:
>
>>>NO
>>>
>>>The problem of eliminating such blank nodes is plausibly NP complete.
>>>
>>>Certainly GI complete, i.e. exponential.
>>
>>
>>Actually, I think its only cubic.
>>for each bnode B
>>     for each triple T containing B
>>         if there is a triple instantiating T, remove T
>>     loop
>>loop
>>
>>But OK< pretend I never mentioned it.
>>
>>Pat
>>
>>>(See related work on irredunancy of conceptual graphs
>>>[[[ Michel Chien et Marie-Laure Mugnier, Conceptual Graphs:
>>>fundamental notions in Revue d'Intelligence Atificielle, Vol 5, no
>>>4, 1992, pp 365-406.
>>>]]
>>>most applicable to RDFS though)
>>>
>>>Jeremy
>>>
>>>
>>>pat hayes wrote:
>>>
>>>>While fixing a silly mistake in the MT document, I noticed the
>>>>following. We require that RDF graphs contain no redundancies in
>>>>the sense that the same triple cannot occur more than once in the
>>>>graph. However, they can contain redundancies in the sense that a
>>>>triple with a bnode in it can be duplicated with a different
>>>>bnode, even though the resulting triples would look the same in a
>>>>graph diagram. The resulting graph has no extra information in
>>>>it, but this quirk allows an RDF graph to have infinitely many
>>>>consequences. For example, a single triple
>>>>
>>>>a p b .
>>>>
>>>>has infinitely many consequences;
>>>>
>>>>_:x p b .
>>>>a p _:y .
>>>>_:z1 p _:z2 .
>>>>_:z3 p _:z4 .
>>>>-:z5 p _:z6 .
>>>>....
>>>>
>>>>where all these bnodes are distinct; see attached jpeg.
>>>>
>>>>My question is, does the WG feel that it might be worth ruling
>>>>this out as a syntactic possibility? If this kind of
>>>>bnode-duplication were ruled out, then the set of graphs simply
>>>>entailed by any RDF graph would be finite.  That would generalize
>>>>the no-duplicate-triples condition implicit in our definition of
>>>>a graph as a set, to treat triples which 'look' the same when you
>>>>erase the bnode labels as though they literally were the same.
>>>>
>>>>Pat

--
---------------------------------------------------------------------
IHMC	(850)434 8903 or (650)494 3973   home
40 South Alcaniz St.	(850)202 4416   office
Pensacola			(850)202 4440   fax
FL 32501			(850)291 0667    cell
phayes@ihmc.us       http://www.ihmc.us/users/phayes
```
Received on Friday, 13 June 2003 15:20:55 UTC

This archive was generated by hypermail 2.3.1 : Wednesday, 7 January 2015 14:54:06 UTC