Re: blank nodes (once again)

On Tue, Mar 15, 2011 at 5:30 PM, Ivan Shmakov <ivan@main.uusia.org> wrote:

>
>        Treating this particular issue doesn't seem overly difficult.
>        First, we may split the source graph into subgraphs so that each
>        of the subgraphs will contain only those blank nodes that are
>        linked to each other only via arcs and other blank nodes, and
>        only those non-blank nodes that are exactly one arc away from
>        any of the blank nodes contained.  (A procedure similar to the
>        one described in CBD.)
>
>        Then, serializing the resulting subgraphs using a kind of
>        “canonical” representation (which, in particular, imposes an
>        ordered and stable blank nodes naming), and hashing them, the
>        duplicate statements and whole subgraphs could be detected and
>        removed from the store.
>
This procedure doesn't guarantee that the result graph is lean. As an MSG
might be a subgraph of another.

E.g.:

given the graph:

G1: :joe :has [ a :dog ].

and
G2: :joe :has [a :dog; :name "Silvio"].

Just by not duplicating MSGs this would result in

:joe :has [a :dog; :name "Silvio"]; :has [ a :dog ].

which doesn't expresses more than

:joe :has [a :dog; :name "Silvio"].

A couple of years ago I implemented an open source Graph Versioning System
which is based on graph decomposition and thus very similar to what you
describe above, it is still available at
http://sourceforge.net/projects/jena/files/Graph%20Versioning%20System%20GVS/,
the sources are here: http://jena.svn.sourceforge.net/viewvc/jena/gvs/

Cheers,
Reto

Received on Tuesday, 15 March 2011 17:50:16 UTC