Re: Support for Multiple Graphs and Graph Stores

On Fri, 2012-02-03 at 14:00 -0800, michael wrote:
> On 2/3/2012 12:55 PM, David Robillard wrote:
> > On Fri, 2012-02-03 at 12:25 -0800, michael wrote:
> > [...]
> >> Some revision control terms have already been mentioned by the WG: branches[6][7], trees[7], patches[8][9] and assertions[4][10]. Here is a comparison
> >> of Sandro's g-* terminology[11] with a popular DVCS, Git, and some terms I suggest for RDF graph management (RDFGM):
> >>
> >> g-*     Git         RDFGM          Description
> >> ----------------------------------------------------------------------------
> >> g-text  patch       graph literal  serialized set of RDF statements, triples
[...]
> >> A g-text is the serialized content of an RDF graph[11], aka triples. This is similar to a patch in a revision control system. I prefer the term graph
> >> literal, which is a more accurate description.
[...]
> > A "patch" for RDF is not the same as a graph literal. A patch for RDF
> > would be vaguely similar to PROPPATCH and require at least *two* graphs:
> > the set of triples removed, and the set of triples added.
[...]
> I agree that a patch/diff implies change whereas a graph literal does not. This is a good observation that a patch/diff must specify both additions 
> and deletions. I did not intend to equate the complete description of the new version with a patch. The analogy I was trying to draw was that they 
> both exist at the level of serialized statements.
> 
> Perhaps a better comparison would be to a file, with a patch (Diff) introduced as its own separate entity:
> 
> g-*     Git         RDFGM          Description
> ----------------------------------------------------------------------------
> g-text  file        graph literal  serialized set of RDF statements, triples
> g-snap  blob        Graph          set of RDF statements
>         tree        Dataset        description of one or more graphs/datasets
>         patch       Diff           a changeset
>         commit      Assertion      provenance for a dataset/patch
> g-box   branch      Branch         dataset of assertions / label for assertions
>         repository  Repository     set of graphs and their metadata
>         git         Store          an engine that provides access to repositories
> ...
> Diffs allow small changes to be expressed easily. They describe three, possibly four, datasets/graphs: 1) a base dataset/graph, 2) additions, and 3)
> deletions. Optionally, the diff may also describe a fourth dataset/graph that is the result of the changes. An assertion can then refer to the diff to
> describe the provenance for that limited set of changes, allowing for more fine-grained tracking of provenance data. Diffs, as with graphs and
> datasets, are also immutable.
> ...

Both of these (the "file" analogy and the separate "diff" concept) make
sense to me.
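
As a rough illustration of the diff described above (a sketch only, not
anything the WG has specified): if a graph is treated as an immutable
set of triples, applying a diff is plain set arithmetic. The names
Triple, Graph, Diff and apply are made up for this example, and triples
are simplified to tuples of strings.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple

# Hypothetical simplification: a triple is a (subject, predicate, object)
# tuple of strings; a graph is an immutable set of such triples.
Triple = Tuple[str, str, str]
Graph = FrozenSet[Triple]


@dataclass(frozen=True)  # frozen: a diff is immutable once created
class Diff:
    base: Graph       # 1) the graph the change starts from
    additions: Graph  # 2) triples added
    deletions: Graph  # 3) triples removed

    def apply(self) -> Graph:
        """Return the optional fourth graph: the result of the change."""
        return (self.base - self.deletions) | self.additions


base = frozenset({("ex:a", "ex:knows", "ex:b"), ("ex:a", "ex:name", "Al")})
diff = Diff(
    base=base,
    additions=frozenset({("ex:a", "ex:name", "Alice")}),
    deletions=frozenset({("ex:a", "ex:name", "Al")}),
)
result = diff.apply()
```

Removing before adding means a triple listed in both the deletions and
the additions ends up present in the result. That ordering is a choice
made for the sketch, not something stated in this thread.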

It is interesting to compare diff and assertion.  They seem to be
equivalent in terms of what is described: an assertion is just a diff
with a named "result" graph, as you mentioned:

> Optionally, the diff may also describe a fourth dataset/graph that is
> the result of the changes.

So I would say that either a Diff and an Assertion are the same thing,
or an Assertion is a Diff with this "result" property (which means it
describes the provenance of that result graph) and a plain Diff does
not have that property.
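
A minimal sketch of the second reading, assuming graphs are referred to
by name: one record type, where naming a result graph is what makes a
change an Assertion. The Change class, the is_assertion helper and the
ex: graph names are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Tuple

Triple = Tuple[str, str, str]  # simplified (subject, predicate, object)


@dataclass(frozen=True)
class Change:
    base: str                      # name of the base graph
    additions: FrozenSet[Triple]   # triples added
    deletions: FrozenSet[Triple]   # triples removed
    result: Optional[str] = None   # name of the result graph, if any


def is_assertion(change: Change) -> bool:
    """Second reading: an Assertion is a Diff that names its result."""
    return change.result is not None


plain_diff = Change("ex:g1", frozenset({("ex:a", "ex:p", "ex:b")}), frozenset())
assertion = Change("ex:g1", plain_diff.additions, plain_diff.deletions, "ex:g2")
```

Under the first reading (Diff and Assertion are the same thing) the
helper would simply always return True.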

> Thanks for pointing that out!

You're welcome.

-dr

Received on Friday, 3 February 2012 22:35:03 UTC