RE: [ANN] RDF Delta : change logging and dataset replication. from Reto Gmür on 2018-06-20 (semantic-web@w3.org from June 2018)

From: Reto Gmür <reto@factsmission.com>
Date: Wed, 20 Jun 2018 08:50:51 +0000
To: Stian Soiland-Reyes <soiland-reyes@manchester.ac.uk>, Andy Seaborne <andy@apache.org>, Semantic Web <semantic-web@w3.org>
Message-ID: <VI1P191MB028727340CBB4A085F566D4EB6770@VI1P191MB0287.EURP191.PROD.OUTLOOK.COM>
OK, certainly if the bnode IDs stay stable, things are computationally inexpensive.

But I might revise my 2005 RDF diff tool [1] to generate SPARQL Update statements reflecting the changes in the graph.

Cheers,
Reto




[1] https://lists.w3.org/Archives/Public/semantic-web/2005Dec/0176.html

From: Stian Soiland-Reyes <soiland-reyes@manchester.ac.uk>
Sent: Monday, June 18, 2018 2:46 PM
To: Reto Gmür <reto@factsmission.com>; Andy Seaborne <andy@apache.org>; Semantic Web <semantic-web@w3.org>
Subject: RE: [ANN] RDF Delta : change logging and dataset replication.


From what I understand of https://afs.github.io/rdf-delta/rdf-patch.html#blank-nodes it assumes a “system identifier” that survives multiple patches. This is kind of like an I-know-it's-a-bnode-and-so-should-you skolemization (but where the end result is still a bnode).



I see why you raise this, as there would be challenges if you had federated systems that used RDF patches, as you would need to keep track of which ‘system’ a patch picked its identifiers from. Yes, that could go into “H id” as in https://afs.github.io/rdf-delta/rdf-patch-logs.html






I think a select-pattern-based system that would work with isomorphic graphs would be more general (e.g. such patches could be applied to a variety of stores), but probably harder for an RDF store to generate from a simple transaction log. It could also be more computationally expensive to apply.



As this is an RDF Patch update we don’t need any kind of selection, just to deal with known triples separately.





Perhaps it could work by adding an S(elect) operation and E(xists) within a transaction?



Suggested format:





TX .
S _:1 .
E <http://example.com/person1> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/Person> .
E <http://example.com/person1> <http://schema.org/affiliation> _:1 .
E _:1 <http://schema.org/url> <http://example.com/org1> .
D _:1 <http://schema.org/name> "Fred's Fish House" .
A _:1 <http://schema.org/name> "Fred's Soup House" .
TC .



(Using schema.org as an example, as it relies a lot on bnodes.)



Here we (S)elect _:1 as a blank node ID to be bound within this transaction (_:1 is no longer a system identifier).



To restrict which bnode we are talking about, the store would need to match all of the E(xists) statements. Any non-selected _: identifiers there are NOT free; they are still interpreted as ‘system identifiers’. You can, however, add multiple S(elections).



Here the transaction would fail if any of the E triples/quads fail to exist, or if they give multiple bindings for _:1. I don’t think it would be appropriate for the RDF Patch format to do wildcard bnode selections, e.g. “Delete all bnodes that are organizations..”.



It is not a requirement that every selected bnode is used in A/D, although it would be silly if none of them were used. (This permits you to use intermediate bnodes in the E selection.)





It would have to be inside a transaction because such patches are not necessarily idempotent, e.g. the A/D operations might be doing something that breaks the E query and so you can’t run it again.
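
A minimal Python sketch of the matching and failure rules described above, not an RDF Patch implementation. Everything in it is an assumption made for illustration: the store is a plain set of string triples, blank nodes carry made-up system identifiers such as _:b7, the first E row is read as an rdf:type statement, a selected label may only bind to a blank node, and the names bind_selected and apply_block are hypothetical.

```python
RDF_TYPE = "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>"
PERSON1 = "<http://example.com/person1>"
AFFILIATION = "<http://schema.org/affiliation>"
URL = "<http://schema.org/url>"
NAME = "<http://schema.org/name>"
ORG1 = "<http://example.com/org1>"


class PatchError(Exception):
    """The E(xists) rows did not identify exactly one binding."""


def extend(binding, pattern, triple, selected):
    """Return a copy of `binding` extended so `pattern` matches `triple`, else None."""
    result = dict(binding)
    for term, value in zip(pattern, triple):
        if term in selected:
            # Assumption: a selected label may only stand for a blank node.
            if not value.startswith("_:") or result.setdefault(term, value) != value:
                return None
        elif term != value:
            # Everything else, including non-selected _: labels, must match literally.
            return None
    return result


def bind_selected(store, selected, exists):
    """Bind every selected label, failing on zero or several solutions."""
    bindings = [{}]
    for pattern in exists:
        candidates = (extend(old, pattern, t, selected) for old in bindings for t in store)
        bindings = [b for b in candidates if b is not None]
    unique = {frozenset(b.items()) for b in bindings}
    if len(unique) != 1:
        raise PatchError(f"expected exactly one binding, found {len(unique)}")
    binding = dict(next(iter(unique)))
    if set(selected) - set(binding):
        raise PatchError("a selected label does not occur in any E row")
    return binding


def apply_block(store, selected, exists, deletes, adds):
    """Return the patched store; the input set is left untouched on failure."""
    binding = bind_selected(store, selected, exists)

    def ground(row):
        return tuple(binding.get(term, term) for term in row)

    return (set(store) - {ground(r) for r in deletes}) | {ground(r) for r in adds}


# The store knows the organisation under the system identifier _:b7.
store = {
    (PERSON1, RDF_TYPE, "<http://schema.org/Person>"),
    (PERSON1, AFFILIATION, "_:b7"),
    ("_:b7", URL, ORG1),
    ("_:b7", NAME, '"Fred\'s Fish House"'),
}
selected = {"_:1"}
exists = [
    (PERSON1, RDF_TYPE, "<http://schema.org/Person>"),
    (PERSON1, AFFILIATION, "_:1"),
    ("_:1", URL, ORG1),
]
deletes = [("_:1", NAME, '"Fred\'s Fish House"')]
adds = [("_:1", NAME, '"Fred\'s Soup House"')]

patched = apply_block(store, selected, exists, deletes, adds)
```

In this sketch the example block can be replayed, because its E rows never mention the name. If the old name were also listed as an E row, a second run would find no binding and fail, which is the non-idempotent case mentioned above.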



My proposal would presumably be fairly simple to translate to SPARQL updates.
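
As a rough sketch of what that translation could look like, the Python below turns one S/E/D/A block into a SPARQL 1.1 DELETE/INSERT/WHERE request. The function name to_sparql_update, the ?sel_ variable prefix and the reading of the first E row as an rdf:type statement are assumptions for illustration. Two caveats: a plain SPARQL Update does not fail when the WHERE clause matches nothing or matches several times, so the "exactly one binding" rule would need a separate check, and non-selected blank node labels are rejected here because SPARQL has no portable way to name a store's existing blank node.

```python
import re

RDF_TYPE = "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>"
PERSON1 = "<http://example.com/person1>"
NAME = "<http://schema.org/name>"


def to_sparql_update(selected, exists, deletes, adds):
    """Render one S/E/D/A block as a SPARQL Update string (hypothetical helper)."""

    def term(t):
        if t in selected:
            # Selected labels become variables; strip characters not valid in a name.
            return "?sel_" + re.sub(r"\W", "_", t[2:])
        if t.startswith("_:"):
            raise ValueError(f"non-selected blank node {t} cannot be addressed in SPARQL")
        return t  # IRIs and literals are assumed to be in N-Triples syntax already

    def block(rows):
        return "\n".join("  " + " ".join(term(t) for t in row) + " ." for row in rows)

    # Keep the variables restricted to blank nodes, as in the S(elect) proposal.
    filters = "\n".join(f"  FILTER(isBlank({term(s)}))" for s in sorted(selected))
    lines = ["DELETE {", block(deletes), "}",
             "INSERT {", block(adds), "}",
             "WHERE {", block(exists), filters, "}"]
    return "\n".join(lines)


selected = {"_:1"}
exists = [
    (PERSON1, RDF_TYPE, "<http://schema.org/Person>"),
    (PERSON1, "<http://schema.org/affiliation>", "_:1"),
    ("_:1", "<http://schema.org/url>", "<http://example.com/org1>"),
]
deletes = [("_:1", NAME, '"Fred\'s Fish House"')]
adds = [("_:1", NAME, '"Fred\'s Soup House"')]

update = to_sparql_update(selected, exists, deletes, adds)
print(update)
```

Because the generated request only uses variables and ground terms, it could in principle be sent to any store holding an isomorphic graph, at the cost of evaluating the WHERE pattern on each application.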



--
Stian Soiland-Reyes, eScience Lab
School of Computer Science, The University of Manchester
http://orcid.org/0000-0001-9842-9718




From: Reto Gmür <reto@factsmission.com>
Sent: 18 June 2018 06:55
To: Andy Seaborne <andy@apache.org>; Semantic Web <semantic-web@w3.org>
Subject: RE: [ANN] RDF Delta : change logging and dataset replication.


Hi Andy

I'm curious: does this system rely on persistent blank node IDs, or can it generate SPARQL Update statements that can be applied to any isomorphic graph?

Cheers,
Reto

> -----Original Message-----
> From: Andy Seaborne <andy@apache.org>
> Sent: Friday, June 15, 2018 6:36 PM
> To: Semantic Web <semantic-web@w3.org>
> Subject: [ANN] RDF Delta : change logging and dataset replication.
>
> RDF Delta is a system for recording and publishing changes to RDF Datasets. It
> can be used to create replicas.
>
> https://afs.github.io/rdf-delta/
>
> It is built on top of patches and logs which record the changes made to the data.
>
> https://afs.github.io/rdf-delta/rdf-patch.html
>
> One use case is running multiple sync'ed Apache Jena Fuseki servers, for high
> availability or for a request-scalable publishing solution:
>
> https://afs.github.io/rdf-delta/ha-fuseki.html
>
> The current version is 0.4.0.
>
>      Andy
Received on Wednesday, 20 June 2018 08:51:18 UTC