
Re: Sync'ing triplestores

From: Leo Sauermann <leo@gnowsis.com>
Date: Tue, 05 Apr 2005 13:09:37 +0100
Message-ID: <42528001.3060202@gnowsis.com>
To: Bill de hÓra <bill.dehora@propylon.com>
CC: semantic-web@w3.org, Danny Ayers <danny.ayers@gmail.com>, joshuaa@microsoft.com


http://www.w3.org/DesignIssues/Diff

but has anybody implemented it yet??
:-)

so I still can't sync my Sesame with my Joseki with my RDF Gateway. 
Doomed we are!
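For what it's worth, the core of the Diff idea is simple when every triple is ground: a delta is just two set complements (deletions and insertions). A toy sketch in Python — the store names and triples below are invented for illustration, and blank nodes (the part that makes real graph diffing expensive) are deliberately ignored:

```python
# Sketch: a graph delta as two set complements, assuming both stores
# hold only ground triples (no blank nodes). Blank nodes are what make
# real RDF graph diffing expensive, so this is the easy case only.

def diff(old, new):
    """Return (deletions, insertions) taking `old` to `new`."""
    return old - new, new - old

# Toy stores: triples as (subject, predicate, object) tuples.
sesame = {
    ("ex:a", "ex:p", "1"),
    ("ex:a", "ex:q", "2"),
}
joseki = {
    ("ex:a", "ex:p", "1"),
    ("ex:a", "ex:q", "3"),
}

deletions, insertions = diff(sesame, joseki)

# Applying the delta brings the first store in sync with the second.
synced = (sesame - deletions) | insertions
assert synced == joseki
```

With blank nodes in play the complement can no longer be computed triple-by-triple; you are into graph-isomorphism territory, which is exactly the expense Bill points at below.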

And it came to pass in those days, 03.04.2005 14:23, that Bill de hÓra wrote:

>
> Joshua Allen:
> "So the real problem of merging RDF stores is in being able to uniquely
> identify chunks of RDF independent of their full content.  It seems the
> options here are very limited, without going into some crazy "merge
> definition language" in the RDFS or OWL.  Even if you have a simple
> RDFS/OWL property which tells you a combination of child tuples which
> uniquely identify the graph, you still have the problem that an update
> replaces the entire graph; when you probably want it to merge only
> properties that have been changed (if I update only the e-mail address,
> and send a graph that has the old postal address, I do not want my
> update to replace the current postal address).  So to accomplish this,
> you need a delta encoding syntax with change tracking (send a statement;
> "update the following triple on the node identified by this key, and
> ignore everything else under that node").  Basically a DML for RDF.  To
> expect all stores to support change tracking and a standardized DML is
> pretty crazy.  We don't even do that in SQL land."
>
> I've come to this very late - Danny mentioned syncing triple stores to 
> me recently as an aside to something else.
>
> The problem with syncing graphs seems to be that, to do it properly, 
> you need to compute the respective graph complements, which could be a 
> very expensive operation.
>
> So, I'll make an assumption: that striving for exact syncing of 
> triplestores is one of those Internet-type fallacies (i.e. along the 
> lines of 'the network is reliable', or 'long-lived transactions'), but 
> at the level of data rather than networking. We have a few such 
> fallacies already for RDF.
>
> I would then lower my expectations to a best effort at sharing new 
> interesting data between agents. The simplest way to do this seems to 
> be for stores to expose a triples feed. That is, a store would publish 
> all new deletes, updates and inserts as a data stream. That way, any 
> other store's agent can subscribe to the feed.
>
> Writing an RDF/XML content model to describe whether the change is an 
> update, delete or insert should be straightforward, modulo that it 
> would be a statement about a statement. But because the usage and 
> intent is so specific, it would not be a problem to license an 
> application to 'lift' the target statement to something asserted. For 
> example, I'm pretty sure I can alter an RDF event model I have for 
> just that purpose (Danny, you've seen that event model before).  Once 
> the change data is described (RDF/XML) and packaged (RSS1.0/Atom), you 
> can think about the delivery protocol. HTTP and XMPP+PubSub come to mind.
>
> The upside of this, aside from looking like a tractable problem, is 
> that subscribers can choose what to update and what not to, and also 
> that conflict resolution is kept local to the stores (again, I would 
> class interoperable resolution protocols as non-workable at the 
> Internet level right now, and maybe forever). It might lack the 
> precision those coming from an enterprise database background would 
> expect or insist upon, but there is a history of failure in getting 
> enterprise approaches to work on the Internet.
>
> cheers
> Bill
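Bill's best-effort feed could be as little as an append-only log of change events that each subscriber replays selectively. A rough sketch, with every name and class invented here (not any real store's API); an "update" is modeled as a delete followed by an insert:

```python
# Sketch of a "triples feed": a store publishes its changes as an
# append-only stream of events, and any subscribing agent applies
# the ones it cares about. All names here are invented for illustration.

from dataclasses import dataclass

@dataclass
class ChangeEvent:
    action: str   # "insert" or "delete"; an update is delete + insert
    triple: tuple # (subject, predicate, object)

class FeedSubscriber:
    """A store-side agent that replays a feed into a local triple set."""

    def __init__(self, wanted=None):
        self.store = set()
        self.wanted = wanted  # optional set of predicates to accept

    def apply(self, event):
        # Subscribers choose what to update and what not to.
        if self.wanted is not None and event.triple[1] not in self.wanted:
            return
        if event.action == "insert":
            self.store.add(event.triple)
        elif event.action == "delete":
            self.store.discard(event.triple)

feed = [
    ChangeEvent("insert", ("ex:leo", "ex:mbox", "leo@gnowsis.com")),
    ChangeEvent("insert", ("ex:leo", "ex:city", "Sometown")),
    ChangeEvent("delete", ("ex:leo", "ex:city", "Sometown")),
]

# This subscriber only cares about mailbox changes.
sub = FeedSubscriber(wanted={"ex:mbox"})
for ev in feed:
    sub.apply(ev)
```

Conflict resolution stays local, as Bill suggests: each subscriber decides its own policy in `apply`, and no cross-store protocol is needed. Serializing `ChangeEvent` as RDF/XML inside an RSS 1.0 or Atom envelope is then a packaging question, not a modelling one.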
Received on Tuesday, 5 April 2005 12:09:51 UTC
