- From: Joshua Allen <joshuaa@microsoft.com>
- Date: Mon, 7 Feb 2005 11:56:00 -0800
- To: "Giovanni Tummarello" <giovanni@wup.it>, <semantic-web@w3.org>
So the caller always has to know the identity of the triples bundle to
request replacement of it?

> -----Original Message-----
> From: Giovanni Tummarello [mailto:giovanni@wup.it]
> Sent: Monday, February 07, 2005 8:47 AM
> To: Joshua Allen; semantic-web@w3.org
> Subject: Re: Sync'ing triplestores
>
> If you're interested in this specific problem, here is how we do it
> in RDFGrowth, with no intention of saying it is the best or even the
> right way of doing it :-)
>
> a) updates are monotonic additions only.
> b) statements are "grouped" according to their blank node closures (MSG)
> and signed by using a reification on a single statement composing the
> closure (it's more complicated than this but take this as an explanation)
> c) the digital signature hash is an IFP to the MSG
> d) updates are managed by distributing a new MSG that carries the
> statement "replace" and the indication of the hash of the old MSG.
> Clients decide whether to accept the substitution according to the
> digital signature on the replace MSG: likely they will replace it if the
> signature is the same or has a higher hierarchical value.
> at this point
> d1) in a pure, strictly monotonic P2P environment like the current
> RDFGrowth keep the original message as well as the update one..
> d2) in a centralized system safely delete the old version
>
> Sound complicated? you bet.. but all in all fairly solid, nicely
> monotonic so just keep the spammers out (provide a list of accepted
> signatures a priori or some kind of authority about who can speak)
> 90% implemented, look for the announcement sometime rather soon. (But
> if your boss is really interested maybe we can speed things up a bit 8-)
> )
>
> Giovanni
>
>
> Joshua Allen wrote:
>
> >>>I've not had a proper search yet, but was wondering if anyone had any
> >>>pointers to approaches/algorithms for keeping separate triplestores in
> >>>sync. Ideally between different implementations - e.g. Jena + Redland.
> >>
> >>Sorry, that wasn't very clear - by sync I mean having (potentially
> >>big) models replicated at remote locations.
> >
> >I haven't found any good comprehensive prior art, but I have been
> >thinking about this a lot lately. The general problem is merging models
> >(since if the models are disjoint, you don't have an issue). And to
> >merge triples, you have to be able to tell whether two triples are
> >duplicates (or one is meant to replace the other), or are indeed
> >intended to be separate assertions.
> >
> >If you merged only unique s,p,o combinations, you could not handle
> >deletes or updates. But without using s,p,o as composite key, you need
> >some other way to identify a triple -- a "context". Each store could
> >presumably store a URI identifying the source context for each triple,
> >but the context identifier would have to be able to flow through all
> >stores (it couldn't be a store-specific scheme). And the manner in which
> >you treat context URI would have to be consistent across all stores.
> >For example, if you have one context URI for a single document
> >containing a hundred triples, what happens when you update a single
> >triple? You need a way to identify that that single triple should be
> >deleted from the original context and added to a different one. Even in
> >the simple case (a single change results in the old context being
> >deleted entirely and replaced with new context) you need a way to
> >communicate deletion from one store to another. So I am having a hard
> >time envisioning true model merging without some sort of delta encoding
> >syntax that is standardized.
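
For readers following the thread, here is a rough Python sketch of the
replace-by-hash idea in steps (a) to (d) above. It is not RDFGrowth code,
and several things are assumptions made only for illustration: the Store
class and its method names are hypothetical, a SHA-256 digest over the
sorted triples stands in for the digital signature hash of an MSG, blank
node closures are not computed (each bundle is just a given list of
triples), and "hierarchical value" is modelled as an integer rank per
signer.

```python
import hashlib


def bundle_id(triples):
    """Stand-in bundle identity: SHA-256 over the sorted triples (assumption)."""
    canonical = "\n".join(" ".join(t) for t in sorted(triples))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class Store:
    """Hypothetical store that only ever receives additions and 'replace' bundles."""

    def __init__(self, trusted_rank, keep_history=True):
        self.trusted_rank = trusted_rank  # signer -> rank (the accepted-signers list)
        self.keep_history = keep_history  # True ~ case d1 (P2P), False ~ case d2
        self.bundles = {}                 # bundle id -> (signer, frozenset of triples)
        self.replaced_by = {}             # old bundle id -> new bundle id

    def add(self, signer, triples):
        """Monotonic addition; bundles from unknown signers are ignored."""
        if signer not in self.trusted_rank:
            return None
        bid = bundle_id(triples)
        self.bundles.setdefault(bid, (signer, frozenset(triples)))
        return bid

    def replace(self, signer, old_id, triples):
        """Accept a replacement only from the same or a higher-ranked signer."""
        if old_id not in self.bundles or signer not in self.trusted_rank:
            return None
        old_signer = self.bundles[old_id][0]
        if self.trusted_rank[signer] < self.trusted_rank[old_signer]:
            return None
        new_id = self.add(signer, triples)
        if new_id == old_id:
            return old_id  # identical content, nothing to supersede
        self.replaced_by[old_id] = new_id
        if not self.keep_history:
            del self.bundles[old_id]  # d2: centralized store drops the old version
        return new_id

    def current_triples(self):
        """Triples from every bundle that has not been superseded."""
        live = set()
        for bid, (_, triples) in self.bundles.items():
            if bid not in self.replaced_by:
                live |= triples
        return live
```

Note that in this sketch replace() takes old_id as a required argument, so
whoever issues the update has to name the bundle being superseded. With
keep_history=True the old bundle stays in the store and is only masked by
the replaced_by record, which keeps the store itself additions-only.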
Received on Monday, 7 February 2005 19:56:34 UTC