- From: Giovanni Tummarello <giovanni@wup.it>
- Date: Mon, 07 Feb 2005 17:47:12 +0100
- To: Joshua Allen <joshuaa@microsoft.com>, semantic-web@w3.org
If you're interested in this specific problem, here is how we do it in
RDFGrowth, with no intention of saying it is the best or even the right
way of doing it :-)

a) updates are monotonic additions only.

b) statements are "grouped" according to their blank node closures
   (MSG) and signed by using a reification on a single statement
   composing the closure (it's more complicated than this, but take
   this as an explanation).

c) the digital signature hash is an IFP (inverse functional property)
   to the MSG.

d) updates are managed by distributing a new MSG that carries the
   statement "replace" and the hash of the old MSG. Clients decide
   whether to accept the substitution according to the digital
   signature on the replace MSG: likely they will replace it if the
   signature is the same or has a higher hierarchical value. At this
   point:

   d1) in a pure, strictly monotonic P2P environment like the current
       RDFGrowth, keep the original message as well as the update one.

   d2) in a centralized system, safely delete the old version.

Sound complicated? You bet.. but all in all fairly solid and nicely
monotonic, so just keep the spammers out (provide a list of accepted
signatures a priori, or some kind of authority about who can speak).

90% implemented, look for the announcement sometime rather soon. (But
if your boss is really interested maybe we can speed things up a bit
8-) )

Giovanni

Joshua Allen wrote:

>>>I've not had a proper search yet, but was wondering if anyone had
>>>any pointers to approaches/algorithms for keeping separate
>>>triplestores in sync. Ideally between different implementations -
>>>e.g. Jena + Redland.
>>
>>Sorry, that wasn't very clear - by sync I mean having (potentially
>>big) models replicated at remote locations.
>
>I haven't found any good comprehensive prior art, but I have been
>thinking about this a lot lately. The general problem is merging
>models (since if the models are disjoint, you don't have an issue).
>And to merge triples, you have to be able to tell whether two triples
>are duplicates (or one is meant to replace the other), or are indeed
>intended to be separate assertions.
>
>If you merged only unique s,p,o combinations, you could not handle
>deletes or updates. But without using s,p,o as composite key, you
>need some other way to identify a triple -- a "context". Each store
>could presumably store a URI identifying the source context for each
>triple, but the context identifier would have to be able to flow
>through all stores (it couldn't be a store-specific scheme). And the
>manner in which you treat context URIs would have to be consistent
>across all stores. For example, if you have one context URI for a
>single document containing a hundred triples, what happens when you
>update a single triple? You need a way to identify that that single
>triple should be deleted from the original context and added to a
>different one. Even in the simple case (a single change results in
>the old context being deleted entirely and replaced with a new
>context) you need a way to communicate deletion from one store to
>another. So I am having a hard time envisioning true model merging
>without some sort of delta encoding syntax that is standardized.
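
As a rough illustration of steps a) to d) in the reply at the top of
this message, here is a minimal Python sketch. It is not RDFGrowth
code. The names MSG, Store, rank and centralized are hypothetical, a
SHA-256 digest over the sorted statements stands in for the digital
signature hash, and the signer name is assumed to have been verified
elsewhere (no real signature checking is done).

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, FrozenSet, Optional, Set, Tuple

Triple = Tuple[str, str, str]


@dataclass(frozen=True)
class MSG:
    """A group of statements plus the name of its (assumed verified) signer."""
    triples: FrozenSet[Triple]
    signer: str
    replaces: Optional[str] = None  # digest of the MSG this one supersedes

    @property
    def digest(self) -> str:
        # Assumption: a plain content hash stands in for the signature hash
        # that identifies the MSG.
        lines = [self.signer, self.replaces or ""]
        lines += [" ".join(t) for t in sorted(self.triples)]
        return hashlib.sha256("\n".join(lines).encode("utf-8")).hexdigest()


class Store:
    def __init__(self, rank: Dict[str, int], centralized: bool = False):
        self.rank = rank                # accepted signers and their level
        self.centralized = centralized  # True: d2 behaviour, False: d1
        self.msgs: Dict[str, MSG] = {}
        self.superseded: Set[str] = set()

    def add(self, msg: MSG) -> bool:
        """Accept or reject an incoming MSG; return True if it was stored."""
        if msg.signer not in self.rank:
            return False  # signer is not on the accepted list
        old = self.msgs.get(msg.replaces) if msg.replaces else None
        if old is not None:
            same = msg.signer == old.signer
            higher = self.rank[msg.signer] > self.rank[old.signer]
            if not (same or higher):
                return False  # not entitled to replace the old MSG
            if self.centralized:
                del self.msgs[old.digest]        # d2: drop the old version
            else:
                self.superseded.add(old.digest)  # d1: keep it, mark it stale
        self.msgs[msg.digest] = msg              # a: additions only
        return True

    def current_triples(self) -> Set[Triple]:
        """Statements from every MSG that has not been superseded."""
        return {
            t
            for d, m in self.msgs.items()
            if d not in self.superseded
            for t in m.triples
        }
```

In the P2P case the old MSG stays in the store and is only marked as
superseded, so the store itself never shrinks. A replace MSG whose
target is unknown to the store is treated here as an ordinary addition,
which is a choice made for this sketch and not something the reply
specifies.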
Received on Monday, 7 February 2005 16:47:38 UTC