RE: Sync'ing triplestores

So the caller always has to know the identity of the triples bundle to
request replacement of it?
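
To make the question concrete, here is a minimal sketch of the replace step (d) from the message quoted below. It is only an illustration under stated assumptions: `msg_hash`, `Store`, and `apply_replace` are hypothetical names, a SHA-256 over the sorted triples stands in for the digital signature hash, and "same signer" stands in for the signature check (the "higher hierarchical value" case and blank node canonicalization are left out). It is not RDFGrowth's actual API.

```python
import hashlib


def msg_hash(triples):
    """Stand-in identifier for an MSG: SHA-256 over its sorted triples.
    Assumption: the real scheme uses the digital signature hash instead."""
    canonical = "\n".join(" ".join(t) for t in sorted(triples))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class Store:
    def __init__(self, trusted_signers, keep_history=True):
        self.trusted = set(trusted_signers)  # accepted signers, known a priori
        self.keep_history = keep_history     # True ~ d1 (P2P), False ~ d2 (centralized)
        self.msgs = {}                       # hash -> (signer, frozenset of triples)
        self.replaced_by = {}                # old hash -> new hash

    def add(self, signer, triples):
        """Monotonic addition of one MSG; returns its hash, or None if rejected."""
        if signer not in self.trusted:
            return None
        h = msg_hash(triples)
        self.msgs[h] = (signer, frozenset(triples))
        return h

    def apply_replace(self, signer, old_hash, new_triples):
        """Accept a 'replace' only if the caller names a known old MSG
        and signs as the same signer (simplified acceptance rule)."""
        old = self.msgs.get(old_hash)
        if old is None or old[0] != signer:
            return None
        new_hash = msg_hash(new_triples)
        if new_hash == old_hash:
            return old_hash                  # nothing actually changed
        self.add(signer, new_triples)
        self.replaced_by[old_hash] = new_hash
        if not self.keep_history:
            del self.msgs[old_hash]          # d2: drop the old version
        return new_hash

    def current_triples(self):
        """Triples from every MSG that has not been superseded."""
        out = set()
        for h, (_, triples) in self.msgs.items():
            if h not in self.replaced_by:
                out |= triples
        return out
```

In this sketch `apply_replace` cannot do anything without `old_hash`, which is exactly the point of the question: whoever issues the update has to already hold the identifier of the bundle being replaced.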

> -----Original Message-----
> From: Giovanni Tummarello [mailto:giovanni@wup.it]
> Sent: Monday, February 07, 2005 8:47 AM
> To: Joshua Allen; semantic-web@w3.org
> Subject: Re: Sync'ing triplestores
> 
> If you're interested in this specific problem, here is how we do it
> in RDFGrowth, with no intention of saying it is the best or even the
> right way of doing it :-)
> 
> a) updates are monotonic additions only.
> b) statements are "grouped" according to their blank node closures (MSG)
> and signed by using a reification on a single statement composing the
> closure (it's more complicated than this but take this as an explanation)
> c) the digital signature hash is an IFP to the MSG
> d) updates are managed by distributing a new MSG that carries the
> statement "replace" and the indication of the hash of the old MSG.
> Clients decide whether to accept the substitution according to the
> digital signature on the replace MSG: likely they will replace it if the
> signature is the same or has a higher hierarchical value.
> At this point:
> d1) in a pure, strictly monotonic P2P environment like the current
> RDFGrowth, keep the original message as well as the update one.
> d2) in a centralized system, safely delete the old version.
> 
> Sound complicated? You bet.. but all in all fairly solid, nicely
> monotonic so just keep the spammers out (provide a list of accepted
> signatures a priori or some kind of authority about who can speak).
> 90% implemented, look for the announcement sometime rather soon. (But
> if your boss is really interested maybe we can speed things up a bit 8-) )
> 
> Giovanni
> 
> 
> Joshua Allen wrote:
> 
> >>>I've not had a proper search yet, but was wondering if anyone had any
> >>>pointers to approaches/algorithms for keeping separate triplestores in
> >>>sync. Ideally between different implementations - e.g. Jena + Redland.
> >
> >>Sorry, that wasn't very clear - by sync I mean having (potentially
> >>big) models replicated at remote locations.
> >
> >I haven't found any good comprehensive prior art, but I have been
> >thinking about this a lot lately.  The general problem is merging models
> >(since if the models are disjoint, you don't have an issue).  And to
> >merge triples, you have to be able to tell whether two triples are
> >duplicates (or one is meant to replace the other), or are indeed
> >intended to be separate assertions.
> >
> >If you merged only unique s,p,o combinations, you could not handle
> >deletes or updates.  But without using s,p,o as a composite key, you need
> >some other way to identify a triple -- a "context".  Each store could
> >presumably store a URI identifying the source context for each triple,
> >but the context identifier would have to be able to flow through all
> >stores (it couldn't be a store-specific scheme).  And the manner in which
> >you treat the context URI would have to be consistent across all stores.
> >For example, if you have one context URI for a single document
> >containing a hundred triples, what happens when you update a single
> >triple?  You need a way to identify that that single triple should be
> >deleted from the original context and added to a different one.  Even in
> >the simple case (a single change results in the old context being
> >deleted entirely and replaced with a new context) you need a way to
> >communicate deletion from one store to another.  So I am having a hard
> >time envisioning true model merging without some sort of delta encoding
> >syntax that is standardized.
> >
> >
> >
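
The delta idea in the quoted paragraph above can be sketched roughly as follows. This is an illustration only, not a proposed standard syntax: `Quad`, `Delta`, `diff`, and `apply_delta` are hypothetical names, and it assumes every store tags each triple with a shared context URI and that stores can be compared as plain sets.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Quad:
    """A triple plus the context URI it was asserted in (assumed store-independent)."""
    s: str
    p: str
    o: str
    context: str


@dataclass
class Delta:
    """What one store would have to send another: explicit deletes and adds."""
    deletes: set = field(default_factory=set)
    adds: set = field(default_factory=set)


def diff(old, new):
    """Compute the delta that turns the set `old` into the set `new`."""
    return Delta(deletes=old - new, adds=new - old)


def apply_delta(store, delta):
    """Return a new set with the delta applied: deletes first, then adds."""
    return (store - delta.deletes) | delta.adds


# Updating a single triple moves it out of its original context.
before = {
    Quad("ex:a", "ex:p", "1", "ex:doc1"),
    Quad("ex:b", "ex:p", "2", "ex:doc1"),
}
after = {
    Quad("ex:a", "ex:p", "1", "ex:doc1"),
    Quad("ex:b", "ex:p", "3", "ex:doc2"),
}
delta = diff(before, after)
remote = apply_delta(set(before), delta)  # a replica catching up
```

Because the delete is carried explicitly, the replica ends up identical to `after`; a merge keyed only on unique s,p,o would have kept the stale `ex:b ex:p 2` triple alongside the new one.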

Received on Monday, 7 February 2005 19:56:34 UTC