- From: Joshua Allen <joshuaa@microsoft.com>
- Date: Tue, 8 Feb 2005 10:16:13 -0800
- To: "Giovanni Tummarello" <giovanni@wup.it>, <semantic-web@w3.org>
Yeah, I understand. I don't think this design is really aimed at "merge"
scenarios, then. It's more aimed at permitting people to share a model.
Think of a scenario where two different parties are working on separate
metadata servers and want to merge one another's changes -- when changes
touch disjoint parts of the model, they merge seamlessly (but in any case,
changes could both take place below a bnode, so the identity hash doesn't
help).

> -----Original Message-----
> From: Giovanni Tummarello [mailto:giovanni@wup.it]
> Sent: Tuesday, February 08, 2005 3:09 AM
> To: Joshua Allen; semantic-web@w3.org
> Subject: Re: Sync'ing triplestores
>
> If I understand your question correctly, yes it does; there is no "blind
> imposition" of triples in this sense.
> But anyway, there is no way to replace/delete "bundles" of triples
> containing blank nodes if you can't somehow "identify" them with an IFP
> (in case of a blank node) or a signature hash, as we do in case of blank
> node closures (there might be more than one linked, and other cases).
> One of the nice things about the model we propose is that we're using
> standard RDF constructs (e.g. reifications) rather than relying on
> third-party proposed additions to RDF semantics like named graphs or
> quadruples.
>
> >So the caller always has to know the identity of the triples bundle to
> >request replacement of it?
> >
> >>-----Original Message-----
> >>From: Giovanni Tummarello [mailto:giovanni@wup.it]
> >>Sent: Monday, February 07, 2005 8:47 AM
> >>To: Joshua Allen; semantic-web@w3.org
> >>Subject: Re: Sync'ing triplestores
> >>
> >>If you're interested in this specific problem, here is how we do it
> >>in RDFGrowth, with no intention of saying it is the best or even the
> >>right way of doing it :-)
> >>
> >>a) updates are monotonic additions only.
> >>b) statements are "grouped" according to their blank node closures
> >>(MSG) and signed by using a reification on a single statement composing
> >>the closure (it's more complicated than this, but take this as an
> >>explanation)
> >>c) the digital signature hash is an IFP to the MSG
> >>d) updates are managed by distributing a new MSG that carries the
> >>statement "replace" and the indication of the hash of the old MSG.
> >>Clients decide whether to accept the substitution or not according to
> >>the digital signature on the replace MSG: likely they will replace it
> >>if the signature is the same or has a higher hierarchical value.
> >>At this point:
> >>d1) in a pure, strictly monotonic P2P environment like the current
> >>RDFGrowth, keep the original message as well as the update one
> >>d2) in a centralized system, safely delete the old version
> >>
> >>Sound complicated? You bet.. but all in all fairly solid, nicely
> >>monotonic, so just keep the spammers out (provide a list of accepted
> >>signatures a priori or some kind of authority about who can speak).
> >>90% implemented, look for the announcement sometime rather soon. (But
> >>if your boss is really interested maybe we can speed things up a bit 8-) )
> >>
> >>Giovanni
> >>
> >>Joshua Allen wrote:
> >>
> >>>>>I've not had a proper search yet, but was wondering if anyone had any
> >>>>>pointers to approaches/algorithms for keeping separate triplestores in
> >>>>>sync. Ideally between different implementations - e.g. Jena + Redland.
> >>>>
> >>>>Sorry, that wasn't very clear - by sync I mean having (potentially
> >>>>big) models replicated at remote locations.
> >>>
> >>>I haven't found any good comprehensive prior art, but I have been
> >>>thinking about this a lot lately. The general problem is merging models
> >>>(since if the models are disjoint, you don't have an issue). And to
> >>>merge triples, you have to be able to tell whether two triples are
> >>>duplicates (or one is meant to replace the other), or are indeed
> >>>intended to be separate assertions.
> >>>
> >>>If you merged only unique s,p,o combinations, you could not handle
> >>>deletes or updates. But without using s,p,o as a composite key, you need
> >>>some other way to identify a triple -- a "context". Each store could
> >>>presumably store a URI identifying the source context for each triple,
> >>>but the context identifier would have to be able to flow through all
> >>>stores (it couldn't be a store-specific scheme). And the manner in which
> >>>you treat the context URI would have to be consistent across all stores.
> >>>For example, if you have one context URI for a single document
> >>>containing a hundred triples, what happens when you update a single
> >>>triple? You need a way to identify that that single triple should be
> >>>deleted from the original context and added to a different one. Even in
> >>>the simple case (a single change results in the old context being
> >>>deleted entirely and replaced with a new context) you need a way to
> >>>communicate deletion from one store to another. So I am having a hard
> >>>time envisioning true model merging without some sort of delta encoding
> >>>syntax that is standardized.
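
A minimal sketch of the delta idea from the earliest message quoted above,
assuming ground triples only (no blank nodes) and plain set semantics. The
names Delta, diff, and apply_delta and the ex:/dc: sample data are made up
for illustration; they are not taken from Jena, Redland, or RDFGrowth. It
only tries to show why a union-only merge keyed on s,p,o cannot propagate
an update, while an explicit added/removed delta can.

```python
"""Sketch: set-based delta between two snapshots of ground (bnode-free) triples."""
from typing import FrozenSet, NamedTuple, Set, Tuple

# ASSUMPTION: every term is a ground URI or literal string; blank nodes are excluded.
Triple = Tuple[str, str, str]  # (subject, predicate, object)


class Delta(NamedTuple):
    """Hypothetical delta format: triples to add and triples to remove."""
    added: FrozenSet[Triple]
    removed: FrozenSet[Triple]


def diff(old: Set[Triple], new: Set[Triple]) -> Delta:
    """Compute what changed between two snapshots of the same model."""
    return Delta(added=frozenset(new - old), removed=frozenset(old - new))


def apply_delta(store: Set[Triple], delta: Delta) -> Set[Triple]:
    """Return a new store with the removals applied first, then the additions."""
    return (store - delta.removed) | delta.added


# Hypothetical sample data: the title of ex:doc1 is updated at the source.
before = {("ex:doc1", "dc:title", "Draft"), ("ex:doc1", "dc:creator", "ex:alice")}
after = {("ex:doc1", "dc:title", "Final"), ("ex:doc1", "dc:creator", "ex:alice")}

# A remote replica that also holds an unrelated local triple.
replica = set(before) | {("ex:doc2", "dc:title", "Other")}

# Union-only merge: the stale "Draft" title survives next to "Final".
naive = replica | after

# Delta-based sync: the stale title is removed and the local triple is kept.
delta = diff(before, after)
synced = apply_delta(replica, delta)
```

Blank nodes are left out on purpose: they have no stable identity across
stores, which is the gap the MSG signature hash discussed in this thread is
meant to fill.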
Received on Tuesday, 8 February 2005 18:16:35 UTC