- From: Danny Ayers <danny.ayers@gmail.com>
- Date: Tue, 8 Feb 2005 20:05:44 +0100
- To: Joshua Allen <joshuaa@microsoft.com>
- Cc: Giovanni Tummarello <giovanni@wup.it>, semantic-web@w3.org
Thanks Giovanni, Joshua - I'm reading with interest ;-)

btw, one thing I had considered was cheating and using RDBMS-backed
(same toolkit) stores, and keeping them synchronized underneath. It
seems MySQL 5+ will support the kind of multi-master replication I'm
looking for, but right now there only seems to be
master-(readonly)slave. I suspect a fairly naive algorithm on top of
SQL might give acceptable performance, but I doubt whether it would be
much more effort to do something on top of the RDF layer.

Robert Turner suggested using SPARQL, which might be a promising angle
for a general solution. First on my list though is poking around with
RDFGrowth ;-)

On Tue, 8 Feb 2005 10:16:13 -0800, Joshua Allen <joshuaa@microsoft.com> wrote:
>
> Yeah, I understand. I don't think this design is really aimed at
> "merge" scenarios, then. It's more aimed at permitting people to
> share a model.
>
> Think of a scenario where two different parties are working on
> separate metadata servers, and want to merge one another's changes --
> when changes touch disjoint parts of the model, they merge seamlessly
> (but in any case, changes could both take place below a bnode, so the
> identity hash doesn't help).
>
> > -----Original Message-----
> > From: Giovanni Tummarello [mailto:giovanni@wup.it]
> > Sent: Tuesday, February 08, 2005 3:09 AM
> > To: Joshua Allen; semantic-web@w3.org
> > Subject: Re: Sync'ing triplestores
> >
> > If I understand your question correctly, yes it does, there is no
> > "blind imposal" of triples in this sense.
> > But anyway there is no way to replace/delete "bundles" of triples
> > containing blank nodes if you can't somehow "identify" them with an
> > IFP (in case of a blank node) or a signature hash as we do in case
> > of blank node closures (there might be more than one linked, and
> > other cases).
> > One of the nice things of the model we propose is that we're using
> > standard RDF constructs (e.g. reifications) rather than relying on
> > third party proposed additions to RDF semantics like named graphs
> > or quadruples.
> >
> > >So the caller always has to know the identity of the triples
> > >bundle to request replacement of it?
> > >
> > >>-----Original Message-----
> > >>From: Giovanni Tummarello [mailto:giovanni@wup.it]
> > >>Sent: Monday, February 07, 2005 8:47 AM
> > >>To: Joshua Allen; semantic-web@w3.org
> > >>Subject: Re: Sync'ing triplestores
> > >>
> > >>If you're interested in this specific problem, here is how we do
> > >>it in RDFGrowth, with no intention of saying it is the best or
> > >>even the right way of doing it :-)
> > >>
> > >>a) updates are monotonic additions only.
> > >>b) statements are "grouped" according to their blank node closures
> > >>(MSG) and signed by using a reification on a single statement
> > >>composing the closure (it's more complicated than this but take
> > >>this as an explanation)
> > >>c) the digital signature hash is an IFP to the MSG
> > >>d) updates are managed by distributing a new MSG that carries the
> > >>statement "replace" and the indication of the hash of the old MSG.
> > >>Clients decide whether to accept the substitution or not according
> > >>to the digital signature on the replace MSG: likely they will
> > >>replace it if the signature is the same or has a higher
> > >>hierarchical value.
> > >>at this point
> > >>d1) in a pure, strictly monotonic P2P environment like the current
> > >>RDFGrowth keep the original message as well as the update one
> > >>d2) in a centralized system safely delete the old version
> > >>
> > >>Sound complicated? you bet.. but all in all fairly solid, nicely
> > >>monotonic so just keep the spammers out (provide a list of
> > >>accepted signatures a priori or some kind of authority about who
> > >>can speak)
> > >>90% implemented, look for the announcement sometime rather soon.
> > >>(But if your boss is really interested maybe we can speed things
> > >>up a bit 8-) )
> > >>
> > >>Giovanni
> > >>
> > >>Joshua Allen wrote:
> > >>
> > >>>>>I've not had a proper search yet, but was wondering if anyone
> > >>>>>had any pointers to approaches/algorithms for keeping separate
> > >>>>>triplestores in sync. Ideally between different
> > >>>>>implementations - e.g. Jena + Redland.
> > >>>>
> > >>>>Sorry, that wasn't very clear - by sync I mean having
> > >>>>(potentially big) models replicated at remote locations.
> > >>>
> > >>>I haven't found any good comprehensive prior art, but I have been
> > >>>thinking about this a lot lately. The general problem is merging
> > >>>models (since if the models are disjoint, you don't have an
> > >>>issue). And to merge triples, you have to be able to tell whether
> > >>>two triples are duplicates (or one is meant to replace the
> > >>>other), or are indeed intended to be separate assertions.
> > >>>
> > >>>If you merged only unique s,p,o combinations, you could not
> > >>>handle deletes or updates. But without using s,p,o as composite
> > >>>key, you need some other way to identify a triple -- a "context".
> > >>>Each store could presumably store a URI identifying the source
> > >>>context for each triple, but the context identifier would have to
> > >>>be able to flow through all stores (it couldn't be a
> > >>>store-specific scheme). And the manner in which you treat context
> > >>>URI would have to be consistent across all stores.
> > >>>For example, if you have one context URI for a single document
> > >>>containing a hundred triples, what happens when you update a
> > >>>single triple? You need a way to identify that that single triple
> > >>>should be deleted from the original context and added to a
> > >>>different one. Even in the simple case (a single change results
> > >>>in the old context being deleted entirely and replaced with new
> > >>>context) you need a way to communicate deletion from one store to
> > >>>another. So I am having a hard time envisioning true model
> > >>>merging without some sort of delta encoding syntax that is
> > >>>standardized.

--
http://dannyayers.com
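
Editorial sketch, not part of the original message: a minimal illustration of the "delta encoding" idea raised in the quoted thread, assuming every triple is ground (no blank nodes) so that (s, p, o) can serve as the composite key. The names `Delta`, `diff` and `apply_delta` are hypothetical and do not come from Jena, Redland or RDFGrowth. Blank nodes, which the thread identifies as the hard part, are deliberately left out.

```python
from typing import FrozenSet, NamedTuple, Set, Tuple

# A ground triple: (subject, predicate, object), all plain strings.
# Assumption: no blank nodes, so equal tuples mean equal statements.
Triple = Tuple[str, str, str]


class Delta(NamedTuple):
    """Explicit additions and deletions between two versions of a model."""
    added: FrozenSet[Triple]
    removed: FrozenSet[Triple]


def diff(old: Set[Triple], new: Set[Triple]) -> Delta:
    """Return the changes needed to turn `old` into `new`."""
    return Delta(added=frozenset(new - old), removed=frozenset(old - new))


def apply_delta(store: Set[Triple], delta: Delta) -> Set[Triple]:
    """Return a copy of `store` with the removals, then the additions, applied."""
    return (store - delta.removed) | delta.added


if __name__ == "__main__":
    old = {("ex:a", "ex:p", "ex:b"), ("ex:a", "ex:q", "1")}
    new = {("ex:a", "ex:p", "ex:b"), ("ex:a", "ex:q", "2")}
    delta = diff(old, new)

    # A replica holding an extra, unrelated triple keeps it after the update.
    replica = old | {("ex:c", "ex:p", "ex:d")}
    print(sorted(apply_delta(replica, delta)))
```

Carrying the removals explicitly is what lets a delete or update travel from one store to another; a plain union of unique (s, p, o) combinations could only ever add triples.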
Received on Tuesday, 8 February 2005 19:05:45 UTC