Re: distribution and federation

On this topic, elf Pavlik reminded me that Pieter was one of the first 
to join this group, and that he would surely like to introduce LDES 
before we dig further into this very interesting subject!

On 21/11/24 3:30 am, Niko - NextGraph wrote:
>
> Hello everyone,
>
> In the Matrix room of Solid/Specification, elf Pavlik recently shared 
> 2 links that could be of interest for the topic of CRDTs or, at least 
> as I understand them, for the distribution and federation of RDF data.
>
> https://treecg.github.io/specification/
>
> https://tree.linkeddatafragments.org/linked-data-event-streams/
>
> Below is my understanding of those 2 specs.
>
> We are lucky to have among us Hala Skaf-Molli, who took part in the 
> 2 research papers I mention further down.
>
> This "TREE" spec is amazing, and I have been looking for something 
> like that for many years!
>
> This spec, to my understanding, is about sharding and distribution of 
> data in a complex network of data repositories, with the ability to 
> search datasets by some parameters. It is very useful when data is 
> distributed. The Linked Data Event Streams spec (LDES), which comes 
> from the same people and relates to the TREE spec, supports an 
> append-only collection of immutable records. We can see in the 
> examples that they use the concept of `versions` of records that 
> supersede each other in the stream, if needed.
> Also, if the stream definition itself (the shape, for example) needs 
> to change, a note in the specs says that the new shape should be 
> backward compatible, or that a fork is needed.
>
> Nowhere in those 2 specs is the concept of "merging conflicts" 
> present. They sidestep the question of conflicts and, I suppose, base 
> their conflict resolution on the timestamps that I see everywhere in 
> the given examples. That makes it LWW (last write wins), which is 
> the poorest guarantee you can get and does not really qualify, in my 
> opinion, as a CRDT.
> But the spec is really interesting when it comes to sharding and 
> distribution of data.
>
> In fact, it could complement the work done by Pascal and Hala Molli et 
> al. on the problem of source selection and federated queries, which 
> they addressed recently with DeKaloG 
> https://hal.science/hal-03936036/document and FedUP 
> https://hal.science/hal-04538238/document

>
> Those topics are of high importance when we want to consider 
> scalability and global search in a decentralized system.
>
> CRDT is about automatic conflict resolution, which is related to 
> federation and distribution but is essentially different too, as 
> it concerns updates and their consistency, while what we see here is 
> more concerned with read, search and discoverability patterns.
>
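
To make the LWW reading in the quoted message concrete, here is a minimal Python sketch of resolving an append-only stream of immutable, versioned records purely by timestamp. The `Member` class, its field names and the `latest_versions` helper are hypothetical and chosen only for illustration; they are not taken from the TREE or LDES specs, and the timestamp-based resolution is the supposition made above, not something the specs are known to prescribe.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Member:
    """One immutable record in the append-only stream (hypothetical shape)."""
    version_id: str       # identifier of this particular version
    entity_id: str        # the entity that this version describes
    timestamp: datetime   # when this version was produced
    value: str


def latest_versions(stream):
    """Last-write-wins view: per entity, keep the version with the newest timestamp."""
    latest = {}
    for member in stream:
        current = latest.get(member.entity_id)
        # On equal timestamps the version seen first is kept:
        # LWW alone cannot order truly concurrent writes.
        if current is None or member.timestamp > current.timestamp:
            latest[member.entity_id] = member
    return latest


# The stream only grows; a newer version supersedes an older one
# without the older record ever being modified or removed.
stream = [
    Member("a#v1", "a", datetime(2024, 11, 20, 10, 0), "draft"),
    Member("b#v1", "b", datetime(2024, 11, 20, 11, 0), "hello"),
    Member("a#v2", "a", datetime(2024, 11, 21, 9, 0), "final"),
]
view = latest_versions(stream)
print(view["a"].value)
```

Nothing is merged here: when two versions of the same entity exist, the older one is simply shadowed, which is why this gives a weaker guarantee than a CRDT that combines concurrent updates.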

Received on Thursday, 21 November 2024 01:39:52 UTC