- From: Niko - NextGraph <niko@nextgraph.org>
- Date: Thu, 21 Nov 2024 03:30:59 +0200
- To: public-crdt4rdf@w3.org
- Message-ID: <36db4c47-532e-4444-8ed1-c570cb47ce59@nextgraph.org>
Hello everyone,

In the Matrix room of Solid/Specification, elf Pavlik recently shared 2 links that could be of interest for the topic of CRDTs or, at least to my understanding, for the distribution and federation of RDF data.

https://treecg.github.io/specification/
https://tree.linkeddatafragments.org/linked-data-event-streams/

I'll give below my understanding of those 2 specs. We are lucky to have among us here Hala Skaf-Molli, who took part in the 2 research papers I mention further down.

This "TREE" spec is amazing, and I have been looking for something like that for many years! The spec, to my understanding, is about sharding and distribution of data in a complex network of data repositories, with the capability to search datasets with some parameters. It is very useful when data is distributed.

The Linked Data Event Streams spec (LDES), which comes from the same people and relates to the TREE spec, supports an append-only collection of immutable records. We can see in the examples that they use the concept of `versions` of records that supersede each other in the stream, if needed. Also, if the stream definition itself (the shape, for example) needs to change, a note in the specs says that the new shape should be backward compatible, or that a fork is needed.

Nowhere in those 2 specs is the concept of "merging conflicts" present. They sidestep the question of conflicts and, I suppose, base their conflict resolution on the timestamps that I see everywhere in the given examples. That makes it LWW (last write wins), which is the poorest guarantee you can get and does not really qualify, in my opinion, as a CRDT. But the spec is really interesting for sharding and distribution of data.
In fact it could complement the work done by Pascal and Hala Molli et al. on the problem of source selection and federated queries, which they addressed recently with DeKaloG (https://hal.science/hal-03936036/document) and FedUP (https://hal.science/hal-04538238/document). Those topics are of high importance when we want to consider scalability and global search in a decentralized system.

CRDTs are about automatic conflict resolution, which is related to federation and distribution but is essentially different too, as it concerns updates and their consistency, while what we see here is more concerned with read, search and discoverability patterns.
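
To make the LWW point above concrete, here is a minimal Python sketch of what timestamp-based resolution over an append-only stream of versioned records amounts to. It only illustrates the supposition made above and is not taken from the TREE or LDES specs: the `Member` class, its fields, the `materialize_lww` function and the `ex:doc1` identifier are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Member:
    """One immutable version of a record in an append-only stream (hypothetical model)."""
    entity: str     # identifier of the thing being versioned
    timestamp: str  # ISO 8601 UTC timestamp; same-format strings sort chronologically
    value: str


def materialize_lww(stream):
    """Keep, per entity, the value of the member with the greatest timestamp."""
    latest = {}
    for member in stream:
        current = latest.get(member.entity)
        # Strict comparison: on equal timestamps the first member seen is kept.
        if current is None or member.timestamp > current.timestamp:
            latest[member.entity] = member
    return {entity: member.value for entity, member in latest.items()}


# Two replicas update the same entity concurrently, starting from the same base version.
base = Member("ex:doc1", "2024-11-20T10:00:00Z", "title: Draft")
from_alice = Member("ex:doc1", "2024-11-20T10:05:00Z", "title: Draft v2 (Alice)")
from_bob = Member("ex:doc1", "2024-11-20T10:05:01Z", "title: Draft v2 (Bob)")

state = materialize_lww([base, from_alice, from_bob])
print(state)  # {'ex:doc1': 'title: Draft v2 (Bob)'}
```

Alice's concurrent update is dropped without any merge, which is the weak guarantee described above. With equal timestamps this sketch keeps whichever member arrived first, so two replicas receiving the stream in different orders could diverge unless an extra tie-breaker is added.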
Attachments
- application/pgp-keys attachment: OpenPGP public key
Received on Thursday, 21 November 2024 01:31:25 UTC