- From: Brent Shambaugh <brent.shambaugh@gmail.com>
- Date: Tue, 7 Apr 2015 18:49:25 -0500
- To: Adrian Hope-Bailie <adrian@hopebailie.com>
- Cc: David Nicol <davidnicol@gmail.com>, Melvin Carvalho <melvincarvalho@gmail.com>, Anders Rundgren <anders.rundgren.net@gmail.com>, Web Payments <public-webpayments@w3.org>
- Message-ID: <CACvcBVrQjBQtCVNMon6EkfVi6nb5Jdj+xtDO6xwdBfbV5n21Lg@mail.gmail.com>
-Brent Shambaugh
Website: bshambaugh.org

On Tue, Apr 7, 2015 at 5:27 PM, Adrian Hope-Bailie <adrian@hopebailie.com> wrote:

> I don't think availability of suitable technology is the problem.
> There are numerous options and numerous deployments of these.
> That is exactly the problem.
>
> A discovery protocol must either pick one datastore or pick many
> datastores and search them all.
> If it searches many of these datastores for the data it is trying to find,
> what order does it follow, and does it stop when it finds its first match
> or does it search them all and then have some rules for picking the most
> correct match?
>
> These are hard problems which today are glossed over by the recommendation
> to "use telehash".
>
> Any clever ideas about how this can be overcome?

I haven't implemented these sorts of things. What immediately comes to mind: (1) swarm intelligence and (2) taking advantage of the semantic nature of the data for clustering.

However, I will depart from this for a second. Telehash adapts the Kademlia DHT. According to Wikipedia, "Kademlia uses a "distance" calculation between two nodes. This distance is computed as the exclusive or <http://en.wikipedia.org/wiki/Exclusive_or> of the two node IDs, taking the result as an integer number <http://en.wikipedia.org/wiki/Integer>." (http://en.wikipedia.org/wiki/Kademlia)

From https://github.com/telehash/telehash.org/blob/master/v2/dht.md :

"Telehash adapts the Kademlia <https://github.com/telehash/telehash.org/blob/master/v2/references.md> Distributed Hash Table for its peer discovery. A "peer" in this document is a single application instance, one unique hashname. Unlike the original Kademlia paper that was a key-value store, there is no arbitrary data stored in the DHT. Peers query the DHT purely to locate other peers, independent of IP or other transient network identifiers. Telehash also departs from Kademlia by using SHA2 256-bit hashes (rather than SHA1 160-bit hashes).
Like any DHT, telehash peers cooperatively store all network information while minimizing per-peer costs. Derived from Kademlia's hash-based addressing and distance calculations, the average number of "nearby" peers will grow logarithmically compared to the total global number of peers. Peers then attempt to keep track of all of the closest peers, and progressively fewer of the farther away peers. This pattern minimizes "degrees of separation" between peers while also minimizing the number of other peers each individual peer must keep track of.

Like Kademlia, telehash measures distance with a bitwise XOR metric which divides the address space into 256 possible partitions, also called k-buckets. Every peer when compared will have a bucket value based on the bit that differs, if the first bit is different the bucket would be 255, and if the entire first byte is the same the bucket for that peer would be 247."

For (2), I would like to find out whether using semantic information from linked data would be useful instead of a bitwise XOR metric. INGA uses semantic information in four layers:

"A peer responds to a query by providing an answer matching the query or by forwarding the query to relevant remote peers. The local peer determines the relevance of a remote peer based on a personal semantic shortcut index. The index is created and maintained in a lazy manner, i.e., by analyzing the queries initiated by the local peer and by analyzing the queries that are routed through the local peer. INGA creates shortcuts on four layers: The content provider layer contains shortcuts to remote peers which have successfully answered a query; the recommender layer stores information about remote peers who have issued a query; the bootstrapping layer maintains shortcuts to well connected remote peers; and the network layer connects to peers on an underlying default network."
A. Loser et al., Semantic Social Overlay Networks (http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.72.7668&rep=rep1&type=pdf)

This leads to a question of using linked data as shortcuts to other peers. How well would it fit into this model? A later part of the paper provides a start:

"Conjunctive queries. Each query may include several predicates, e.g. Select all resources that belong to the topic semantic web and to the topic p2p. Using a common topic hierarchy this query can be rewritten as Find any resource having topics /computer/web/semanticweb and /computer/distributed/TourismTechnology. An exact match approach routes a query only to a peer that matches all predicates of the query using a simple exact match paradigm."

Considering (1), swarm intelligence, I am reminded of Sebastian Koske's thesis. SwarmLinda, mentioned on pages 34-35, allows for self-organization into clusters. The next sections are summarized on page 40:

"In the next sections, swarm-based approaches are introduced, which provide support for typed templates (allowing typed triple retrieval), as they cluster statements semantically and form thematically confined areas within the Triple Space."

Could you combine both? It seems you would want to cluster similar things while providing hints at what might be in other places. Then you could add a DHT to this for storage if you wanted. For the record, SwarmLinda uses a tuple space (http://en.wikipedia.org/wiki/Tuple_space).

>
> On 7 April 2015 at 11:22, David Nicol <davidnicol@gmail.com> wrote:
>
>> use http://www.libtorrent.org/dht_store.html to store a verified
>> ledger. Start by adapting the BTC blockchain to dht_store access.
>> Devise a mechanism for trusting providers of cached ledger query
>> responses.
>>
>>
>> > What is missing is a decentralised data store that can serve as the registry
>> > for these identities. The Credentials CG has proposed Telehash as this
>> > data-store.
>> > The challenge is that one then has to be explicit in defining the discovery
>> > protocol as to which decentralised data store to use.
>>
>> > If someone proposed the namecoin block-chain as an alternative how do we
>> > decide which to use?
>> > Who will be the stewards of this decentralised data store?
>> > Is there an architecture for this data store that would be rubber-stamped by
>> > the W3C as a cornerstone for dependent recommendations?
>> > (Here I am trying to think of an architecture that incentivises participants
>> > to maintain the network assuming that financial incentives aren't practical)
>> >
>
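
As an illustration of the XOR metric quoted from the Kademlia and telehash documents earlier in this message, here is a minimal Python sketch. It assumes 256-bit IDs held as Python integers and numbers buckets by the position of the most significant differing bit, which agrees with the 255 and 247 examples in the telehash text. The function names, and the use of SHA-256 over an arbitrary byte string as a stand-in ID, are illustrative assumptions and not telehash's actual hashname derivation.

```python
import hashlib


def stand_in_id(data: bytes) -> int:
    """Hypothetical helper: a 256-bit integer ID from SHA-256 of some bytes.

    This is only a stand-in for a real telehash hashname.
    """
    return int.from_bytes(hashlib.sha256(data).digest(), "big")


def xor_distance(a: int, b: int) -> int:
    """Kademlia-style distance: bitwise XOR of two IDs, read as an integer."""
    return a ^ b


def bucket_index(a: int, b: int) -> int:
    """Index (0..255) of the most significant bit in which two IDs differ.

    A difference in the very first bit of a 256-bit ID gives 255; if the
    whole first byte matches and the next bit differs, the result is 247.
    """
    distance = xor_distance(a, b)
    if distance == 0:
        raise ValueError("identical IDs have no differing bit")
    return distance.bit_length() - 1


if __name__ == "__main__":
    alice = stand_in_id(b"alice")
    bob = stand_in_id(b"bob")
    print("distance:", hex(xor_distance(alice, bob)))
    print("bucket:", bucket_index(alice, bob))
```

A semantic metric of the kind asked about under (2) would replace `xor_distance` with some similarity measure over the linked data, which is the open question here.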
Received on Tuesday, 7 April 2015 23:49:53 UTC