- From: Evan Prodromou <evan@prodromou.name>
- Date: Thu, 23 Jan 2025 10:38:55 -0500
- To: Adam Sobieski <adamsobieski@hotmail.com>, "Emelia S." <emelia@brandedcode.com>
- Cc: "public-swicg@w3c.org" <public-swicg@w3c.org>
- Message-ID: <48b90625-1f77-45ef-b3a3-d31d7ebe9624@prodromou.name>
Hey, Adam.

So, I'd prefer to use methods where the fact-checking information is
distributed via ActivityPub.

Evan

On 2025-01-14 6:31 p.m., Adam Sobieski wrote:
> Social Web Incubator Community Group,
>
> Hello. I am pleased to share some preliminary brainstorming and ideas
> about decentralized fact-checking and argumentation using P2P
> filesharing networks.
> Hopefully some of the following ideas can be of use for the Fediverse,
> e.g., for the discovery of existing annotations.
>
> Introduction
>
> With respect to sharing Web Annotations, uses of P2P networks have
> been previously explored (Segawa, 2006). Providing users with access
> to these kinds of networks from their Web browsers, today, is
> possible with WebRTC (Werner & Vogt, 2014; Ersson & Siri, 2015).
>
> P2P filesharing networks could be of use for decentralized
> fact-checking and argumentation. Facts or claims could be stored in
> entries, a special kind of file resource.
> By creating and sharing digitally-signed user feedback, notes,
> comments, or annotations with respect to those facts or claims in
> entries, users could express their determinations with respect to the
> veracity of facts or claims and could also present arguments for or
> against them (Bex, Snaith, Lawrence, & Reed, 2014).
> Entries could contain one or more references to paraphrases of content
> from locations on the Fediverse (see: Appendix A). Annotation objects
> from the Fediverse could be indexed and redundantly stored on P2P
> filesharing networks.
>
> Uses of Embedding Vectors
>
> Instead of, or in addition to, using cryptographic hashes to index and
> address content on P2P networks, digitally-signed entries for facts or
> claims could be indexed and addressed using embedding vectors (Zaarour
> & Curry, 2022).
> As considered, entries would be a special kind of file resource where
> their embedding vectors, embedding vectors verifiably for selections
> of other resources' contents, would be stored inside of them (see:
> Appendix A) rather than obtained from processing them with AI models.
> Indexing and addressing entries thusly would allow them to be merged
> or wrapped, e.g., to add paraphrases, digitally signing them at each
> step, without having to reindex them. Modifications, however, would
> result in changes to entries' cryptographic hashes.
> Deep learning can be used to detect and identify sentential
> paraphrases (Zhou, Qiu, Liang, & Acuna, 2022). More elaborate uses of
> language models could be utilized for inquiring and reasoning about
> whether sentences occurring in contexts were paraphrases.
> With respect to fact-checking on the Web, scenarios to consider
> include both fact-checking content which was expressly indicated to be
> a fact or claim by their authors, e.g., using custom elements, and
> fact-checking arbitrary selections of documents' content.
> Explorations with respect to fact-checking arbitrary selections of
> content include the open-source Citation Needed project by the Future
> Audiences team of the Wikimedia Foundation.
>
> The Prompt API
>
> Exploration is underway into providing APIs for accessing language
> models in Web browsers; the Web Machine Learning Working Group is
> developing the Prompt API.
> With access to language models in Web browsers, users might be able to
> obtain embedding vectors for portions of content in Web documents.
> These embedding vectors could be used to search for other content,
> e.g., annotations, including on P2P networks.
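
A brief aside on the search step described in the quoted section above. This is only a minimal sketch, assuming embedding vectors are plain lists of floats and that closeness is measured with cosine similarity; the quoted message does not name a metric. The function names and the toy entry IDs are hypothetical, and a real P2P lookup would presumably route queries between peers rather than scan every entry locally.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction; treat as unrelated
    return dot / (norm_a * norm_b)


def nearest_entries(query, entries, k=3):
    """Rank stored entries by similarity to the query vector.

    `entries` maps a hypothetical entry ID to its stored embedding vector.
    Returns up to k (similarity, entry_id) pairs, most similar first.
    """
    scored = [(cosine_similarity(query, vec), entry_id)
              for entry_id, vec in entries.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    # Toy two-dimensional vectors; real embeddings have many more dimensions.
    stored = {"entry-a": [1.0, 0.0], "entry-b": [0.0, 1.0], "entry-c": [1.0, 1.0]}
    for score, entry_id in nearest_entries([1.0, 0.0], stored, k=2):
        print(entry_id, round(score, 3))
```

Vectors produced by different models are not comparable, which is presumably why the markup in Appendix A records a model identifier next to each vector.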
>
> Custom Elements
>
> HTML5 custom elements could allow facts or claims to be expressed in
> documents, e.g., to add visual indicators near them or enable special
> context menus for them, while specifying values for embedding
> vectors computed for them using AI models (see: Appendix C).
>
> Appendices
>
> Appendix A shows a markup sketch for an entry, a created entry wrapped
> to add a paraphrase to it.
> Appendix B shows that embedding vectors could be added to Magnet URIs
> and Metalinks.
> Appendix C shows that HTML5 custom elements could be used for asserted
> facts or claims which refer to entries on P2P networks by means of one
> or more embedding vectors.
> Appendix D shows an approach involving shortcodes for authors using
> content-management systems to be able to easily add facts or claims to
> their content.
>
> Bibliography
>
> Bex, Floris, Mark Snaith, John Lawrence, and Chris Reed.
> "ArguBlogging: An application for the argument web." /Journal of Web
> Semantics/ 25 (2014): 9-15.
> https://www.sciencedirect.com/science/article/pii/S1570826814000079
> Ersson, Kerstin, and Persson Siri. "Peer-to-peer distribution of web
> content using WebRTC within a web browser." (2015).
> https://www.diva-portal.org/smash/get/diva2:819420/FULLTEXT01.pdf
> Segawa, Osamu. "Web annotation sharing using P2P." In /Proceedings of
> the 15th international conference on World Wide Web/, pp. 851-852.
> 2006.
> http://ra.ethz.ch/CDstore/www2006/devel-www2006.ecs.soton.ac.uk/programme/files/pdf/p45.pdf
> Werner, Max Jonas, and Christian Vogt. "Implementation of a
> browser-based P2P network using WebRTC." /Hamburg/ (2014).
> https://inet.haw-hamburg.de/teaching/ws-2013-14/master-project/Prj1-report-werner-vogt.pdf
> Zaarour, Tarek, and Edward Curry. "SemanticPeer: A distributional
> semantic peer-to-peer lookup protocol for large content spaces at
> internet-scale." /Future Generation Computer Systems/ 132 (2022):
> 239-253.
> https://www.sciencedirect.com/science/article/pii/S0167739X22000590
> Zhou, Chao, Cheng Qiu, Lizhen Liang, and Daniel E. Acuna. "Paraphrase
> identification with deep learning: A review of datasets and methods."
> /arXiv preprint arXiv:2212.06933/ (2022).
> https://arxiv.org/pdf/2212.06933
>
> Appendix A: Sketch of an Entry for a Fact or Claim
>
> <action kind="add-paraphrase">
>   <base>
>     <action kind="create">
>       <base />
>       <time>2024-01-14T00:01:00Z</time>
>       <v id="v-1" model="urn:ai:model:llama:3.2:90B">...</v>
>       <metalink id="source-1">
>         <file name="article1.html">
>           <url>https://www.example1.com/user1/article1.html</url>
>         </file>
>       </metalink>
>       <selection source="source-1">
>         ... <select v="v-1">A sentence.</select> ...
>       </selection>
>       <signature>...</signature>
>     </action>
>   </base>
>   <time>2024-01-14T00:00:00Z</time>
>   <v id="v-2" model="urn:ai:model:llama:3.3:70B">...</v>
>   <metalink id="source-2">
>     <file name="article2.html">
>       <url>https://www.example2.com/user2/article2.html</url>
>     </file>
>   </metalink>
>   <selection source="source-2">
>     ... <select v="v-1 v-2">A paraphrase.</select> ...
>   </selection>
>   <signature>...</signature>
> </action>
>
> Appendix B: Adding Embedding Vectors to Magnet URIs and Metalinks
>
> Embedding vectors could be added to Magnet URIs by means of adding a
> key: xv.
>
> Embedding vectors could be new components of metalinks.
> <metalink xmlns="urn:ietf:params:xml:ns:metalink">
>   <published>2009-05-15T12:23:23Z</published>
>   <file name="example.txt">
>     <url>http://www.example.com/example.txt</url>
>     <vector model="urn:ai:model:llama:3.3:70B">...</vector>
>   </file>
> </metalink>
>
> Appendix C: Custom Elements for Facts or Claims
>
> A custom element could be used to signify an asserted fact or claim,
> referring to an entry on a P2P network by means of embedding vectors
> alongside other information. Via a JavaScript library, and perhaps
> WebRTC, clients could participate in P2P networks and retrieve
> entries, feedback on entries, or both.
>
> Notice that, for the special file type of entries, those embedding
> vectors within them and not of the XML file itself are utilized with
> respect to storing and addressing the resource on P2P networks.
>
> <verifiable-claim see="magnet:?xv=...">Ut enim ad minim veniam, quis
> nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo
> consequat.</verifiable-claim>
>
> Appendix D: Content Authoring with Shortcodes
>
> How might authors easily add facts or claims to their content? With
> respect to popular content-management systems, the syntax for so
> doing could resemble that of existing shortcodes like [quote].
>
> [claim]Ut enim ad minim veniam, quis nostrud exercitation ullamco
> laboris nisi ut aliquip ex ea commodo consequat.[/claim]
>
> During content-publishing processes, authors' content-management
> systems (e.g., Drupal, WordPress) – or configurable plugins or
> extensions for these systems – could handle searching for existing
> paraphrases, adding new facts or claims (if needed) to P2P filesharing
> networks, obtaining the data for use in the see attributes, caching
> these data, and generating markup.
>
> ------------------------------------------------------------------------
> *From:* Emelia S. <emelia@brandedcode.com>
> *Sent:* Monday, January 13, 2025 11:21 AM
> *To:* Evan Prodromou <evan@prodromou.name>
> *Cc:* public-swicg@w3c.org <public-swicg@w3c.org>
> *Subject:* Re: Fact-checking and community notes on the Fediverse
>
> This is already something on the list of things that the ActivityPub
> Trust & Safety Taskforce is working on:
>
> Idea: Annotations / Labeling of content · Issue #4 ·
> swicg/activitypub-trust-and-safety
> <https://github.com/swicg/activitypub-trust-and-safety/issues/4>
>
> The Web Annotations model could work, but the discovery of annotations
> that exist is the hardest part, I've started solving that in
> https://github.com/ThisIsMissEm/annotations-service where I use the
> sha256 hash of the Object ID as the annotation collection ID, giving a
> very simple way to fetch all annotations for a given object.
>
> I do want to investigate what an Annotate activity would look like,
> but I suspect this would just be an announcement of sorts "hey,
> there's this web annotation over here for this target"
>
> Yours,
> Emelia
>
>> On 13 Jan 2025, at 04:23, Evan Prodromou <evan@prodromou.name> wrote:
>>
>> We don't have an easy way for remote actors to annotate content on
>> the Fediverse.
>>
>> The biggest use case for this is to have permissionless fact-checking
>> or community notes. A fact-checking service could annotate a remote
>> content object like a Note or a Video with additional fact-checking
>> information, and compliant clients or servers could show the
>> fact-checking information when showing the Note.
>>
>> I think there are some tricky parts to this structure, which I
>> believe suggests that we should start working on it.
>>
>> Evan
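
The collection-ID scheme Emelia describes in the quoted message can be sketched roughly as follows. This is an illustration, not code from the annotations-service repository: it assumes the Object ID is hashed as its UTF-8 bytes with no normalization and that the digest is hex-encoded. The function names, the base URL, and the URL layout are all hypothetical.

```python
import hashlib


def annotation_collection_id(object_id: str) -> str:
    """Derive a collection ID from an ActivityPub Object ID.

    Assumption: the ID string is hashed exactly as given (UTF-8 bytes)
    and the SHA-256 digest is rendered as lowercase hex.
    """
    return hashlib.sha256(object_id.encode("utf-8")).hexdigest()


def annotation_collection_url(base_url: str, object_id: str) -> str:
    """Build a lookup URL for all annotations on one object.

    The "<base>/<collection id>" layout is a made-up example.
    """
    return f"{base_url.rstrip('/')}/{annotation_collection_id(object_id)}"


if __name__ == "__main__":
    # Both URLs below are placeholders, not real endpoints.
    note_id = "https://social.example/users/alice/notes/1"
    print(annotation_collection_url("https://annotations.example/collections", note_id))
```

Because the ID depends only on the Object ID, any client that knows the object can compute where to look without first asking the origin server. Two spellings of the same ID (for example, with and without a trailing slash) would hash to different collections, so a real service would need some rule about normalization.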
Received on Thursday, 23 January 2025 15:39:10 UTC