- From: Benjamin Goering <ben@bengo.co>
- Date: Wed, 22 Jan 2025 11:00:01 -0800
- To: Adam Sobieski <adamsobieski@hotmail.com>
- Cc: "aaronngray@gmail.com" <aaronngray@gmail.com>, "Emelia S." <emelia@brandedcode.com>, Evan Prodromou <evan@prodromou.name>, "public-swicg@w3c.org" <public-swicg@w3c.org>, public-defacto@w3.org
- Message-ID: <CAN+OhBNpbXw=P0r8ugozNE9=mqLfUjJosJ-bdQwCpOB4TxTHXg@mail.gmail.com>
I find it confusing how the `defacto` CG <https://www.w3.org/groups/cg/defacto/> (mailing list cc'd) describes itself here <https://www.w3.org/groups/cg/defacto/> and on LinkedIn as `Decentralized Fact-checking & Provenance Working Group (DFCP)`, and how the chairs (who are also founders of fact.technology) list 'W3C' and that 'Working Group' on their LinkedIn profiles in a way that 1) makes it look like they work for W3C, 2) makes it look like that community group is a W3C Working Group, when it's not, and 3) through the use of 'Working Group', could imply that the CG is more legitimate than other CGs like https://credweb.org/

Heads up, because at the very least it is confusing and, iirc, there may be some community guidelines about not representing a W3C CG as a WG, and not using the W3C logo in a way that is misleading.

On Mon, Jan 20, 2025 at 5:46 AM Adam Sobieski <adamsobieski@hotmail.com> wrote:

> Aaron,
> All,
>
> Should these broader topics interest you, there is a new *Decentralized Fact-checking & Provenance Organization (DeFacto) Community Group:* https://www.w3.org/community/defacto/ . The Chair of the new Group is interested in blockchain-based solutions [1].
>
>
> Best regards,
> Adam
>
> [1] https://fact.technology/
>
> ------------------------------
> *From:* Aaron Gray <aaronngray@gmail.com>
> *Sent:* Monday, January 20, 2025 12:39 AM
> *To:* Adam Sobieski <adamsobieski@hotmail.com>
> *Cc:* Emelia S. <emelia@brandedcode.com>; Evan Prodromou <evan@prodromou.name>; public-swicg@w3c.org <public-swicg@w3c.org>
> *Subject:* Re: Fact-checking and community notes on the Fediverse
>
>
> Adam,
>
> I am seriously of the opinion that in any complex domain it takes an expert to make the real value judgements. Having said that, detailed dissection of a text, or text from audio, and analysis by some form of grounding may allow analysis.
>
> But you have to remember everything is only a construct, and even if all the facts are correct, we have to remember that even science does not have facts, only theories that are checked against experiment. Given this, what we are actually dealing with are hypothetical constructs, to borrow science's way of analysing the world and applying that.
>
> This puts us in a situation where something might "have all the facts correct" but may not be correct in itself; it's a construct, and it may have been constructed to mislead or may be constructed by someone who is not aligned with reality or suffers from the alignment problem, to borrow from AI. Or they might quite simply not have all the facts.
>
> Now does the fact checker have all the facts, can we even check all the facts, and who delineates the truth in the end? If we claim the ultimate truth and we are not aligned with reality then we are only misleading.
>
> To reiterate, I am seriously of the opinion that in any complex domain it takes an expert. And if an expert system like science and scientists makes the wrong call, either because they are owned, bought, or influenced by politics or circumstance, then the whole system may be devalued by the general public, whoever they are now.
>
> I rest my case: this thing is really complicated and we need to tread carefully; tools can be misused and are a double-edged sword.
>
> Sorry I did not answer your question but stepped back a bit into science and the edge of philosophy, but I think we need to bear in mind the wider context before and as we step forward.
>
> Regards,
>
> Aaron
>
> On Mon, 20 Jan 2025, 02:15 Adam Sobieski, <adamsobieski@hotmail.com> wrote:
>
> Aaron,
>
> Yes, the pandemic did trigger much interest in fact-checking. I don't know whether interest is waning or not or, for that matter, in which situations end-users would choose to make use of any of these features that we're brainstorming and discussing.
>
> Beyond the pandemic and the related topics of the accuracy of information during crises and emergencies, interesting use cases include assuring the accuracy of public-sector speeches, debates, and meetings.
>
> Maybe, someday, there will be real-time fact-checking for orators' debates? Maybe, someday, legislators or their staffers will be able to make use of real-time fact-checking technologies using their smartphones?
>
> P2P-based approaches for annotations might answer some questions that were presented (searching for annotations) while creating yet more questions. For instance, with respect to fact-checking, I'm not yet sure about what the UX would be when a fact or claim were contested, when there were thousands of annotations supporting a fact or claim and thousands opposing it simultaneously. This might display, instead of a green checkmark or a red x, a yellow warning indicator. Mindful of the pandemic and the points that you raised, what sorts of dashboards can be envisioned for end-users to explore contested or disputed facts or claims?
>
> Meanwhile, the *Citation Needed* project [1] presents an entirely different approach to fact-checking, one involving AI and Wikipedia. Which kinds of responses should such a system provide to end-users, I wonder, when it can find content both supporting and opposing facts or claims on Wikipedia? This might segue from fact-checking to argumentation and to hedging, listing alternatives (e.g., true, false) and providing support for each alternative.
>
> Thank you. Any thoughts on these points?
>
>
> Best regards,
> Adam
>
> [1] https://meta.wikimedia.org/wiki/Future_Audiences/Experiment:Citation_Needed
>
> ------------------------------
> *From:* Aaron Gray <aaronngray@gmail.com>
> *Sent:* Sunday, January 19, 2025 6:57 PM
> *To:* Adam Sobieski <adamsobieski@hotmail.com>
> *Cc:* Emelia S. <emelia@brandedcode.com>; Evan Prodromou <evan@prodromou.name>; public-swicg@w3c.org <public-swicg@w3c.org>
> *Subject:* Re: Fact-checking and community notes on the Fediverse
>
> I think a lot of the issues we are dealing with need to be addressed at source and are educational, social, political, nutritional, and drug related.
>
> Putting fact checking on things means :-
>
> a) your fact checking has to be correct, which often it's not.
> b) it has to be objective and not opinionated.
> c) it has to be well researched and well presented to _any_ audience.
> d) it has to be read, understood, and accepted.
>
> All of these are subject to cognitive biases. Wikipedia gives a good long list that all need to be considered :-
>
> https://en.m.wikipedia.org/wiki/List_of_cognitive_biases
>
> Quite frankly I think you are wasting your time; most people don't read the stuff and it's got a reputation for being incorrect whether it is or not. So most of your target audience are either already educated and aware anyway or are not and just ignore it anyway. Most people on social media use emotions over intellect to judge things anyway and are subject to both confirmation bias and an echo-chambered existence.
>
> The problems with COVID-19 for example were :-
> a) most people did not have sufficiently high levels of Vitamin D.
> b) the authorities wanted us to stay in and not get enough sunlight and fresh air
> c) most people drink milk and animal fats. Lactic and animal fats harbour Coronavirus.
> d) most people in ICUs had either comorbidities, were overweight, or had genetic disposition with hACE2 receptors.
> e) were black or Hispanic nurses pushed to the attack surface in ICUs in hospitals on their feet for excessive periods dealing with COVID-19 patients with airborne SARS-CoV-2 viruses in close conditions with insufficient PPE.
> f) the people we were trying to protect were the elderly, people with comorbidities, people with immune conditions, or on immunosuppressants, or had genetic predispositions like the black population with hACE2 alleles.
> g) There are simple ways to help combat mRNA viruses, like being young and having lots of siRNAs in your cell cytoplasm, having sex often and having lots of siRNA in your cellular cytoplasm, taking Vitamin C, D, Alpha Lipoic Acid and Quercetin if you have COVID-19.
>
> Now fact check that, for example; you would not have found out this information without having run a COVID-19 group and/or read all the scientific literature on COVID-19 and SARS-CoV-2. BTW this list is actually a lot longer but you get the idea. Now if you post that list you will get fact checked incorrectly despite it all being well researched, mainly from PubMed-accessible leading peer-reviewed papers.
>
> This is what triggered all the fact checking in the first place.
>
> My 2 cents worth.
>
> Aaron
>
> On Tue, 14 Jan 2025, 23:32 Adam Sobieski, <adamsobieski@hotmail.com> wrote:
>
> Social Web Incubator Community Group,
>
> Hello. I am pleased to share some preliminary brainstorming and ideas about decentralized fact-checking and argumentation using P2P filesharing networks.
> Hopefully some of the following ideas can be of use for the Fediverse, e.g., for the discovery of existing annotations.
>
> Introduction
> With respect to sharing Web Annotations, uses of P2P networks have been previously explored (Segawa, 2006). Providing users with access to these kinds of networks from their Web browsers, today, is possible with WebRTC (Werner & Vogt, 2014; Ersson & Siri, 2015).
> P2P filesharing networks could be of use for decentralized fact-checking and argumentation. Facts or claims could be stored in entries, a special kind of file resource.
> By creating and sharing digitally-signed user feedback, notes, comments, or annotations with respect to those facts or claims in entries, users could express their determinations with respect to the veracity of facts or claims and could also present arguments for or against them (Bex, Snaith, Lawrence, & Reed, 2014).
> Entries could contain one or more references to paraphrases of content from locations on the Fediverse (see: Appendix A). Annotation objects from the Fediverse could be indexed and redundantly stored on P2P filesharing networks.
>
> Uses of Embedding Vectors
> Instead of, or in addition to, using cryptographic hashes to index and address content on P2P networks, digitally-signed entries for facts or claims could be indexed and addressed using embedding vectors (Zaarour & Curry, 2022).
> As considered, entries would be a special kind of file resource where their embedding vectors, embedding vectors verifiably for selections of other resources' contents, would be stored inside of them (see: Appendix A) rather than obtained from processing them with AI models.
> Indexing and addressing entries thusly would allow them to be merged or wrapped, e.g., to add paraphrases, digitally signing them at each step, without having to reindex them. Modifications, however, would result in changes to entries' cryptographic hashes.
> Deep learning can be used to detect and identify sentential paraphrases (Zhou, Qiu, Liang, & Acuna, 2022). More elaborate uses of language models could be utilized for inquiring and reasoning about whether sentences occurring in contexts were paraphrases.
> With respect to fact-checking on the Web, scenarios to consider include both fact-checking content which was expressly indicated to be a fact or claim by its authors, e.g., using custom elements, and fact-checking arbitrary selections of documents' content.
> Explorations with respect to fact-checking arbitrary selections of content include the open-source Citation Needed project by the Future Audiences team of the Wikimedia Foundation.
>
> The Prompt API
> Exploration is underway into providing APIs for accessing language models in Web browsers; the Web Machine Learning Working Group is developing the Prompt API.
> With access to language models in Web browsers, users might be able to obtain embedding vectors for portions of content in Web documents. These embedding vectors could be used to search for other content, e.g., annotations, including on P2P networks.
>
> Custom Elements
> HTML5 custom elements could allow facts or claims to be expressed in documents, e.g., to add visual indicators near them or enable special context menus for them, while specifying values for embedding vectors computed for them using AI models (see: Appendix C).
>
> Appendices
> Appendix A shows a markup sketch for an entry, a created entry wrapped to add a paraphrase to it.
> Appendix B shows that embedding vectors could be added to Magnet URIs and Metalinks.
> Appendix C shows that HTML5 custom elements could be used for asserted facts or claims which refer to entries on P2P networks by means of one or more embedding vectors.
> Appendix D shows an approach involving shortcodes for authors using content-management systems to be able to easily add facts or claims to their content.
>
> Bibliography
> Bex, Floris, Mark Snaith, John Lawrence, and Chris Reed. "ArguBlogging: An application for the argument web." *Journal of Web Semantics* 25 (2014): 9-15. https://www.sciencedirect.com/science/article/pii/S1570826814000079
> Ersson, Kerstin, and Persson Siri. "Peer-to-peer distribution of web content using WebRTC within a web browser." (2015). https://www.diva-portal.org/smash/get/diva2:819420/FULLTEXT01.pdf
> Segawa, Osamu. "Web annotation sharing using P2P." In *Proceedings of the 15th international conference on World Wide Web*, pp. 851-852. 2006. http://ra.ethz.ch/CDstore/www2006/devel-www2006.ecs.soton.ac.uk/programme/files/pdf/p45.pdf
> Werner, Max Jonas, and Christian Vogt. "Implementation of a browser-based P2P network using WebRTC." *Hamburg* (2014). https://inet.haw-hamburg.de/teaching/ws-2013-14/master-project/Prj1-report-werner-vogt.pdf
> Zaarour, Tarek, and Edward Curry. "SemanticPeer: A distributional semantic peer-to-peer lookup protocol for large content spaces at internet-scale." *Future Generation Computer Systems* 132 (2022): 239-253. https://www.sciencedirect.com/science/article/pii/S0167739X22000590
> Zhou, Chao, Cheng Qiu, Lizhen Liang, and Daniel E. Acuna. "Paraphrase identification with deep learning: A review of datasets and methods." *arXiv preprint arXiv:2212.06933* (2022). https://arxiv.org/pdf/2212.06933
>
>
> Appendix A: Sketch of an Entry for a Fact or Claim
>
> <action kind="add-paraphrase">
>   <base>
>     <action kind="create">
>       <base />
>       <time>2024-01-14T00:01:00Z</time>
>       <v id="v-1" model="urn:ai:model:llama:3.2:90B">...</v>
>       <metalink id="source-1">
>         <file name="article1.html">
>           <url>https://www.example1.com/user1/article1.html</url>
>         </file>
>       </metalink>
>       <selection source="source-1">
>         ... <select v="v-1">A sentence.</select> ...
>       </selection>
>       <signature>...</signature>
>     </action>
>   </base>
>   <time>2024-01-14T00:00:00Z</time>
>   <v id="v-2" model="urn:ai:model:llama:3.3:70B">...</v>
>   <metalink id="source-2">
>     <file name="article2.html">
>       <url>https://www.example2.com/user2/article2.html</url>
>     </file>
>   </metalink>
>   <selection source="source-2">
>     ... <select v="v-1 v-2">A paraphrase.</select> ...
>   </selection>
>   <signature>...</signature>
> </action>
>
>
> Appendix B: Adding Embedding Vectors to Magnet URIs and Metalinks
> Embedding vectors could be added to Magnet URIs by means of adding a key: xv.
> Embedding vectors could be new components of metalinks.
>
> <metalink xmlns="urn:ietf:params:xml:ns:metalink">
>   <published>2009-05-15T12:23:23Z</published>
>   <file name="example.txt">
>     <url>http://www.example.com/example.txt</url>
>     <vector model="urn:ai:model:llama:3.3:70B">...</vector>
>   </file>
> </metalink>
>
> Appendix C: Custom Elements for Facts or Claims
> A custom element could be used to signify an asserted fact or claim, referring to an entry on a P2P network by means of embedding vectors alongside other information. Via a JavaScript library, and perhaps WebRTC, clients could participate in P2P networks and retrieve entries, feedback on entries, or both.
> Notice that, for the special file type of entries, those embedding vectors within them and not of the XML file itself are utilized with respect to storing and addressing the resource on P2P networks.
>
> <verifiable-claim see="magnet:?xv=...">Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.</verifiable-claim>
>
> Appendix D: Content Authoring with Shortcodes
> How might authors easily add facts or claims to their content? With respect to popular content-management systems, the syntax for so doing could resemble that of existing shortcodes like [quote].
>
> [claim]Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.[/claim]
>
> During content-publishing processes, authors' content-management systems (e.g., Drupal, WordPress) – or configurable plugins or extensions for these systems – could handle searching for existing paraphrases, adding new facts or claims (if needed) to P2P filesharing networks, obtaining the data for use in the see attributes, caching these data, and generating markup.
>
> ------------------------------
> *From:* Emelia S. <emelia@brandedcode.com>
> *Sent:* Monday, January 13, 2025 11:21 AM
> *To:* Evan Prodromou <evan@prodromou.name>
> *Cc:* public-swicg@w3c.org <public-swicg@w3c.org>
> *Subject:* Re: Fact-checking and community notes on the Fediverse
>
> This is already something on the list of things that the ActivityPub Trust & Safety Taskforce is working on:
>
> Idea: Annotations / Labeling of content · Issue #4 · swicg/activitypub-trust-and-safety <https://github.com/swicg/activitypub-trust-and-safety/issues/4>
>
> The Web Annotations model could work, but the discovery of annotations that exist is the hardest part. I've started solving that in https://github.com/ThisIsMissEm/annotations-service where I use the sha256 hash of the Object ID as the annotation collection ID, giving a very simple way to fetch all annotations for a given object.
>
> I do want to investigate what an Annotate activity would look like, but I suspect this would just be an announcement of sorts: "hey, there's this web annotation over here for this target"
>
> Yours,
> Emelia
>
> On 13 Jan 2025, at 04:23, Evan Prodromou <evan@prodromou.name> wrote:
>
> We don't have an easy way for remote actors to annotate content on the Fediverse.
>
> The biggest use case for this is to have permissionless fact-checking or community notes. A fact-checking service could annotate a remote content object like a Note or a Video with additional fact-checking information, and compliant clients or servers could show the fact-checking information when showing the Note.
>
> I think there are some tricky parts to this structure, which I believe suggests that we should start working on it.
>
> Evan
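
For reference, the lookup scheme Emelia describes in the quoted thread (using the sha256 hash of the Object ID as the annotation collection ID) might look roughly like the Python sketch below. It is only an illustration and not the annotations-service implementation: the function names, the `/collections/` path, and the example.org URLs are assumptions, as are hashing the UTF-8 bytes of the ID and using the lowercase hex digest.

```python
import hashlib


def annotation_collection_id(object_id: str) -> str:
    """Derive a collection ID from an object ID.

    Assumption: the ID is hashed as UTF-8 bytes and rendered as lowercase hex.
    """
    return hashlib.sha256(object_id.encode("utf-8")).hexdigest()


def annotation_collection_url(service_base: str, object_id: str) -> str:
    """Build a URL for the collection holding all annotations on one object.

    Assumption: the '/collections/<id>' layout is hypothetical.
    """
    base = service_base.rstrip("/")
    return f"{base}/collections/{annotation_collection_id(object_id)}"


if __name__ == "__main__":
    # Hypothetical object and service URLs, for illustration only.
    note_id = "https://social.example.org/users/alice/notes/1"
    print(annotation_collection_url("https://annotations.example.org", note_id))
```

Because the collection ID depends only on the object's ID, any client that knows the object can compute where to look without a separate discovery step, which seems to be the property the thread is after.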
Received on Wednesday, 22 January 2025 19:00:18 UTC