Re: Fact-checking and community notes on the Fediverse from Benjamin Goering on 2025-01-22 (public-defacto@w3.org from January 2025)

From: Benjamin Goering <ben@bengo.co>
Date: Wed, 22 Jan 2025 11:00:01 -0800
To: Adam Sobieski <adamsobieski@hotmail.com>
Cc: "aaronngray@gmail.com" <aaronngray@gmail.com>, "Emelia S." <emelia@brandedcode.com>, Evan Prodromou <evan@prodromou.name>, "public-swicg@w3c.org" <public-swicg@w3c.org>, public-defacto@w3.org
Message-ID: <CAN+OhBNpbXw=P0r8ugozNE9=mqLfUjJosJ-bdQwCpOB4TxTHXg@mail.gmail.com>
I find it confusing how the `defacto` CG
<https://www.w3.org/groups/cg/defacto/> (mailing list cc'd) describes
itself here <https://www.w3.org/groups/cg/defacto/> and on LinkedIn as
`Decentralized Fact-checking & Provenance Working Group (DFCP)` and the
chairs (who are also founders of fact.technology) list 'W3C' and that
'Working Group' on their LinkedIns in a way that 1) makes it look like they
work for W3C and 2) makes it look like that community group is a W3C
Working Group, when it's not and 3) the use of 'Working Group' could imply
that CG is more legitimate than other CGs like https://credweb.org/

Heads up, because at the very least it is confusing and, iirc, there may be
some community guidelines about not representing a W3C CG as a WG, and not
using the W3C logo in a way that is misleading.

On Mon, Jan 20, 2025 at 5:46 AM Adam Sobieski <adamsobieski@hotmail.com>
wrote:

> Aaron,
> All,
>
> Should these broader topics interest you, there is a new *Decentralized
> Fact-checking & Provenance Organization (DeFacto) Community Group: *
> https://www.w3.org/community/defacto/ . The Chair of the new Group is
> interested in blockchain-based solutions [1].
>
>
> Best regards,
> Adam
>
> [1] https://fact.technology/
>
> ------------------------------
> *From:* Aaron Gray <aaronngray@gmail.com>
> *Sent:* Monday, January 20, 2025 12:39 AM
> *To:* Adam Sobieski <adamsobieski@hotmail.com>
> *Cc:* Emelia S. <emelia@brandedcode.com>; Evan Prodromou <
> evan@prodromou.name>; public-swicg@w3c.org <public-swicg@w3c.org>
> *Subject:* Re: Fact-checking and community notes on the Fediverse
>
>
> Adam,
>
> I am seriously of the opinion that in any complex domain it takes an expert to
> make the real value judgements. Having said that, detailed dissection of a
> text (or of text from audio) and analysis by some form of grounding may allow
> analysis.
>
> But you have to remember everything is only a construct and, even if all
> the facts are correct, we have to remember that even science does not have
> facts, only theories that are checked against experiment. Given this,
> what we are actually dealing with are hypothetical constructs, to
> borrow science's way of analysing the world and apply it.
>
> This puts us in a situation where something might "have all the facts
> correct" but may not be correct in itself, it's a construct, and it may
> have been constructed to mislead or may be constructed by someone who is
> not aligned with reality or suffers from the alignment problem, to borrow
> from AI. Or they might quite simply not have all the facts.
>
> Now, does the fact checker have all the facts? Can we even check all the
> facts? And who delineates the truth in the end? If we claim the ultimate
> truth and we are not aligned with reality, then we are only misleading.
>
> To reiterate, I am seriously of the opinion that in any complex domain it takes
> an expert. And if an expert system like science and scientists makes the
> wrong call, because they are owned, bought, or influenced by
> politics or circumstance, then the whole system may be devalued by the
> general public, whoever they are now.
>
> I rest my case: this thing is really complicated and we need to tread
> carefully. Tools can be misused and are a double-edged sword.
>
> Sorry I did not answer your question but stepped back a bit into science
> and the edge of philosophy, but I think we need to bear in mind the wider
> context before and as we step forward.
>
> Regards,
>
> Aaron
>
> On Mon, 20 Jan 2025, 02:15 Adam Sobieski, <adamsobieski@hotmail.com>
> wrote:
>
> Aaron,
>
> Yes, the pandemic did trigger much interest in fact-checking. I don't know
> whether interest is waning or not or, for that matter, in which situations
> end-users would choose to make use of any of these features that we're
> brainstorming and discussing.
>
> Beyond the pandemic and the related topics of the accuracy of information
> during crises and emergencies, interesting use cases include assuring the
> accuracy of public-sector speeches, debates, and meetings.
>
> Maybe, someday, there will be real-time fact-checking for orators'
> debates? Maybe, someday, legislators or their staffers will be able to make
> use of real-time fact-checking technologies using their smartphones?
>
> P2P-based approaches for annotations might answer some questions that were
> presented (searching for annotations) while creating yet more questions.
> For instance, with respect to fact-checking, I'm not yet sure about what
> the UX would be when a fact or claim were contested, when there were
> thousands of annotations supporting a fact or claim and thousands opposing
> it simultaneously. This might display, instead of a green checkmark or a
> red x, a yellow warning indicator. Mindful of the pandemic and the points
> that you raised, what sorts of dashboards can be envisioned for end-users to
> explore contested or disputed facts or claims?
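As a rough illustration of the green / red / yellow idea above, here is a minimal sketch of how a client might pick an indicator from annotation counts. The thresholds (`min_annotations`, `contested_margin`) and all names are assumptions made for this example and do not come from any proposal in this thread.

```python
from enum import Enum


class Indicator(Enum):
    SUPPORTED = "green checkmark"
    DISPUTED = "red x"
    CONTESTED = "yellow warning"
    UNKNOWN = "no indicator"


def choose_indicator(supporting: int, opposing: int,
                     min_annotations: int = 10,
                     contested_margin: float = 0.2) -> Indicator:
    """Pick a display indicator from counts of supporting and opposing annotations.

    Both thresholds are hypothetical defaults chosen for illustration only.
    """
    total = supporting + opposing
    if total < min_annotations:
        # Too little feedback to show anything meaningful.
        return Indicator.UNKNOWN
    # Share of annotations that support the claim, in [0, 1].
    support_share = supporting / total
    # Close to an even split: treat the claim as contested.
    if abs(support_share - 0.5) <= contested_margin / 2:
        return Indicator.CONTESTED
    return Indicator.SUPPORTED if support_share > 0.5 else Indicator.DISPUTED
```

With these defaults, thousands of annotations on each side (say 3000 for and 2900 against) would yield the yellow warning rather than a checkmark or an x. Raw counts ignore who is annotating, so a real design would presumably also need to weigh reputation or signatures.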
>
> Meanwhile, the *Citation Needed* project [1] presents an entirely
> different approach to fact-checking, one involving AI and Wikipedia. Which
> kinds of responses should such a system provide to end-users, I wonder,
> when it can find content both supporting and opposing facts or claims on
> Wikipedia? This might segue from fact-checking to argumentation and to
> hedging, listing alternatives (e.g., true, false) and providing support for
> each alternative.
>
> Thank you. Any thoughts on these points?
>
>
> Best regards,
> Adam
>
> [1]
> https://meta.wikimedia.org/wiki/Future_Audiences/Experiment:Citation_Needed
>
> ------------------------------
> *From:* Aaron Gray <aaronngray@gmail.com>
> *Sent:* Sunday, January 19, 2025 6:57 PM
> *To:* Adam Sobieski <adamsobieski@hotmail.com>
> *Cc:* Emelia S. <emelia@brandedcode.com>; Evan Prodromou <
> evan@prodromou.name>; public-swicg@w3c.org <public-swicg@w3c.org>
> *Subject:* Re: Fact-checking and community notes on the Fediverse
>
> I think a lot of the issues we are dealing with need to be addressed
> at source and are educational, social, political, nutritional, and drug
> related.
>
> Putting fact checking on things means :-
>
> a) your fact checking has to be correct, which often it's not.
> b) it has to be objective and not opinionated.
> c) it has to be well researched and well presented to _any_ audience.
> d) it has to be read, understood, and accepted.
>
> All of these are subject to cognitive biases. Wikipedia gives a good long
> list that all need to be considered :-
>
> https://en.m.wikipedia.org/wiki/List_of_cognitive_biases
>
> Quite frankly I think you are wasting your time: most people don't read the
> stuff and it's got a reputation for being incorrect whether it is or not.
> So most of your target audience are either already educated and aware
> anyway or are not and just ignore it anyway. Most people on social media
> use emotions over intellect to judge things anyway and are subject to both
> confirmation bias and an echo chambered existence.
>
> The problems with COVID-19 for example were :-
> a) most people did not have sufficiently high levels of Vitamin D.
> b) the authorities wanted us to stay in and not get enough sunlight and
> fresh air
> c) most people drink milk and animal fats. Lactic and animal fats
> harbour Coronavirus.
> d) most people in ICUs either had comorbidities, were overweight, or had
> a genetic disposition with hACE2 receptors.
> e) black or Hispanic nurses were pushed to the attack surface in ICUs in
> hospitals, on their feet for excessive periods dealing with COVID-19
> patients with airborne SARS-CoV-2 viruses in close conditions with
> insufficient PPE.
> f) the people we were trying to protect were the elderly, people with
> comorbidities, people with immune conditions or on immunosuppressants, or
> people with genetic predispositions, like the black population with hACE2 alleles.
> g) There are simple ways to help combat mRNA viruses, like being young and
> having lots of siRNAs in your cell cytoplasm, having sex often and having
> lots of siRNA in your cellular cytoplasm, taking Vitamin C, D, Alpha Lipoic
> Acid and Quercetin if you have COVID-19.
>
> Now fact-check that, for example. You would not have found out this
> information without having run a COVID-19 group and/or read all the
> scientific literature on COVID-19 and SARS-CoV-2. BTW this list is actually
> a lot, lot longer, but you get the idea. Now if you post that list you will
> get fact-checked incorrectly despite it all being well researched, mainly
> from PubMed-accessible leading peer-reviewed papers.
>
> This is what triggered all the fact checking in the first place.
>
> My 2 cents worth.
>
> Aaron
>
> On Tue, 14 Jan 2025, 23:32 Adam Sobieski, <adamsobieski@hotmail.com>
> wrote:
>
> Social Web Incubator Community Group,
>
> Hello. I am pleased to share some preliminary brainstorming and ideas
> about decentralized fact-checking and argumentation using P2P filesharing
> networks.
> Hopefully some of the following ideas can be of use for the Fediverse,
> e.g., for the discovery of existing annotations.
>
> Introduction
> With respect to sharing Web Annotations, uses of P2P
> networks have been previously explored (Segawa, 2006). Providing users with
> access to these kinds of networks from their Web browsers, today, is
> possible with WebRTC (Werner & Vogt, 2014; Ersson & Persson, 2015).
> P2P filesharing networks could be of use for decentralized fact-checking
> and argumentation. Facts or claims could be stored in entries, a special
> kind of file resource.
> By creating and sharing digitally-signed user feedback, notes, comments,
> or annotations with respect to those facts or claims in entries, users
> could express their determinations with respect to the veracity of facts or
> claims and could also present arguments for or against them (Bex, Snaith,
> Lawrence, & Reed, 2014).
> Entries could contain one or more references to paraphrases of content
> from locations on the Fediverse (see: Appendix A). Annotation objects from
> the Fediverse could be indexed and redundantly stored on P2P filesharing
> networks.
> Uses of Embedding Vectors
> Instead of, or in addition to, using cryptographic hashes to index and
> address content on P2P networks, digitally-signed entries for facts or
> claims could be indexed and addressed using embedding vectors (Zaarour &
> Curry, 2022).
> As considered, entries would be a special kind of file resource where
> their embedding vectors, which are verifiably those of selections of
> other resources' contents, would be stored inside of them (see: Appendix A)
> rather than obtained from processing the entries with AI models.
> Indexing and addressing entries thusly would allow them to be merged or
> wrapped, e.g., to add paraphrases, digitally signing them at each step,
> without having to reindex them. Modifications, however, would result in
> changes to entries' cryptographic hashes.
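The contrast between the two addressing schemes can be shown with a small sketch. Everything here is an assumption for illustration: the `Entry` type, the `wrap_with_paraphrase` helper, the two-dimensional toy vectors, and the use of SHA-256 as the cryptographic hash. The point is only that wrapping an entry changes its content hash, while the vector it was originally indexed under is still carried inside the wrapped entry.

```python
from __future__ import annotations

import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    xml: str                                  # serialized entry, signatures included
    vectors: tuple[tuple[float, ...], ...]    # embedding vectors stored inside the entry


def content_hash(entry: Entry) -> str:
    """Hash-based address: depends on every byte of the serialized entry."""
    return hashlib.sha256(entry.xml.encode("utf-8")).hexdigest()


def wrap_with_paraphrase(entry: Entry, paraphrase_xml: str,
                         new_vector: tuple[float, ...]) -> Entry:
    """Wrap an existing entry in an add-paraphrase action (hypothetical helper)."""
    wrapped_xml = (
        '<action kind="add-paraphrase">'
        f"<base>{entry.xml}</base>{paraphrase_xml}"
        "</action>"
    )
    # The original vectors are kept, and the paraphrase's vector is appended.
    return Entry(wrapped_xml, entry.vectors + (new_vector,))


original = Entry('<action kind="create">...</action>', ((0.1, 0.9),))
wrapped = wrap_with_paraphrase(original, "<selection>...</selection>", (0.2, 0.8))

hash_changed = content_hash(original) != content_hash(wrapped)   # True
still_indexed = original.vectors[0] in wrapped.vectors           # True
```

A peer that indexed the original entry under its stored vector could therefore keep serving the wrapped entry from the same place, whereas a hash-addressed store would see a new resource.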
> Deep learning can be used to detect and identify sentential paraphrases
> (Zhou, Qiu, Liang, & Acuna, 2022). More elaborate uses of language models
> could be utilized for inquiring and reasoning about whether sentences
> occurring in contexts were paraphrases.
> With respect to fact-checking on the Web, scenarios to consider include
> both fact-checking content which was expressly indicated to be a fact or
> claim by their authors, e.g., using custom elements, and fact-checking
> arbitrary selections of documents' content.
> Explorations with respect to fact-checking arbitrary selections of content
> include the open-source Citation Needed project by the Future Audiences
> team of the Wikimedia Foundation.
> The Prompt API
> Exploration is underway into providing APIs for accessing language models
> in Web browsers; the Web Machine Learning Working Group is developing the
> Prompt API.
> With access to language models in Web browsers, users might be able to
> obtain embedding vectors for portions of content in Web documents. These
> embedding vectors could be used to search for other content, e.g.,
> annotations, including on P2P networks.
> Custom Elements
> HTML5 custom elements could allow facts or claims to be
> expressed in documents, e.g., to add visual indicators near them or enable
> special context menus for them, while specifying values for embedding
> vectors computed for them using AI models (see: Appendix C).
> Appendices
> Appendix A shows a markup sketch for an entry, a created entry wrapped to
> add a paraphrase to it.
> Appendix B shows that embedding vectors could be added to Magnet URIs and
> Metalinks.
> Appendix C shows that HTML5 custom elements could be used for asserted
> facts or claims which refer to entries on P2P networks by means of one or
> more embedding vectors.
> Appendix D shows an approach involving shortcodes for authors using
> content-management systems to be able to easily add facts or claims to
> their content.
> Bibliography
> Bex, Floris, Mark Snaith, John Lawrence, and Chris Reed. "ArguBlogging: An
> application for the argument web." *Journal of Web Semantics* 25 (2014):
> 9-15. https://www.sciencedirect.com/science/article/pii/S1570826814000079
> Ersson, Kerstin, and Siri Persson. "Peer-to-peer distribution of web
> content using WebRTC within a web browser." (2015).
> https://www.diva-portal.org/smash/get/diva2:819420/FULLTEXT01.pdf
> Segawa, Osamu. "Web annotation sharing using P2P." In *Proceedings of the
> 15th international conference on World Wide Web*, pp. 851-852. 2006.
> http://ra.ethz.ch/CDstore/www2006/devel-www2006.ecs.soton.ac.uk/programme/files/pdf/p45.pdf
> Werner, Max Jonas, and Christian Vogt. "Implementation of a browser-based
> P2P network using WebRTC." *Hamburg* (2014).
> https://inet.haw-hamburg.de/teaching/ws-2013-14/master-project/Prj1-report-werner-vogt.pdf
> Zaarour, Tarek, and Edward Curry. "SemanticPeer: A distributional semantic
> peer-to-peer lookup protocol for large content spaces at internet-scale." *Future
> Generation Computer Systems* 132 (2022): 239-253.
> https://www.sciencedirect.com/science/article/pii/S0167739X22000590
> Zhou, Chao, Cheng Qiu, Lizhen Liang, and Daniel E. Acuna. "Paraphrase
> identification with deep learning: A review of datasets and methods." *arXiv
> preprint arXiv:2212.06933* (2022). https://arxiv.org/pdf/2212.06933
>
>
> Appendix A: Sketch of an Entry for a Fact or Claim
>
> <action kind="add-paraphrase">
>   <base>
>     <action kind="create">
>       <base />
>       <time>2024-01-14T00:01:00Z</time>
>       <v id="v-1" model="urn:ai:model:llama:3.2:90B">...</v>
>       <metalink id="source-1">
>         <file name="article1.html">
>           <url>https://www.example1.com/user1/article1.html</url>
>         </file>
>       </metalink>
>       <selection source="source-1">
>         ... <select v="v-1">A sentence.</select> ...
>       </selection>
>       <signature>...</signature>
>     </action>
>   </base>
>   <time>2024-01-14T00:00:00Z</time>
>   <v id="v-2" model="urn:ai:model:llama:3.3:70B">...</v>
>   <metalink id="source-2">
>     <file name="article2.html">
>       <url>https://www.example2.com/user2/article2.html</url>
>     </file>
>   </metalink>
>   <selection source="source-2">
>     ... <select v="v-1 v-2">A paraphrase.</select> ...
>   </selection>
>   <signature>...</signature>
> </action>
>
>
> Appendix B: Adding Embedding Vectors to Magnet URIs and Metalinks
> Embedding vectors could be added to Magnet URIs by means of adding a key: xv.
> Embedding vectors could be new components of metalinks.
> <metalink xmlns="urn:ietf:params:xml:ns:metalink">
>   <published>2009-05-15T12:23:23Z</published>
>   <file name="example.txt">
>     <url>http://www.example.com/example.txt</url>
>     <vector model="urn:ai:model:llama:3.3:70B">...</vector>
>   </file>
> </metalink>
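For the Magnet URI half of Appendix B, a minimal round-trip sketch follows. Only the key name `xv` comes from the text above. The value encoding (comma-separated decimals with four fractional digits, percent-encoded) is purely an assumption for this example, as are the function names, and the model identifier is left out because the text does not say how it would be carried.

```python
from urllib.parse import parse_qs, urlencode


def magnet_with_vector(vector: list[float]) -> str:
    """Build a Magnet URI carrying an embedding vector under the xv key.

    The comma-separated, fixed-precision encoding is an assumption.
    """
    text = ",".join(f"{x:.4f}" for x in vector)
    return "magnet:?" + urlencode({"xv": text})


def vector_from_magnet(uri: str) -> list[float]:
    """Read the xv vector back out of a Magnet URI built as above."""
    if not uri.startswith("magnet:?"):
        raise ValueError("not a Magnet URI")
    query = uri.partition("?")[2]
    values = parse_qs(query).get("xv", [])
    if not values:
        raise ValueError("Magnet URI has no xv key")
    return [float(part) for part in values[0].split(",")]


uri = magnet_with_vector([0.25, -0.5, 1.0])
decoded = vector_from_magnet(uri)
```

Real embedding vectors have hundreds or thousands of dimensions, so a practical encoding would more likely be a compact binary form (for example base64) than decimal text; the sketch only shows where the value would sit in the URI.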
>
> Appendix C: Custom Elements for Facts or Claims
> A custom element could be
> used to signify an asserted fact or claim, referring to an entry on a P2P
> network by means of embedding vectors alongside other information. Via a
> JavaScript library, and perhaps WebRTC, clients could participate in P2P
> networks and retrieve entries, feedback on entries, or both.
> Notice that, for the special file type of entries, those embedding vectors
> within them and not of the XML file itself are utilized with respect to
> storing and addressing the resource on P2P networks.
> <verifiable-claim see="magnet:?xv=...">Ut enim ad minim veniam, quis
> nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo
> consequat.</verifiable-claim>
> Appendix D: Content Authoring with Shortcodes
> How might authors easily
> add facts or claims to their content? With respect to popular
> content-management systems, the syntax for so doing could resemble that of
> existing shortcodes like [quote].
> [claim]Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris
> nisi ut aliquip ex ea commodo consequat.[/claim]
> During content-publishing processes, authors' content-management systems
> (e.g., Drupal, WordPress) – or configurable plugins or extensions for these
> systems – could handle searching for existing paraphrases, adding new facts
> or claims (if needed) to P2P filesharing networks, obtaining the data for
> use in the see attributes, caching these data, and generating markup.
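The markup-generation step described above might look roughly like the following sketch. It assumes claims are plain text, and it takes the lookup of an entry on the P2P network as a caller-supplied function (`find_entry_uri`, a hypothetical name) so that searching, adding, and caching stay out of scope.

```python
import html
import re
from typing import Callable

# Matches [claim]...[/claim], including claims that span several lines.
CLAIM_SHORTCODE = re.compile(r"\[claim\](.*?)\[/claim\]", re.DOTALL)


def expand_claims(content: str, find_entry_uri: Callable[[str], str]) -> str:
    """Replace [claim] shortcodes with <verifiable-claim> elements.

    find_entry_uri is a hypothetical callback that returns the value for the
    see attribute (for example a Magnet URI) for a given claim text.
    """
    def replace(match: re.Match) -> str:
        claim_text = match.group(1).strip()
        see = html.escape(find_entry_uri(claim_text), quote=True)
        body = html.escape(claim_text)
        return f'<verifiable-claim see="{see}">{body}</verifiable-claim>'

    return CLAIM_SHORTCODE.sub(replace, content)


page = expand_claims(
    "Intro. [claim]Water boils at 100 C.[/claim]",
    lambda claim: "magnet:?xv=example",
)
```

In a CMS plugin the callback would be where paraphrase search and caching live; keeping it separate from the regex pass makes the markup step easy to test on its own.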
>
> ------------------------------
> *From:* Emelia S. <emelia@brandedcode.com>
> *Sent:* Monday, January 13, 2025 11:21 AM
> *To:* Evan Prodromou <evan@prodromou.name>
> *Cc:* public-swicg@w3c.org <public-swicg@w3c.org>
> *Subject:* Re: Fact-checking and community notes on the Fediverse
>
> This is already something on the list of things that the ActivityPub Trust
>  & Safety Taskforce is working on:
>
> Idea: Annotations / Labeling of content · Issue #4 ·
> swicg/activitypub-trust-and-safety
> <https://github.com/swicg/activitypub-trust-and-safety/issues/4>
>
> The Web Annotations model could work, but the discovery of annotations
> that exist is the hardest part. I've started solving that in
> https://github.com/ThisIsMissEm/annotations-service where I use the
> sha256 hash of the Object ID as the annotation collection ID, giving a very
> simple way to fetch all annotations for a given object.
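The hashing idea can be sketched in a few lines. This is an illustration of the approach as described in the message, not the annotations-service code: the hex encoding of the digest, the `/annotations/` path, and both function names are assumptions.

```python
import hashlib


def annotation_collection_id(object_id: str) -> str:
    """Derive a collection ID from an object's ID (its URL) using SHA-256."""
    return hashlib.sha256(object_id.encode("utf-8")).hexdigest()


def annotation_collection_url(base_url: str, object_id: str) -> str:
    """Build a URL for the collection; the path layout here is hypothetical."""
    return f"{base_url.rstrip('/')}/annotations/{annotation_collection_id(object_id)}"


note_id = "https://social.example/users/alice/notes/1"
collection_url = annotation_collection_url("https://annotations.example/", note_id)
```

Because the ID is a pure function of the object ID, any client that knows the object's URL and the annotation server can compute where to look without a prior discovery round trip.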
>
> I do want to investigate what an Annotate activity would look like, but I
> suspect this would just be an announcement of sorts "hey, there's this web
> annotation over here for this target"
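To make the "announcement of sorts" concrete, here is one possible shape for such an activity, built as a plain dictionary. `Annotate` is not a type in the Activity Streams 2.0 vocabulary, so the type name, the choice of `object` for the annotation and `target` for the annotated content, and all URLs are assumptions for this sketch.

```python
import json


def build_annotate_activity(actor: str, annotation_url: str, target_id: str) -> dict:
    """Sketch of a hypothetical Annotate activity pointing at a Web Annotation."""
    return {
        "@context": "https://www.w3.org/ns/activitystreams",
        "type": "Annotate",          # hypothetical extension type, not in AS2
        "actor": actor,
        "object": annotation_url,    # where the Web Annotation can be fetched
        "target": target_id,         # the Note, Video, etc. being annotated
    }


activity = build_annotate_activity(
    actor="https://factcheck.example/actor",
    annotation_url="https://factcheck.example/annotations/123",
    target_id="https://social.example/users/alice/notes/1",
)
serialized = json.dumps(activity, indent=2)
```

Sending only references keeps the activity small and leaves the annotation body to the Web Annotation model; whether receivers should fetch and display it is a separate policy question.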
>
> Yours,
> Emelia
>
> On 13 Jan 2025, at 04:23, Evan Prodromou <evan@prodromou.name> wrote:
>
> We don't have an easy way for remote actors to annotate content on the
> Fediverse.
>
> The biggest use case for this is to have permissionless fact-checking or
> community notes. A fact-checking service could annotate a remote content
> object like a Note or a Video with additional fact-checking information,
> and compliant clients or servers could show the fact-checking information
> when showing the Note.
>
> I think there are some tricky parts to this structure, which I believe
> suggests that we should start working on it.
>
> Evan
>
>
>
>
Received on Wednesday, 22 January 2025 19:00:18 UTC