Re: Fact-checking and community notes on the Fediverse

Hey, Emelia. What's the best way for people on this list to participate 
in that discussion?

And are you OK with people starting that work now, or do they have to 
wait until the Initial Report of the task force is done?

Evan

On 2025-01-23 3:27 p.m., Emelia S. wrote:
> Once again, I do want to state that this is being actively worked on 
> by the ActivityPub Trust & Safety Taskforce. I'd encourage folks to 
> work with us on this, instead of duplicating efforts.
>
> Workstream: Content Warnings, Labels and Annotations · Issue #41 · 
> swicg/activitypub-trust-and-safety 
> <https://github.com/swicg/activitypub-trust-and-safety/issues/41>
>
>
> The goal here for Annotations is likely to reuse the Web Annotations 
> Protocol, but also define some Activities that can be used to 
> distribute those Web Annotations over ActivityPub. We don't need to 
> reinvent the wheel here.
>
> Yours,
> Emelia
>
>> On 23 Jan 2025, at 20:32, Evan Prodromou <evan@prodromou.name> wrote:
>>
>> One possible interaction flow is this.
>>
>> Let's suppose an actor distributes the following activity:
>>
>> {
>>     "@context":"https://www.w3.org/ns/activitystreams",
>>     "id":"https://social.example/user/17/create/337",
>>     "actor":"https://social.example/user/17",
>>     "type": "Create",
>>     "to": "as:Public",
>>     "object": {
>>        "id":"https://social.example/user/17/note/1221",
>>        "type": "Note",
>>        "to": "as:Public",
>>        "attributedTo":"https://social.example/user/17",
>>        "content": "<p>Large Marge had died that very night ten years before!</p>",
>>        "published": "2025-01-23T20:15:00Z"
>>     },
>>     "published": "2025-01-23T20:15:00Z"
>> }
>>
>> This creates a public note (or short text) with some questionable 
>> facts included.
>>
>> Another actor could publish an annotation on that object, to give 
>> further context for readers:
>>
>> {
>>     "@context": [
>>        "https://www.w3.org/ns/activitystreams",
>>        "https://annotate.example/ns"
>>     ],
>>     "id":"https://factcheck.example/user/338/annotate/47",
>>     "actor": "https://factcheck.example/user/338",
>>     "type": "Annotate",
>>     "to": "as:Public",
>>     "object": {
>>        "id":"https://factcheck.example/user/338/annotate/47",
>>        "type": "Note",
>>        "to": "as:Public",
>>        "attributedTo":"https://factcheck.example/user/338",
>>        "content": "<p>This is a variation of the <a href='https://w.wiki/CpRq'>vanishing hitchhiker</a> urban legend.</p>",
>>        "published": "2025-01-23T20:45:00Z"
>>     },
>>     "target": "https://social.example/user/17/note/1221",
>>     "published": "2025-01-23T20:45:00Z"
>> }
>>
>> "Annotate" is not a standard activity type in ActivityPub; I added a 
>> fictional "annotations" context document here.
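
A minimal Python sketch of how a fact-checking service might assemble the activity above. "Annotate" and the https://annotate.example/ns context come from the example and are fictional; the function name, its parameters, and the separate ".../note/47" id for the embedded Note are assumptions made for illustration only.

```python
import json

AS2_CONTEXT = "https://www.w3.org/ns/activitystreams"
# Fictional context document from the example above; it does not exist.
ANNOTATE_CONTEXT = "https://annotate.example/ns"


def build_annotate(activity_id, note_id, actor, target, content_html, published):
    """Return an Annotate activity whose object is a Note commenting on `target`."""
    return {
        "@context": [AS2_CONTEXT, ANNOTATE_CONTEXT],
        "id": activity_id,
        "actor": actor,
        "type": "Annotate",  # not a standard ActivityPub activity type
        "to": "as:Public",
        "object": {
            "id": note_id,  # assumption: the Note gets its own id
            "type": "Note",
            "to": "as:Public",
            "attributedTo": actor,
            "content": content_html,
            "published": published,
        },
        "target": target,
        "published": published,
    }


activity = build_annotate(
    activity_id="https://factcheck.example/user/338/annotate/47",
    note_id="https://factcheck.example/user/338/note/47",
    actor="https://factcheck.example/user/338",
    target="https://social.example/user/17/note/1221",
    content_html="<p>This is a variation of the vanishing hitchhiker urban legend.</p>",
    published="2025-01-23T20:45:00Z",
)
print(json.dumps(activity, indent=2))
```

Serializing with json.dumps avoids hand-written JSON slips such as a missing comma between properties.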
>>
>> Servers that receive this annotation might include it when they 
>> redistribute the Note object to ActivityPub clients:
>>
>> {
>>     "id":"https://social.example/user/17/note/1221",
>>     "type": "Note",
>>     "to": "as:Public",
>>     "attributedTo":"https://social.example/user/17",
>>     "content": "<p>Large Marge had died that very night ten years before!</p>",
>>     "published": "2025-01-23T20:15:00Z",
>>     "annotations": {
>>          "type": "Collection",
>>          "id": "https://other.example/system/annotations/social.example/user/17/note/1221",
>>          "to": "as:Public",
>>          "items":{
>>             "id":"https://factcheck.example/user/338/annotate/47",
>>             "type": "Note",
>>             "to": "as:Public",
>>             "attributedTo":"https://factcheck.example/user/338",
>>             "content": "<p>This is a variation of the <a href='https://w.wiki/CpRq'>vanishing hitchhiker</a> urban legend.</p>",
>>             "published": "2025-01-23T20:45:00Z"
>>          }
>>     }
>> }
>>
>> This is actually kind of a tricky situation, since usually the 
>> properties of the object as defined by the sending server, and 
>> available by fetching 
>> `https://social.example/user/17/note/1221`, would be considered 
>> canonical. The `annotations` property is managed by a different 
>> server, without the control or even knowledge of the original actor 
>> or their service.
>>
>> The annotations here are public; within the AP authorization model, 
>> it's also possible to restrict distribution and access to the 
>> annotations (with a different "to" property).
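
A rough Python sketch of the redistribution step described above, under the assumption that the receiving server keeps the Annotate activities it has seen and attaches the matching ones when it serves a Note. The `annotations` property is the fictional one from the example; the function name and the collection URL passed in are hypothetical.

```python
import copy


def attach_annotations(note, annotate_activities, collection_id):
    """Return a copy of `note` with an `annotations` collection added.

    Only Annotate activities whose `target` is this note's id are used.
    The original note is left untouched, since its canonical form is
    still whatever the sending server publishes.
    """
    matching = [
        activity["object"]
        for activity in annotate_activities
        if activity.get("type") == "Annotate"
        and activity.get("target") == note["id"]
    ]
    enriched = copy.deepcopy(note)
    if matching:
        enriched["annotations"] = {
            "type": "Collection",
            "id": collection_id,  # managed by the redistributing server
            "to": "as:Public",
            "totalItems": len(matching),
            "items": copy.deepcopy(matching),
        }
    return enriched
```

Working on a copy keeps the distinction visible in code: the note as authored stays as received, and the annotated view is something the redistributing server builds on top of it.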
>>
>> I think the work needed here would be as follows:
>>
>> - Define a context doc for `Annotate` and `annotations`
>> - A FEP or another document describing how these can be used
>>
>> Obviously, this is just the protocol layer; it doesn't even begin to 
>> explore the options for actually setting up a network of fact 
>> checkers or establishing trust in those fact checkers.
>>
>> Evan
>>
>> On 2025-01-23 11:33 a.m., Adam Sobieski wrote:
>>> Evan,
>>>
>>> Ok. I will take a look at ActivityPub server-to-server interactions 
>>> and think about methods where fact-checking information, e.g., 
>>> annotations or community notes, are distributed via ActivityPub.
>>>
>>>
>>> Best regards,
>>> Adam
>>>
>>> ------------------------------------------------------------------------
>>> *From:* Evan Prodromou <evan@prodromou.name>
>>> *Sent:* Thursday, January 23, 2025 10:38 AM
>>> *To:* Adam Sobieski <adamsobieski@hotmail.com>; Emelia 
>>> S. <emelia@brandedcode.com>
>>> *Cc:* public-swicg@w3c.org
>>> *Subject:* Re: Fact-checking and community notes on the Fediverse
>>> Hey, Adam. So, I'd prefer to use methods where the fact-checking 
>>> information is distributed via ActivityPub.
>>>
>>> Evan
>>>
>>> On 2025-01-14 6:31 p.m., Adam Sobieski wrote:
>>>
>>>     Social Web Incubator Community Group,
>>>
>>>     Hello. I am pleased to share some preliminary brainstorming and
>>>     ideas about decentralized fact-checking and argumentation using
>>>     P2P filesharing networks.
>>>     Hopefully some of the following ideas can be of use for the
>>>     Fediverse, e.g., for the discovery of existing annotations.
>>>
>>>
>>>       Introduction
>>>
>>>
>>>       With respect to sharing Web Annotations, uses of P2P networks
>>>       have been previously explored (Segawa, 2006). Providing users
>>>       with access to these kinds of networks from their Web
>>>       browsers, today, is possible with WebRTC (Werner & Vogt, 2014;
>>>       Ersson & Siri, 2015).
>>>
>>>     P2P filesharing networks could be of use for decentralized
>>>     fact-checking and argumentation. Facts or claims could be stored
>>>     in entries, a special kind of file resource.
>>>     By creating and sharing digitally-signed user feedback, notes,
>>>     comments, or annotations with respect to those facts or claims
>>>     in entries, users could express their determinations with
>>>     respect to the veracity of facts or claims and could also
>>>     present arguments for or against them (Bex, Snaith, Lawrence, &
>>>     Reed, 2014).
>>>     Entries could contain one or more references to paraphrases of
>>>     content from locations on the Fediverse (see: Appendix A).
>>>     Annotation objects from the Fediverse could be indexed and
>>>     redundantly stored on P2P filesharing networks.
>>>
>>>
>>>       Uses of Embedding Vectors
>>>
>>>     Instead of, or in addition to, using cryptographic hashes to
>>>     index and address content on P2P networks, digitally-signed
>>>     entries for facts or claims could be indexed and addressed using
>>>     embedding vectors (Zaarour & Curry, 2022).
>>>     As envisioned, entries would be a special kind of file resource:
>>>     their embedding vectors, which verifiably correspond to
>>>     selections of other resources' contents, would be stored inside
>>>     them (see: Appendix A) rather than obtained by processing the
>>>     entries themselves with AI models.
>>>     Indexing and addressing entries thusly would allow them to be
>>>     merged or wrapped, e.g., to add paraphrases, digitally signing
>>>     them at each step, without having to reindex them.
>>>     Modifications, however, would result in changes to entries'
>>>     cryptographic hashes.
>>>     Deep learning can be used to detect and identify sentential
>>>     paraphrases (Zhou, Qiu, Liang, & Acuna, 2022). More elaborate
>>>     uses of language models could be utilized for inquiring and
>>>     reasoning about whether sentences occurring in contexts were
>>>     paraphrases.
>>>     With respect to fact-checking on the Web, scenarios to consider
>>>     include both fact-checking content which was expressly indicated
>>>     to be a fact or claim by its authors, e.g., using custom
>>>     elements, and fact-checking arbitrary selections of documents'
>>>     content.
>>>     Explorations with respect to fact-checking arbitrary selections
>>>     of content include the open-source Citation Needed project by
>>>     the Future Audiences team of the Wikimedia Foundation.
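
To make the idea of addressing entries by embedding vector a little more concrete, here is a small Python sketch of a nearest-neighbour lookup by cosine similarity. It is a brute-force scan over an in-memory dictionary, not the SemanticPeer protocol cited above, and the function names, the index layout, and the toy two-dimensional vectors are all assumptions for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length numeric vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_entries(query, index, k=3):
    """Return the ids of the k entries whose stored vectors are closest to `query`.

    `index` maps a hypothetical entry id to the vector stored inside that entry.
    """
    scored = [
        (cosine_similarity(query, vector), entry_id)
        for entry_id, vector in index.items()
    ]
    scored.sort(reverse=True)
    return [entry_id for _, entry_id in scored[:k]]
```

Vectors are only comparable when they come from the same model, which is presumably why the sketches in the appendices record a model identifier next to each vector.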
>>>
>>>
>>>       The Prompt API
>>>
>>>     Exploration is underway into providing APIs for accessing
>>>     language models in Web browsers; the Web Machine Learning
>>>     Working Group is developing the Prompt API.
>>>     With access to language models in Web browsers, users might be
>>>     able to obtain embedding vectors for portions of content in Web
>>>     documents. These embedding vectors could be used to search for
>>>     other content, e.g., annotations, including on P2P networks.
>>>
>>>
>>>       Custom Elements
>>>
>>>
>>>       HTML5 custom elements could allow facts or claims to be
>>>       expressed in documents, e.g., to add visual indicators near
>>>       them or enable special context menus for them, while
>>>       specifying values for embedding vectors computed for them
>>>       using AI models (see: Appendix C).
>>>
>>>
>>>       Appendices
>>>
>>>     Appendix A shows a markup sketch for an entry, a created entry
>>>     wrapped to add a paraphrase to it.
>>>     Appendix B shows that embedding vectors could be added to Magnet
>>>     URIs and Metalinks.
>>>     Appendix C shows that HTML5 custom elements could be used for
>>>     asserted facts or claims which refer to entries on P2P networks
>>>     by means of one or more embedding vectors.
>>>     Appendix D shows an approach involving shortcodes for authors
>>>     using content-management systems to be able to easily add facts
>>>     or claims to their content.
>>>
>>>
>>>       Bibliography
>>>
>>>     Bex, Floris, Mark Snaith, John Lawrence, and Chris Reed.
>>>     "ArguBlogging: An application for the argument web." /Journal of
>>>     Web Semantics/ 25 (2014): 9-15.
>>>     <https://www.sciencedirect.com/science/article/pii/S1570826814000079>
>>>
>>>     Ersson, Kerstin, and Persson Siri. "Peer-to-peer distribution of
>>>     web content using WebRTC within a web browser." (2015).
>>>     <https://www.diva-portal.org/smash/get/diva2:819420/FULLTEXT01.pdf>
>>>
>>>     Segawa, Osamu. "Web annotation sharing using P2P."
>>>     In /Proceedings of the 15th international conference on World
>>>     Wide Web/, pp. 851-852. 2006.
>>>     <http://ra.ethz.ch/CDstore/www2006/devel-www2006.ecs.soton.ac.uk/programme/files/pdf/p45.pdf>
>>>
>>>     Werner, Max Jonas, and Christian Vogt. "Implementation of a
>>>     browser-based P2P network using WebRTC." /Hamburg/ (2014).
>>>     <https://inet.haw-hamburg.de/teaching/ws-2013-14/master-project/Prj1-report-werner-vogt.pdf>
>>>
>>>     Zaarour, Tarek, and Edward Curry. "SemanticPeer: A
>>>     distributional semantic peer-to-peer lookup protocol for large
>>>     content spaces at internet-scale." /Future Generation Computer
>>>     Systems/ 132 (2022): 239-253.
>>>     <https://www.sciencedirect.com/science/article/pii/S0167739X22000590>
>>>
>>>     Zhou, Chao, Cheng Qiu, Lizhen Liang, and Daniel E. Acuna.
>>>     "Paraphrase identification with deep learning: A review of
>>>     datasets and methods." /arXiv preprint arXiv:2212.06933/ (2022).
>>>     <https://arxiv.org/pdf/2212.06933>
>>>
>>>
>>>       Appendix A: Sketch of an Entry for a Fact or Claim
>>>
>>>     <action kind="add-paraphrase">
>>>       <base>
>>>         <action kind="create">
>>>           <base />
>>>           <time>2024-01-14T00:01:00Z</time>
>>>           <v id="v-1" model="urn:ai:model:llama:3.2:90B">...</v>
>>>           <metalink id="source-1">
>>>             <file name="article1.html">
>>>               <url>https://www.example1.com/user1/article1.html</url>
>>>             </file>
>>>           </metalink>
>>>           <selection source="source-1">
>>>             ... <select v="v-1">A sentence.</select> ...
>>>           </selection>
>>>           <signature>...</signature>
>>>         </action>
>>>       </base>
>>>       <time>2024-01-14T00:00:00Z</time>
>>>       <v id="v-2" model="urn:ai:model:llama:3.3:70B">...</v>
>>>       <metalink id="source-2">
>>>         <file name="article2.html">
>>>           <url>https://www.example2.com/user2/article2.html</url>
>>>         </file>
>>>       </metalink>
>>>       <selection source="source-2">
>>>         ... <select v="v-1 v-2">A paraphrase.</select> ...
>>>       </selection>
>>>       <signature>...</signature>
>>>     </action>
>>>
>>>
>>>       Appendix B: Adding Embedding Vectors to Magnet URIs and Metalinks
>>>
>>>
>>>       Embedding vectors could be added to Magnet URIs by means of
>>>       adding a key: xv.
>>>
>>>     Embedding vectors could be new components of metalinks.
>>>     <metalink xmlns="urn:ietf:params:xml:ns:metalink">
>>>       <published>2009-05-15T12:23:23Z</published>
>>>       <file name="example.txt">
>>>         <url>http://www.example.com/example.txt</url>
>>>         <vector model="urn:ai:model:llama:3.3:70B">...</vector>
>>>       </file>
>>>     </metalink>
>>>
>>>
>>>       Appendix C: Custom Elements for Facts or Claims
>>>
>>>
>>>       A custom element could be used to signify an asserted fact or
>>>       claim, referring to an entry on a P2P network by means of
>>>       embedding vectors alongside other information. Via a
>>>       JavaScript library, and perhaps WebRTC, clients could
>>>       participate in P2P networks and retrieve entries, feedback on
>>>       entries, or both.
>>>
>>>     Notice that, for the special file type of entries, those
>>>     embedding vectors within them and not of the XML file itself are
>>>     utilized with respect to storing and addressing the resource on
>>>     P2P networks.
>>>     <verifiable-claim see="magnet:?xv=...">Ut enim ad minim veniam,
>>>     quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea
>>>     commodo consequat.</verifiable-claim>
>>>
>>>
>>>       Appendix D: Content Authoring with Shortcodes
>>>
>>>
>>>       How might authors easily add facts or claims to their content?
>>>       With respect to popular content-management systems, the syntax
>>>       for so doing could resemble that of existing shortcodes
>>>       like [quote].
>>>
>>>     [claim]Ut enim ad minim veniam, quis nostrud exercitation
>>>     ullamco laboris nisi ut aliquip ex ea commodo consequat.[/claim]
>>>     During content-publishing processes, authors' content-management
>>>     systems (e.g., Drupal, WordPress) – or configurable plugins or
>>>     extensions for these systems – could handle searching for
>>>     existing paraphrases, adding new facts or claims (if needed) to
>>>     P2P filesharing networks, obtaining the data for use in
>>>     these attributes, caching these data, and generating markup.
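
A small Python sketch of the markup-generation step only, assuming a plugin has some way to look up (or create) the reference for a claim. The `[claim]` shortcode and the `verifiable-claim` element with its `see` attribute come from the appendices above; the function name and the `lookup_reference` callback are hypothetical, and the search, publishing, and caching steps are deliberately left out.

```python
import html
import re

# Matches [claim]...[/claim], including claims that span several lines.
CLAIM_RE = re.compile(r"\[claim\](.*?)\[/claim\]", re.DOTALL)


def render_claims(text, lookup_reference):
    """Replace each [claim] shortcode with a <verifiable-claim> element.

    `lookup_reference` is a hypothetical callback that takes the claim text
    and returns the URI to place in the `see` attribute.
    """
    def replace(match):
        claim = match.group(1)
        see = html.escape(lookup_reference(claim), quote=True)
        # The claim body is passed through unchanged, on the assumption
        # that the CMS has already produced safe HTML for it.
        return f'<verifiable-claim see="{see}">{claim}</verifiable-claim>'

    return CLAIM_RE.sub(replace, text)
```

Passing the lookup in as a callback keeps the text transformation separate from whatever network or cache the plugin uses to resolve claims.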
>>>
>>>     ------------------------------------------------------------------------
>>>     *From:* Emelia S. <emelia@brandedcode.com>
>>>     *Sent:* Monday, January 13, 2025 11:21 AM
>>>     *To:* Evan Prodromou <evan@prodromou.name>
>>>     *Cc:* public-swicg@w3c.org
>>>     *Subject:* Re: Fact-checking and community notes on the Fediverse
>>>     This is already something on the list of things that the
>>>     ActivityPub Trust & Safety Taskforce is working on:
>>>
>>>     Idea: Annotations / Labeling of content · Issue #4 ·
>>>     swicg/activitypub-trust-and-safety
>>>     <https://github.com/swicg/activitypub-trust-and-safety/issues/4>
>>>
>>>
>>>     The Web Annotations model could work, but the discovery of
>>>     annotations that exist is the hardest part. I've started solving
>>>     that in https://github.com/ThisIsMissEm/annotations-service
>>>     where I use the sha256 hash of the Object ID as the annotation
>>>     collection ID, giving a very simple way to fetch all annotations
>>>     for a given object.
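
A minimal Python sketch of the hashing idea, assuming the Object ID is hashed as UTF-8 text and the digest is hex-encoded. The base URL and function name are hypothetical; the actual annotations-service may encode or lay out its collection IDs differently.

```python
import hashlib

# Hypothetical base URL for an annotations service.
DEFAULT_BASE = "https://annotations.example/collections/"


def annotation_collection_id(object_id, base_url=DEFAULT_BASE):
    """Derive a collection URL from the sha256 hash of an object's id."""
    digest = hashlib.sha256(object_id.encode("utf-8")).hexdigest()
    return base_url + digest


print(annotation_collection_id("https://social.example/user/17/note/1221"))
```

Because the collection ID depends only on the Object ID, any client that knows the object and the service can compute where to look without a prior discovery step.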
>>>
>>>     I do want to investigate what an Annotate activity would look
>>>     like, but I suspect this would just be an announcement of sorts:
>>>     "hey, there's this web annotation over here for this target".
>>>
>>>     Yours,
>>>     Emelia
>>>
>>>         On 13 Jan 2025, at 04:23, Evan Prodromou
>>>         <evan@prodromou.name> wrote:
>>>
>>>         We don't have an easy way for remote actors to annotate
>>>         content on the Fediverse.
>>>
>>>         The biggest use case for this is to have permissionless
>>>         fact-checking or community notes. A fact-checking service
>>>         could annotate a remote content object like a Note or a
>>>         Video with additional fact-checking information, and
>>>         compliant clients or servers could show the fact-checking
>>>         information when showing the Note.
>>>
>>>         I think there are some tricky parts to this structure, which
>>>         I believe suggests that we should start working on it.
>>>
>>>         Evan
>>>
>

Received on Thursday, 23 January 2025 21:24:39 UTC