Re: Advancing Distributed Moderation

Emelia,

I was not aware that all of the mainstream ActivityPub implementations already digitally sign payloads, but that does make sense. I know a little bit about public key infrastructures<https://en.wikipedia.org/wiki/Public_key_infrastructure> and Web of Trust<https://en.wikipedia.org/wiki/Web_of_trust> systems, mostly from discussions.

That is also a good point about the environmental impact of proof-of-work-based solutions.

In the past, I have thought about graph-based algorithms and approaches for measuring and distributing abstract trustworthiness and for providing initial scores with respect to the estimated factuality of end-users' content (see also: PageRank<https://en.wikipedia.org/wiki/PageRank>, Web of Trust<https://en.wikipedia.org/wiki/Web_of_trust>, h-index<https://en.wikipedia.org/wiki/H-index>, etc.). More recently, however, I worry that such systems could slide down a slippery slope toward dystopian "social credit systems<https://en.wikipedia.org/wiki/Social_Credit_System>".
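
To make this concrete, here is a rough, purely illustrative sketch (in Python) of personalized-PageRank-style trust propagation over a hypothetical endorsement graph; the account names, seed weights, damping factor, and iteration count below are assumptions invented for the example, not features of any existing system:

    # Hypothetical sketch: propagate trust over an endorsement graph with a
    # personalized-PageRank-style power iteration. Account names, seed
    # weights, damping factor, and iteration count are illustrative only.

    def trust_scores(endorsements, seeds, damping=0.85, iterations=50):
        """endorsements: dict mapping an account to the accounts it vouches for.
        seeds: dict of manually vetted accounts and their seed weights."""
        nodes = set(endorsements) | {t for ts in endorsements.values() for t in ts} | set(seeds)
        total = sum(seeds.values()) or 1.0
        base = {n: seeds.get(n, 0.0) / total for n in nodes}   # teleport vector
        scores = dict(base)
        for _ in range(iterations):
            nxt = {n: (1.0 - damping) * base[n] for n in nodes}
            for src in nodes:
                targets = endorsements.get(src, [])
                if targets:
                    share = damping * scores[src] / len(targets)
                    for dst in targets:
                        nxt[dst] += share
                else:
                    # Accounts that endorse no one redistribute via the seed vector.
                    for n in nodes:
                        nxt[n] += damping * scores[src] * base[n]
            scores = nxt
        return scores

    # Example: two vetted accounts seed the graph; unconnected accounts stay near zero.
    graph = {"alice": ["bob", "carol"], "bob": ["carol"], "mallory": ["mallory2"]}
    print(trust_scores(graph, seeds={"alice": 1.0, "bob": 0.5}))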

On the topics of distributed moderation and systems design: in theory, e-stamp or pay-to-post systems could dissuade users from kinds of misconduct, including spamming. Different tiers of e-stamps or payments could provide end-users with differing levels of content promotion and prioritization. The proposition could be presented to end-users as: "Would you pay N1 cents to post, or N2 cents more to post with a priority boost, if doing so meant that you would also encounter much less spam and misconduct on the system?"

With respect to such economic choices, the difference between 0 and 0.01 can be more than one cent<https://hbr.org/2011/06/competing-against-free>. End-users would have to enter their payment information and perhaps deposit some amount into their new accounts. Doing so could serve as a simple component of account verification<https://en.wikipedia.org/wiki/Account_verification> per know your customer<https://en.wikipedia.org/wiki/Know_your_customer>.

It might be interesting to consider that (open-source?) algorithms could estimate the contextually dynamic cost of a particular post based on a number of factors. That is, it needn't cost a constant amount for any end-user to post any content to any number of other end-users at any time.
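
As a hypothetical sketch, such a dynamic cost might combine a few contextual factors, including the priority-boost tier mentioned above; every factor, weight, and cap below is an assumption invented for illustration rather than a proposed price:

    # Hypothetical sketch of a context-dependent posting cost, in cents.
    # Every factor, weight, and cap below is an assumption for illustration.

    def posting_cost_cents(recipient_count, posts_last_hour, account_age_days,
                           trust_score, priority_boost=False):
        cost = 1.0                                  # nominal base price ("N1")
        cost += 0.001 * recipient_count             # wider fan-out costs more
        cost += 0.5 * max(0, posts_last_hour - 20)  # burst posting costs more
        if account_age_days < 7:
            cost *= 2.0                             # new accounts pay a premium
        cost *= max(0.25, 1.0 - trust_score)        # trusted accounts pay less
        if priority_boost:
            cost += 2.0                             # the "N2" promotion tier
        return round(min(cost, 50.0), 2)            # cap to keep prices sane

    print(posting_cost_cents(recipient_count=300, posts_last_hour=2,
                             account_age_days=400, trust_score=0.8))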

It is noteworthy that end-users' posted content can simultaneously be viewed as adding value to systems. This value is seemingly measurable after the content is posted.

Combining these concepts (users paying dynamic amounts to post content, which adds value to a system if and only if that content is high-quality, useful, and accurate), end-users might be willing to pay initially larger amounts to post content and subsequently receive some amount back, per (open-source?) algorithmic determinations made by observing the system after the content is posted, interacted with, and perhaps factually verified and/or corroborated with other content.
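
A rough sketch of that kind of escrow-and-refund settlement might resemble the following; the signal names, weights, and payout cap are, again, assumptions for illustration only:

    # Hypothetical escrow-and-refund flow: the poster pays an up-front amount,
    # and part of it is returned later based on observed signals. The signal
    # names and the 2x payout cap are assumptions made up for this sketch.

    def settle_post(amount_paid_cents, flags_upheld, endorsements, fact_checks_passed):
        """Return (refund_cents, net_cost_cents) once the observation window closes."""
        quality = 0.0
        quality += 0.1 * endorsements          # boosts/likes from trusted accounts
        quality += 0.5 * fact_checks_passed    # corroborated or verified claims
        quality -= 1.0 * flags_upheld          # moderator-upheld reports
        refund = max(0.0, min(2.0 * amount_paid_cents, amount_paid_cents * quality))
        return round(refund, 2), round(amount_paid_cents - refund, 2)

    # A corroborated, well-received post can net out near zero (or even positive);
    # a post with upheld flags forfeits the full stamp.
    print(settle_post(5.0, flags_upheld=0, endorsements=8, fact_checks_passed=1))
    print(settle_post(5.0, flags_upheld=3, endorsements=0, fact_checks_passed=0))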

See also: attention economics<https://en.wikipedia.org/wiki/Attention_economy>.

Thank you for that information, and yes, precisely: AI can provide new tools for moderators, augmenting and assisting them.


Best regards,
Adam

________________________________
From: Emelia Smith <emelia@brandedcode.com>
Sent: Wednesday, November 29, 2023 8:38 AM
To: Adam Sobieski <adamsobieski@hotmail.com>
Cc: Evan Prodromou <evan@prodromou.name>; public-swicg@w3.org <public-swicg@w3.org>
Subject: Re: Advancing Distributed Moderation

Hi Adam,

You are aware that in all mainstream ActivityPub implementations, HTTP Signatures are used for signing the payload when sending activities to inboxes, right?

https://docs.joinmastodon.org/spec/security/#http


(Whilst this isn't the latest IETF proposal, there is work starting to migrate up to that)

So unless you're suggesting that the fediverse should borrow the most environmentally harmful aspect of blockchain technology (Proof of Work) for solving a relatively simple moderation problem (which is solved with naive Bayesian filtering), I'm not sure I understand you.

Generally I'd recommend starting simple and adding complexity gradually, only reaching for computationally expensive options like AI if absolutely necessary (e.g., reaching for AI image labelling so that you don't need to send every image to Thorn or similar surveillance-capitalism tech, and using it essentially as a pre-filter).

But as I mentioned previously: fundamentally moderation should be in the hands of humans, with them reviewing and taking action (along with an audit log of those actions), since we've seen the harms and horrors of algorithmic moderation at scale. Tech just augments and assists in that process.

Yours,
Emelia

On 29. Nov 2023, at 00:57, Adam Sobieski <adamsobieski@hotmail.com> wrote:


With respect to anti-spam techniques [1], in addition to Naive Bayesian [2], solutions to consider also include digital signatures [3], cryptographic solutions where quantities of computation are required to send messages, e.g., hashcash [4], and e-stamping [5].
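
For concreteness, a simplified hashcash-style stamp [4] can be sketched in a few lines of Python; the real hashcash header format and difficulty conventions differ from this toy version:

    # Simplified hashcash-style proof-of-work stamp (illustrative only; the
    # real hashcash header format and difficulty rules differ).
    import hashlib
    from itertools import count

    def mint_stamp(resource, bits=20):
        """Find a nonce such that SHA-1(resource:nonce) has `bits` leading zero bits."""
        target = 1 << (160 - bits)
        for nonce in count():
            stamp = f"{resource}:{nonce}"
            if int(hashlib.sha1(stamp.encode()).hexdigest(), 16) < target:
                return stamp

    def check_stamp(stamp, bits=20):
        return int(hashlib.sha1(stamp.encode()).hexdigest(), 16) < (1 << (160 - bits))

    stamp = mint_stamp("recipient@example.social", bits=16)   # low difficulty for the demo
    print(stamp, check_stamp(stamp, bits=16))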

In addition to text-based message contents, there are also image-based and other multimedia contents [6] for moderators' tools to contend with. Pertinent AI techniques for mitigating malicious multimedia messages include those from computer vision such as semantic scene graphs [7] and mapping content to embedding vectors [8][9]. Resultant high-dimensional mathematical vectors can be compared to one another in terms of the angles occurring between them, distances between them, distances between them and one or more complex regions of interest, and so forth.
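
As a toy illustration of such comparisons, the following sketch computes the angle and the Euclidean distance between an incoming item's embedding and a centroid of previously flagged content; the vectors here are invented, and, in practice, they would come from an image or sentence encoder as in [8][9]:

    # Toy sketch of comparing embedding vectors by angle and by distance to a
    # "region of interest" centroid. The vectors are made up; real embeddings
    # would come from an image or sentence encoder as in [8][9].
    import math

    def cosine_similarity(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

    def distance(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    known_spam_centroid = [0.9, 0.1, 0.2]        # mean embedding of flagged content
    incoming = [0.85, 0.15, 0.25]

    angle = math.degrees(math.acos(cosine_similarity(incoming, known_spam_centroid)))
    print(f"angle: {angle:.1f} deg, distance: {distance(incoming, known_spam_centroid):.3f}")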

As LLMs and multiagent systems (e.g., AutoGen [10]) utilize natural language and multimedia content in their interactions amongst themselves and with human end-users, there is a significant overlap between quality assurance with respect to these systems and moderating systems of human users.

Beyond email filtering [11] and related rules-based systems, e.g., [12], which can output Boolean values for messages, we can also envision functions which output, beyond fuzzy [13] or neutrosophic [14] values, other scalar or vector values for content. Earlier in this thread, however, Emelia Smith indicated that "algorithmic timelines" were unpopular here, which I understood to mean sorting users' timelines by relevance or predicted interestingness to individual users.

It could be the case that combinations of techniques would be useful. Computationally less expensive techniques could indicate those individual messages which were sufficiently interesting for subsequent processing with computationally more expensive techniques.
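
A minimal sketch of such a combination might resemble the following, where cheap_spam_score and expensive_classifier are hypothetical placeholders (e.g., a naive Bayesian score [2] and a larger multimodal model):

    # Hypothetical two-stage pipeline: a cheap filter screens everything, and only
    # messages in the ambiguous band are passed to a more expensive model.
    # `cheap_spam_score` and `expensive_classifier` are placeholder callables.

    def moderate(message, cheap_spam_score, expensive_classifier,
                 low=0.2, high=0.8):
        score = cheap_spam_score(message)
        if score < low:
            return "deliver"                      # clearly fine: no further cost
        if score > high:
            return "hold-for-review"              # clearly bad: queue for a human
        # Ambiguous middle band: spend the expensive inference here only.
        return "hold-for-review" if expensive_classifier(message) else "deliver"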

You raise a good point. Multi-server data useful to moderators and their tools would appear to include data resulting from aggregating instances of bad content, e.g., towards ascertaining what malicious actors are up to at a given instant.


Best regards,
Adam

[1] https://en.wikipedia.org/wiki/Anti-spam_techniques

[2] https://en.wikipedia.org/wiki/Naive_Bayes_spam_filtering

[3] https://en.wikipedia.org/wiki/Digital_signature

[4] https://en.wikipedia.org/wiki/Hashcash

[5] https://en.wikipedia.org/wiki/E-stamping

[6] https://en.wikipedia.org/wiki/Image_spam

[7] https://cs.stanford.edu/people/ranjaykrishna/sgrl/index.html

[8] https://en.wikipedia.org/wiki/Word_embedding

[9] https://en.wikipedia.org/wiki/Sentence_embedding

[10] https://microsoft.github.io/autogen/docs/Use-Cases/agent_chat

[11] https://en.wikipedia.org/wiki/Email_filtering

[12] https://en.wikipedia.org/wiki/Sieve_(mail_filtering_language)

[13] https://en.wikipedia.org/wiki/Fuzzy_logic

[14] https://en.wikipedia.org/wiki/Fuzzy_set#Neutrosophic_fuzzy_sets


________________________________
From: Evan Prodromou <evan@prodromou.name>
Sent: Tuesday, November 28, 2023 10:39 AM
To: public-swicg@w3.org <public-swicg@w3.org>
Subject: Re: Advancing Distributed Moderation


Email filtering has benefited greatly from relatively simple and low-cost language processing methods such as naive Bayesian spam filtering<https://en.wikipedia.org/wiki/Naive_Bayes_spam_filtering>. These can balance a shared model (to allow collective protection) and personal models (to customize for individual needs), with weighting on each.

They don't require proprietary technology or big server farms. They are somewhat subject to introduced bias, depending on the training set, but the balance between shared and personal models can be set to deal with that.


You may be familiar with filters like spamassassin or bogofilter.


I haven't seen a lot of precedent for asking spammers and harassers to give consent to have their messages used for training public or private filters, and I don't intend to ever do it for my email inbox or for my social inbox.


Evan


On 2023-11-25 5:52 p.m., O'Brien, Sean wrote:
Thank you all for your thoughtful input in regard to moderation.

I would just like to add that there are (at least) four additional issues with AI/LLM in real-world scenarios:

* When we speak of "AI", we're talking about a handful of multi-billion-dollar backends owned / dominated by Big Tech intermediaries. Building solutions that require the usage of these systems puts entities like Microsoft, Alphabet, and Meta in an ever-more powerful position. In 1999 we were worried about Microsoft's monopolistic integration of IE in Windows. In 2006, the so-called "ASP loophole" was hotly debated during the GPLv3 drafting. In 2023, FOSS devs everywhere are inserting AI backends owned+operated by Big Tech into their software as essential, black-box middleware.

* AI/LLM has required the aforementioned piles of money to reach its current maturity, and that investment will eventually mean an end to freemium models or at least severe limitations on gratis usage, especially where API access and integration with FOSS projects is concerned. The infrastructure and energy requirements (read: costs) of OpenAI et al are rising rapidly, and any competitors attempting to meet similar scope and utility will carry huge financial and industrial baggage. All of this will place additional pressure on freemium revenue models. This is a prediction, of course, but nonetheless one borne from history and experience.

* We are starting to see real, tangible censorship of AI/LLM. That means there is default, upstream moderation before any possibility of downstream moderation, with little-to-no transparency from upstream.

* Any hypothetically ethical use of AI/LLM for moderation requires that users consent to their published works being used to train algorithms, be included in a training set, etc. In regard to FOSS licenses, I don't think I need to repeat the known issues with Microsoft's Copilot experiment for this list.

Cheers,
- Sean

--
Sean O'Brien

Lecturer, Yale Law School (Cybersecurity LAW 21314)
Fellow, Information Society Project at Yale Law School
Founder, Privacy Lab at Yale ISP, https://privacylab.yale.edu




________________________________
From: Emelia Smith <emelia@brandedcode.com>
Sent: Saturday, November 25, 2023 5:04 PM
To: James <jamesg@jamesg.blog>
Cc: Adam Sobieski <adamsobieski@hotmail.com>; public-swicg@w3.org <public-swicg@w3.org>
Subject: Re: Advancing Distributed Moderation

I just wish to clarify, in my original reply, I said that whilst there could be applications for AI/LLMs/natural language models, the emphasis was that we need to be highly cautious and critical of such systems, given biases in training data and implementation.

That is, implementation of AI/ML without critically examining the systems will lead to further marginalisation and harm, NOT a reduction in harm.

There's been case after case of bad AI data, responses and faulty algorithms causing harm & chaos. Whatever steps you take around these technologies and applying them to the fediverse must be done with the utmost caution, and without assumptions of these AI models being inherently good.

There's a huge deal more that can be done in the moderation and trust & safety space before there's a _must_ for more advanced techniques. A good example of this is Mastodon's new Android warnings with regard to replies: https://blog.joinmastodon.org/2023/11/improving-the-quality-of-conversations-on-mastodon/


Yours,
Emelia

On 25. Nov 2023, at 21:06, James <jamesg@jamesg.blog> wrote:


OpenAI has documented a way in which you can use GPT-4 for content moderation: https://openai.com/blog/using-gpt-4-for-content-moderation


I expect there to be a lot more work done in this field of research.

James

On Saturday, 25 November 2023 at 20:00, Adam Sobieski <adamsobieski@hotmail.com> wrote:

Emelia Smith,

Thank you for the feedback. I am new to the SWICG and am glad to see that there is some interest here in uses of AI/ML/CV/LLMs for equipping moderators and empowering end-users.


Best regards,
Adam

P.S.: I would enjoy learning more about the FIRES proposal.

________________________________
From: Emelia Smith <emelia@brandedcode.com>
Sent: Friday, November 24, 2023 12:24 PM
To: Adam Sobieski <adamsobieski@hotmail.com>
Cc: public-swicg@w3.org <public-swicg@w3.org>
Subject: Re: Advancing Distributed Moderation

Hi all,

Some of what Adam speaks of here is what I'm working on with my FIRES proposal: essentially a service for storing & sharing moderation advisories and recommendations. It looks past the current status quo of denylists or blocklists, still allowing for full federation restrictions but also for more granular restrictions. It also allows for multiple different entity types, whether Domains/Instances, Hashtags, URLs, Users, etc.

When I worked at State in 2012/2013, we did structured opinions, where you'd have a topic and several words associated with that topic, and ultimately it didn't go mainstream. So I'm skeptical of the "made me feel" proposal here, as it's additional "work" for the person using the service; maybe you could extrapolate this via emoji reactions, which are popular on the fediverse, though?

Attempts to classify content could help moderation, but they could also harm it by reducing the "human touch", which creates problems like how Instagram's automated moderation often harms marginalised communities. So I do suggest caution on this path.

Additionally, this could then result in algorithmic timelines, which currently aren't present on the Fediverse. From what I've seen, there's generally a resistance to algorithmic timelines, even if those can help people who check social media less frequently than others.

You are right that AI/machine vision/LLMs could be beneficial for instance moderators, preventing exposure to harm, but we also need to be critical of and investigate the training of these models and their human impact (there was a recent headline I read about OpenAI paying minimal amounts for human moderators to classify media as CSAM or not, which obviously has a human cost). Additionally, we need to enquire as to the biases such algorithms may have and how they may adversely affect marginalised social groups.

Whilst tools can assist in moderation, ultimately I believe we need to have humans making the final decision.

(If anyone would like to peer-review FIRES I can send you an early draft, but I'm currently reworking a lot of it)

Yours,
Emelia Smith

On 24. Nov 2023, at 14:30, Adam Sobieski <adamsobieski@hotmail.com> wrote:

[truncated due to html content]

Received on Wednesday, 29 November 2023 16:11:48 UTC