Re: Advancing Distributed Moderation

Email filtering has benefited greatly from relatively simple and 
low-cost language processing methods such as naive Bayesian spam 
filtering <https://en.wikipedia.org/wiki/Naive_Bayes_spam_filtering>. 
These can balance a shared model (to allow collective protection) and 
personal models (to customize for individual needs), with weighting on each.

They don't require proprietary technology or big server farms. They are 
somewhat subject to introduced bias, depending on the training set, but 
the balance between shared and personal models can be tuned to mitigate that.
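
As a rough sketch (the per-word count layout and the alpha weight 
here are illustrative, not any particular filter's API), blending a 
shared and a personal naive Bayes score might look like:

    # Sketch: blend a shared (community) naive Bayes spam score with
    # a personal one; alpha sets the weight given to the shared model.
    import math

    def spam_probability(model, tokens):
        # model holds per-word counts:
        #   {"spam": {word: count}, "ham": {word: count},
        #    "spam_total": int, "ham_total": int}
        log_spam = math.log(0.5)  # equal priors
        log_ham = math.log(0.5)
        for t in tokens:
            log_spam += math.log(
                (model["spam"].get(t, 0) + 1) / (model["spam_total"] + 2))
            log_ham += math.log(
                (model["ham"].get(t, 0) + 1) / (model["ham_total"] + 2))
        return 1.0 / (1.0 + math.exp(log_ham - log_spam))

    def combined_score(shared, personal, tokens, alpha=0.5):
        return (alpha * spam_probability(shared, tokens)
                + (1 - alpha) * spam_probability(personal, tokens))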


You may be familiar with filters like SpamAssassin or bogofilter.


I haven't seen a lot of precedent for asking spammers and harassers to 
give consent to have their messages used for training public or private 
filters, and I don't intend to ever do it for my email inbox or for my 
social inbox.


Evan


On 2023-11-25 5:52 p.m., O'Brien, Sean wrote:
> Thank you all for your thoughtful input in regard to moderation.
>
> I would just like to add that there are (at least) four additional 
> issues with AI/LLM in real-world scenarios:
>
> * When we speak of "AI", we're talking about a handful of 
> multi-billion-dollar backends owned / dominated by Big Tech 
> intermediaries. Building solutions that require the usage of these 
> systems puts entities like Microsoft, Alphabet, and Meta in an 
> ever-more powerful position. In 1999 we were worried about Microsoft's 
> monopolistic integration of IE in Windows. In 2006, the so-called "ASP 
> loophole" was hotly debated during the GPLv3 drafting. In 2023, FOSS 
> devs everywhere are inserting AI backends owned+operated by Big Tech 
> into their software as essential, black-box middleware.
>
> * AI/LLM has required the aforementioned piles of money to reach its 
> current maturity, and that investment will eventually mean an end to 
> freemium models or at least severe limitations on gratis usage, 
> especially where API access and integration with FOSS projects are 
> concerned. The infrastructure and energy requirements (read: costs) of 
> OpenAI et al are rising rapidly, and any competitors attempting to 
> meet similar scope and utility will carry huge financial and 
> industrial baggage. All of this will place additional pressure on 
> freemium revenue models. This is a prediction, of course, but 
> nonetheless one born of history and experience.
>
> * We are starting to see real, tangible censorship of AI/LLM. That 
> means there is default, upstream moderation before any possibility of 
> downstream moderation, with little-to-no transparency from upstream.
>
> * Any hypothetically ethical use of AI/LLM for moderation requires 
> that users consent to their published works being used to train 
> algorithms, be included in a training set, etc. In regard to FOSS 
> licenses, I don't think I need to repeat the known issues with 
> Microsoft's Copilot experiment for this list.
>
> Cheers,
> - Sean
>
> -- 
> Sean O'Brien
>
> Lecturer, Yale Law School (Cybersecurity LAW 21314)
> Fellow, Information Society Project at Yale Law School
> Founder, Privacy Lab at Yale ISP, https://privacylab.yale.edu 
> <https://privacylab.yale.edu/>
>
>
>
>
> ------------------------------------------------------------------------
> *From:* Emelia Smith <emelia@brandedcode.com>
> *Sent:* Saturday, November 25, 2023 5:04 PM
> *To:* James <jamesg@jamesg.blog>
> *Cc:* Adam Sobieski <adamsobieski@hotmail.com>; public-swicg@w3.org 
> <public-swicg@w3.org>
> *Subject:* Re: Advancing Distributed Moderation
> I just wish to clarify, in my original reply, I said that whilst there 
> could be applications for AI/LLMs/natural language models, the 
> emphasis was that we need to be highly cautious and critical of such 
> systems, given biases in training data and implementation.
>
> That is, implementation of AI/ML without critically examining the 
> systems will lead to further marginalisation and harm, NOT a reduction 
> in harm.
>
> There's been case after case of bad AI data, responses and faulty 
> algorithms causing harm & chaos. Whatever steps you take around these 
> technologies and applying them to the fediverse must be done with the 
> utmost caution, and without assumptions of these AI models being 
> inherently good.
>
> There's a great deal more that can be done in the moderation and trust 
> & safety space before there's a _must_ for more advanced techniques. A 
> good example of this is Mastodon's new Android warnings with regard 
> to replies: 
> https://blog.joinmastodon.org/2023/11/improving-the-quality-of-conversations-on-mastodon/
>
> Yours,
> Emelia
>
>> On 25. Nov 2023, at 21:06, James <jamesg@jamesg.blog> wrote:
>>
>> 
>> OpenAI has documented a way in which you can use GPT-4 for content 
>> moderation: https://openai.com/blog/using-gpt-4-for-content-moderation
>>
>> I expect there to be a lot more work done in this field of research.
>>
>> James
>>
>> On Saturday, 25 November 2023 at 20:00, Adam Sobieski 
>> <adamsobieski@hotmail.com> wrote:
>>
>>> Emelia Smith,
>>>
>>> Thank you for the feedback. I am new to the SWICG and am glad to see 
>>> that there is some interest here in uses of AI/ML/CV/LLMs for 
>>> equipping moderators and empowering end-users.
>>>
>>>
>>> Best regards,
>>> Adam
>>>
>>> P.S.: I would enjoy learning more about the FIRES proposal.
>>>
>>> ------------------------------------------------------------------------
>>> *From:* Emelia Smith <emelia@brandedcode.com>
>>> *Sent:* Friday, November 24, 2023 12:24 PM
>>> *To:* Adam Sobieski <adamsobieski@hotmail.com>
>>> *Cc:* public-swicg@w3.org <public-swicg@w3.org>
>>> *Subject:* Re: Advancing Distributed Moderation
>>> Hi all,
>>>
>>> Some of what Adam speaks of here is what I'm working on with my 
>>> FIRES proposal, essentially a service for storing & sharing 
>>> moderation advisories and recommendations. It looks past the 
>>> current status quo of denylists or blocklists, still allowing for 
>>> full federation restrictions but also for more granular 
>>> restrictions. It also allows for multiple different entity types, 
>>> whether Domains/Instances, Hashtags, URLs, Users, etc.
>>>
>>> When I worked at State in 2012/2013, we did structured opinions, 
>>> where you'd have a topic and several words associated with that 
>>> topic, and ultimately it didn't go mainstream. So I'm skeptical of 
>>> the "made me feel" proposal here, as it's additional "work" for the 
>>> person using the service; perhaps you could instead extrapolate 
>>> this from emoji reactions, which are popular on the fediverse?
>>>
>>> Attempts to classify content automatically could help moderation, 
>>> but could also harm it by reducing the "human touch", which creates 
>>> problems like the way Instagram's automated moderation often harms 
>>> marginalised communities. So I do suggest caution on this path.
>>>
>>> Additionally, this could then result in algorithmic timelines, which 
>>> currently aren't present on the Fediverse. From what I've seen, 
>>> there's generally a resistance to algorithmic timelines, even if 
>>> those can help people who check social media less frequently than 
>>> others.
>>>
>>> You are right that AI/machine vision/LLMs could be beneficial for 
>>> instance moderators, preventing exposure to harm, but we also need 
>>> to be critical of and investigate the training of these models and 
>>> their human impact (there was a recent headline I read about OpenAI 
>>> paying minimal amounts for human moderators to classify media as 
>>> CSAM or not, which obviously has a human cost). Additionally, we 
>>> need to enquire as to the biases such algorithms may have and how 
>>> they may adversely affect marginalised social groups.
>>>
>>> Whilst tools can assist in moderation, ultimately I believe we need 
>>> to have humans making the final decision.
>>>
>>> (If anyone would like to peer-review FIRES I can send you an early 
>>> draft, but I'm currently reworking a lot of it)
>>>
>>> Yours,
>>> Emelia Smith
>>>
>>>     On 24. Nov 2023, at 14:30, Adam Sobieski
>>>     <adamsobieski@hotmail.com> wrote:
>>>
>>>     
>>>     Social Web Incubator Community Group,
>>>
>>>
>>>       Introduction
>>>
>>>     Hello. I would like to share some ideas to better inform and
>>>     equip distributed social media administrators and moderators.
>>>
>>>
>>>       Multimedia-content-scanning Rules and Service Providers
>>>
>>>     Software tools for moderators can be envisioned which:
>>>
>>>      1. Scan multimedia content on servers,
>>>      2. Provide moderators with time-critical information, updates,
>>>         alerts, and alarms,
>>>      3. Provide moderators with real-time natural-language reports,
>>>         data visualizations, and analytics.
>>>
>>>
>>>     Software tools for moderators could be updated in a manner
>>>     resembling real-time "antivirus" data updates. Instead of there
>>>     being one or a few such "antivirus" data providers, moderators
>>>     could choose to subscribe to individual channels of data from
>>>     multiple providers. As envisioned, each provider would serve
>>>     updates across a number of described channels. Data sent in
>>>     these channels, e.g., multimedia-content-scanning rules, could
>>>     be merged on platform servers and subsequently utilized by
>>>     moderators' software tools. Some data providers' channels might
>>>     be free to use while others would require paid subscriptions.
>>>
>>>     As envisioned, moderators would be able to create, modify, and
>>>     delete multimedia content scanning rules manually to customize
>>>     their software tools.
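>>>
>>>     As a rough sketch (the channel and rule fields below are
>>>     hypothetical, not a proposed format), merging rules from
>>>     subscribed provider channels with locally authored rules might
>>>     look like:
>>>
>>>     # Sketch: combine scanning rules from subscribed provider
>>>     # channels with rules a moderator has authored locally; the
>>>     # channel and rule shapes are illustrative only.
>>>     def merge_rule_sets(subscribed_channels, local_rules):
>>>         merged = {}
>>>         for channel in subscribed_channels:
>>>             for rule in channel["rules"]:
>>>                 # later channels and local rules win on id conflicts
>>>                 merged[rule["id"]] = rule
>>>         for rule in local_rules:
>>>             merged[rule["id"]] = rule
>>>         return list(merged.values())
>>>
>>>     channels = [
>>>         {"provider": "provider-a.example",
>>>          "rules": [{"id": "a-1", "type": "image-hash",
>>>                     "action": "alarm"}]},
>>>         {"provider": "provider-b.example",
>>>          "rules": [{"id": "b-7", "type": "text-pattern",
>>>                     "action": "flag"}]},
>>>     ]
>>>     local = [{"id": "local-1", "type": "domain", "action": "hold"}]
>>>     active_rules = merge_rule_sets(channels, local)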
>>>
>>>     Moderators' actions and decisions could be collected,
>>>     aggregated, processed, and utilized for purposes including
>>>     distributing helpful real-time hints to other moderators across
>>>     platforms. Moderators would be able to choose which service
>>>     providers, if any, to share these data with. These data would
>>>     not identify end-users or moderators.
>>>
>>>     The aforementioned "antivirus" data for informing and equipping
>>>     decentralized social media moderators could additionally enhance
>>>     and enable other technologies and systems, e.g.,
>>>     content-distribution algorithms and recommender systems.
>>>
>>>     With recent advancements to AI:
>>>
>>>      1. Systems like NeMo Guardrails
>>>         <https://github.com/NVIDIA/NeMo-Guardrails> and Guardrails
>>>         AI <https://github.com/guardrails-ai/guardrails> can enhance
>>>         the capabilities of moderators' software tools,
>>>      2. Systems like GPT-4V
>>>         <https://openai.com/research/gpt-4v-system-card> and LLaVA
>>>         <https://github.com/haotian-liu/LLaVA> can process images
>>>         occurring in social-media contexts in new ways.
>>>
>>>
>>>       Enabling More Granular Reactions to Content
>>>
>>>     Beyond liking content or not, end-users could react to content
>>>     items by attaching text-keyword metadata. End-users could click
>>>     on buttons next to content to enter open-ended text into text
>>>     fields, e.g., "this content made me feel _____". There could be
>>>     another button for adding subsequent reactions – second, third,
>>>     or fourth reactions – for more complex, and potentially mixed,
>>>     reactions. Alternatively, these reactions
>>>     could be comma-delimited, and converted to reaction tags as
>>>     typed. End-users' reactions would tend to be drawn from a
>>>     folksonomic vocabulary
>>>     <https://en.wikipedia.org/wiki/Emotion_classification> and,
>>>     accordingly, incremental search and recommendation features
>>>     could reduce typing.
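>>>
>>>     A small sketch (the vocabulary and function names are
>>>     illustrative) of converting comma-delimited reaction text into
>>>     tags, with prefix-based suggestions to reduce typing:
>>>
>>>     # Sketch: normalize comma-delimited reactions into tags and
>>>     # suggest completions from a shared folksonomic vocabulary.
>>>     VOCABULARY = ["amused", "angry", "anxious", "inspired",
>>>                   "joyful", "sad"]
>>>
>>>     def parse_reactions(text):
>>>         # "sad, Inspired , angry" -> ["sad", "inspired", "angry"]
>>>         return [part.strip().lower()
>>>                 for part in text.split(",") if part.strip()]
>>>
>>>     def suggest(prefix, vocabulary=VOCABULARY, limit=5):
>>>         prefix = prefix.strip().lower()
>>>         return [term for term in vocabulary
>>>                 if term.startswith(prefix)][:limit]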
>>>
>>>     Anonymized usage data from end-users could be collected and sent
>>>     to service providers. Envisioned data include anonymized status
>>>     messages and anonymized reactions to content items. With these
>>>     data, new usage trends, audience reaction data, natural-language
>>>     reports, data visualizations, and analytics could be made
>>>     available to moderators and administrators.
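>>>
>>>     A small sketch (the event shape is illustrative) of aggregating
>>>     anonymized reaction events into per-item counts for such
>>>     analytics:
>>>
>>>     # Sketch: count anonymized reaction tags per content item; the
>>>     # events carry no user identifiers at all.
>>>     from collections import Counter, defaultdict
>>>
>>>     def aggregate_reactions(events):
>>>         # events: iterable of {"item_id": str, "reaction": str}
>>>         per_item = defaultdict(Counter)
>>>         for event in events:
>>>             per_item[event["item_id"]][event["reaction"]] += 1
>>>         return per_item
>>>
>>>     counts = aggregate_reactions([
>>>         {"item_id": "note-123", "reaction": "angry"},
>>>         {"item_id": "note-123", "reaction": "sad"},
>>>         {"item_id": "note-123", "reaction": "angry"},
>>>     ])
>>>     # counts["note-123"]["angry"] == 2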
>>>
>>>     With unfolding advancements to AI, granular reactions to content
>>>     could be explained and predicted. This could enhance tools for
>>>     moderators such as automatically displaying warnings atop some
>>>     content for end-users.
>>>
>>>
>>>       End-user-specified Preferences
>>>
>>>     Some moderators might want to allow their end-users to express
>>>     content-related preferences, to allow end-users to directly or
>>>     indirectly create, modify, and delete
>>>     multimedia-content-scanning rules.
>>>
>>>     End-users' preferences can be complex; there would be some
>>>     complex concepts or categories for machine-learning-based
>>>     approaches to learn through the processing of items and
>>>     end-users' responses to them. Consider, for instance, the
>>>     following pseudocode for two rules, which end-users might
>>>     express through a number of techniques, including natural
>>>     language: "if an item is predicted to make me sad or angry,
>>>     unless it is for activism or charity, I want a warning atop it."
>>>
>>>     *Rule 1:*
>>>
>>>     if(prediction(me, item, 'sad'))
>>>     {
>>>       if(!(metadata(item, 'activism') || metadata(item, 'charity')))
>>>       {
>>>           return warning.content(item, 'sad');
>>>       }
>>>     }
>>>     return warning.none;
>>>
>>>     *Rule 2:*
>>>
>>>     if(prediction(me, item, 'angry'))
>>>     {
>>>       if(!(metadata(item, 'activism') || metadata(item, 'charity')))
>>>       {
>>>           return warning.content(item, 'angry');
>>>       }
>>>     }
>>>     return warning.none;
>>>
>>>     Note that multiple warnings could be aggregated and displayed
>>>     atop content items.
>>>
>>>     AI systems, i.e., large language models, could process
>>>     end-users' natural-language expressions of content-related
>>>     preferences into multimedia-content-scanning rules.
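>>>
>>>     As a rough sketch (the JSON rule schema and the complete()
>>>     callable standing in for whichever language model is used are
>>>     hypothetical), that translation step might look like:
>>>
>>>     # Sketch: ask a language model to turn a natural-language
>>>     # preference into a rule with a small, fixed JSON schema, then
>>>     # validate the result before it is ever applied.
>>>     import json
>>>
>>>     PROMPT = (
>>>         "Rewrite the user's preference as JSON with exactly "
>>>         'these keys: "emotions" (list of strings), '
>>>         '"exempt_topics" (list of strings), and "action" '
>>>         '(one of "warn", "hide", "none").\n\n'
>>>         "Preference: {preference}"
>>>     )
>>>
>>>     def preference_to_rule(preference, complete):
>>>         # complete is any callable taking a prompt string and
>>>         # returning the model's text output.
>>>         raw = complete(PROMPT.format(preference=preference))
>>>         rule = json.loads(raw)
>>>         assert set(rule) == {"emotions", "exempt_topics", "action"}
>>>         assert rule["action"] in {"warn", "hide", "none"}
>>>         return rule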
>>>
>>>     In theory, end-users' preferences could be collected,
>>>     aggregated, and processed. These data would not identify end-users.
>>>
>>>
>>>       Conclusion
>>>
>>>     Existing and new standards and recommendations can enable
>>>     indicated technology scenarios for better informing and
>>>     equipping distributed social media administrators and moderators.
>>>
>>>     It would be great if, after some amount of delay, these kinds of
>>>     data, indicated above, could be made available to the scientific
>>>     community and to the public.
>>>
>>>     Thank you. Any comments, questions, or thoughts on these ideas?
>>>
>>>
>>>     Best regards,
>>>     Adam Sobieski
>>>
>>>     P.S.: See also:
>>>
>>>       * https://github.com/swicg/general/issues/34
>>>         <https://github.com/swicg/general/issues/34>
>>>       * https://github.com/swicg/general/issues/7
>>>         <https://github.com/swicg/general/issues/7>
>>>       * https://github.com/w3c/activitypub/issues/231
>>>         <https://github.com/w3c/activitypub/issues/231>
>>>       * https://github.com/w3c/activitypub/issues/232
>>>         <https://github.com/w3c/activitypub/issues/232>
>>>
>>>
>>>
>>
>> <publickey - jamesg@jamesg.blog - 0xC06B40B5.asc>

Received on Tuesday, 28 November 2023 15:39:51 UTC