Re: AI catfishing [was Re: ChatGPT and ontologies] from Dave Raggett on 2023-02-18 (semantic-web@w3.org from February 2023)

From: Dave Raggett <dsr@w3.org>
Date: Sat, 18 Feb 2023 12:20:32 +0000
To: David Booth <david@dbooth.org>
Cc: semantic-web@w3.org
Message-Id: <700EE335-8AAD-4F1E-ABF3-769A89E55722@w3.org>

On the optimistic side, if AI can be used to generate harmful information, it can also be used to detect harmful information, whether it was written by a human or an AI.  In this, I am inspired by recent work where a large language model was used to assess bias in a response. This was then used to further train the language model to reduce the level of bias in its responses.  A similar approach could be used for fact checking.  However, a social network might be unenthusiastic if such approaches dampen the firestorms that drive engagement with their network.

> On 17 Feb 2023, at 22:44, David Booth <david@dbooth.org> wrote:
> 
> On 2/17/23 11:34, Hugh Glaser wrote:
> > I disagree, David.
> > The Spam-fighting arms race is an example of huge success on the
> > part of the defenders.
> 
> Very good point.  I guess I didn't adequately qualify my spam comparison.  Spam fighting has had a lot of success, however:
> 
> - Spam is generally trying to get you to click on an easily identifiable link, or selling a very specific product.  That's inherently MUCH easier to detect than deciding whether a message was written by a human vs a bot (as Patrick Logan also pointed out).
> 
> - Spam fighters are MUCH better funded than your random spammer.  Think Google.  By contrast, AI-generated influence messages -- including harmful disinformation -- will come from well-funded organizations/adversaries.
> 
> - When one spam message gets through the spam filters, it generally causes very little harm -- a minor annoyance.  But if one AI-generated spear phishing campaign succeeds, or if an AI-generated propaganda campaign succeeds, the consequences can be grave.
> 
> So although spam fighting has had success, I don't see that success carrying over to distinguishing AI-generated content from human-generated content.  I think the continuing failure of big social media companies (think Facebook and Twitter) to automatically distinguish human posts from bot posts is already evidence of how hard bot-generated content is to detect.  As AI improves I only expect the problem to get worse, because a well-funded adversary has two inherent advantages:
> 
> - When it is so cheap to generate fake content, even if only a small fraction gets past the fake-detection filters, that can still be a large quantity, and still harmful; and
> 
> - Defenders will always be one step behind, as the generators continually find new ways to slip past the detection filters.
> 
> So I guess I'm more in the Cassandra camp than the Pollyanna camp.
> 
> Best wishes,
> David Booth
> 

Dave Raggett <dsr@w3.org>

Received on Saturday, 18 February 2023 12:20:47 UTC