Re: AI catfishing [was Re: ChatGPT and ontologies] from Dan Brickley on 2023-02-18 (semantic-web@w3.org from February 2023)

From: Dan Brickley <danbri@danbri.org>
Date: Sat, 18 Feb 2023 16:16:10 +0000
To: Hugh Glaser <hugh@glasers.org>
Cc: David Booth <david@dbooth.org>, semantic-web@w3.org
Message-ID: <CAFfrAFo65KHqXsM6_QZamBh7-fMFWo+C-iZFMLVjAEHwcchNnA@mail.gmail.com>
It has been tried, implemented, debated, critiqued etc! Even OpenAI had a
tool.

https://openai.com/blog/new-ai-classifier-for-indicating-ai-written-text/


It is a tougher problem with text generation than with images, since the
latter has much more scope for steganography and embedded metadata. Plus you
are unlikely to know whether you have made an “AI” detector or merely a
detector for one particular GPT-x + training set + finetuning regime +
prompted context. And since anything from a Ouija board to a
trillion-dollar corporation can be seen as an AI, the goal here isn’t
particularly clear.

In my personal view this route is a terrible folly to follow.

Tooling using LLMs or better will be used as an aid for people with
challenges like having to work and apply for jobs in a 2nd, 3rd or 4th
language, or cognitive issues, dyslexia, or advanced summarisation for
blind users who are sick of slogging through giant verbose documents in
search of a simple claim or two. It would help nobody to stigmatize such
use.

LLMs can also help with writing in ways that do not directly create text.
E.g. I had one write me a project plan for turning my silly movie script idea
into a movie. The use of LLMs encourages task decomposition in a very
“rubber duck programming” sort of a way.

Detecting text or ideas that passed through an LLM at some point (e.g. next
year) will be as hopeless as detecting text that has passed at some point
through Bluetooth, or speech-to-text, or spell checking, or cleartext HTTP.

Dan

On Sat, 18 Feb 2023 at 12:23, Hugh Glaser <hugh@glasers.org> wrote:

> Thanks David,
>
> Then I guess the answer to my question is “No, no-one here knows anyone
> who has tried using LLMs such as GPT-3 to find out if text is human- or
> machine-generated”.
>
> FWIW, I was thinking about at least a couple of ways of doing it.
> Firstly, systems could be directly trained; I think many people have been
> surprised at how functional the LLMs have been - maybe people would be
> surprised at how functional such detectors could be; I think this is like
> carrying forward spam detection-like processes.
> Secondly, the normal GPT-3s etc., could be used as-is, fed prompts of
> material, and asked if the author is human. This is the sort of thing I was
> thinking of; and improvements in generating tech would then be naturally
> tracked by the consequent improvement in detection.
>
> Interestingly, I see the launch of these LLMs as something of a
> singularity.
> In a few years (months?) it will be interesting to try and find large
> training sets where you are confident that you know whether the authors are
> human or LLM.
> I can see an ouroboros of LLMs being trained on each other, and even
> themselves.
> Training sets that predate this, or are guaranteed one or the other, while
> being sufficiently large, will be valuable.
>
> BTW.
> Spam detection is not the preserve of big, well-funded business - some of
> the best stuff is very much not from those sources.
>
> Cheers
> Hugh
>
> > On 17 Feb 2023, at 22:44, David Booth <david@dbooth.org> wrote:
> >
> > On 2/17/23 11:34, Hugh Glaser wrote:
> > > I disagree, David.
> > > The Spam-fighting arms race is an example of huge success on the
> > > part of the defenders.
> >
> > Very good point.  I guess I didn't adequately qualify my spam
> > comparison.  Spam fighting has had a lot of success, however:
> >
> > - Spam is generally trying to get you to click on an easily identifiable
> > link, or selling a very specific product.  That's inherently MUCH easier to
> > detect than deciding whether a message was written by a human vs a bot (as
> > Patrick Logan also pointed out).
> >
> > - Spam-fighting is MUCH better funded than your random spammer.  Think
> > Google.  AI-generated influence messages -- including harmful
> > disinformation -- will come from well funded organizations/adversaries.
> >
> > - When one spam message gets through the spam filters, it generally
> > causes very little harm -- a minor annoyance.  But if one AI-generated
> > spear phishing campaign succeeds, or if an AI-generated propaganda campaign
> > succeeds, the consequences can be grave.
> >
> > So although spam fighting has had success, I don't see that success
> > carrying over to distinguishing AI-generated content from human-generated
> > content.  I think the continuing failure of big social media companies
> > (think Facebook and Twitter) to automatically distinguish human posts from
> > bot posts is already evidence of how hard it is to detect.  As AI improves
> > I only expect the problem to get worse, because a well-funded adversary has
> > two inherent advantages:
> >
> > - When it is so cheap to generate fake content, even if only a small
> > fraction gets past the fake-detection filters, that can still be a large
> > quantity, and still harmful; and
> >
> > - Defenders will always be one step behind, as the generators
> > continually find new ways to slip past the detection filters.
> >
> > So I guess I'm more in the Cassandra camp than the Pollyanna camp.
> >
> > Best wishes,
> > David Booth
> >
>
>
>
Received on Saturday, 18 February 2023 16:16:35 UTC