- From: Timothy Holborn <timothy.holborn@gmail.com>
- Date: Wed, 23 Nov 2022 21:37:51 +1000
- To: Dave Raggett <dsr@w3.org>
- Cc: Public-cogai <public-cogai@w3.org>
- Message-ID: <CAM1Sok0N=Fqvf3vOX=t-YGUUkGwmfXROXmNywCGn1Zsjs6Uirw@mail.gmail.com>
On Wed, 23 Nov 2022 at 20:00, Dave Raggett <dsr@w3.org> wrote:

> A number of large language models have recently been announced that claim to incorporate reasoning:
>
> Meta's Galactica [1] is a family of large language models trained on scientific texts, see the Galactica Explorer [2]. The website is full of hype, e.g. claiming to support reasoning, and the project has had uniformly bad reviews, e.g. "Is this really what AI has come to, automatically mixing reality with bullshit so finely we can no longer recognize the difference?" and “What bothers me so much about Facebook’s Galactica … is that it pretends to be a portal to knowledge … Actually it’s just a random bullshit generator.”, see the post by Alberto Romero [3].
>
> That matches my expectations, as large language models and image generators are designed to stochastically generate plausible output following the statistics of the style selected by the prompt. The authors claim that Galactica does better than other large language models at mathematical reasoning, with the exception of Minerva. Galactica is also positioned as a scientifically literate search engine, but is let down by its tendency to generate bogus text that appears authentic and highly confident.
>
> Google’s Minerva [4] is built on top of a large language model (Google PaLM) that was further trained on technical datasets. It correctly answers around a third of undergraduate-level problems involving quantitative reasoning. However, it lacks a means to verify the correctness of the proposed solutions, as it is limited to intuitive reasoning.
>
> It works best when the prompt is given as one or more questions plus worked answers, followed by the question for Minerva to answer. Google refers to this as chain of thought prompting. This presumably provides semantic priming on the desired style of answer, analogous to keywords such as "anime, Ghibli style" for image generators like Stable Diffusion. Minerva demonstrates an ability to transform mathematical expressions from step to step, along with being able to carry out basic arithmetic operations.
>
> I think it is time to abandon the idiom of statistically generating text continuations to a prompt, and to instead focus on sequential deliberative reasoning that is open to introspection. One potential way forward is to enable sequential operations on latent semantics as obtained by applying large language models to text utterances. This relates to the sequence-to-sequence models used for language translation, in respect of being used for mapping the latent semantics to a symbolic language that can be used to describe operations and their results.
>
> The activation levels for the neurons in the upper layers of the artificial neural network, for the large language model, correspond to working memory. This is initialised by a text prompt. A sequential rule engine then manipulates working memory via a second network model, before generating the text output that corresponds to the updated latent semantics. I haven’t implemented this as yet, and would like to collaborate with other people on this. The DistilBERT large language model [5] is quite modest in size (e.g. 110 million parameters for the distilled base version of BERT), and as such avoids the need for the huge computing platforms available to well-resourced companies.
>
> Anyone interested?

Yup.
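For concreteness, here is a minimal, hedged sketch of the working-memory idea described above, assuming the Hugging Face transformers library and the distilbert-base-uncased checkpoint from [5]; it is an illustration, not the proposal's implementation, which Dave notes does not yet exist. It merely exposes DistilBERT's top-layer activations as a 'working memory' tensor and applies a placeholder operation, standing in for one step of the sequential rule engine:

# Sketch only: treat DistilBERT's top-layer activations as "working memory"
# that a sequential rule engine could operate on.
# Assumes: pip install torch transformers
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModel.from_pretrained("distilbert-base-uncased")
model.eval()

def working_memory(prompt: str) -> torch.Tensor:
    """Return the top-layer token activations for a prompt (its latent semantics)."""
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    return outputs.last_hidden_state  # shape: (1, n_tokens, 768)

def apply_rule(memory: torch.Tensor) -> torch.Tensor:
    """Placeholder for one step of a rule engine.

    A real engine would map these vectors to a symbolic language, apply an
    operation, and map the result back; here the vectors are only normalised
    to show where such a step would sit in the loop.
    """
    return torch.nn.functional.normalize(memory, dim=-1)

memory = working_memory("Socrates is a man. All men are mortal.")
memory = apply_rule(memory)
print(memory.shape)  # (1, n_tokens, 768)

Decoding the updated latent state back into text would need a separate decoder (e.g. a sequence-to-sequence head), which is precisely the part that remains open for collaboration.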
But I'm still researching and collecting software library links [6] to support a stack [7], alongside the work involved in defining what I'm generally talking about creating [8]. Lots to do! [9]

I have an enormous amount of respect for your works, contributions and heritage, and for the way they may fit into such a stack, whilst noting the present-day (and future) works here. I'm also really interested in gaining a better understanding of how some of the perhaps superfluous, older libraries may not be the best choice, and of what may be better defined into some sort of new 'AI OS' that's basically driven via web interfaces...

The big shift I've effectively brought about is the idea of not seeking to put yourself into the 'metaverse' (or someone else's cyber environment, etc.) as a digital twin, but rather to create new forms of 'artificial minds' that we actually want, along with consideration of the tooling we need to make them operate 'safely'... Here's a list [10]; and I assume we're not focused on tooling for Skynet, or at least not alone...

The modelling I'm doing and the initiative I'm creating (a new class of computing?) SHOULD provide an advanced 'workstation' (home / advanced STEM / etc.) or 'server' (although that's hard to define properly at the moment) with a methodology for producing things like 'knowledge cartridges': not uploaded into a person's brain [11] (although I hear that's not necessarily impossible within the foreseeable future), but rather into your webizen. Moreover, in my opinion, it gives a platform that furnishes an opportunity to have a more meaningful conversation about AI ethics [12], which I suspect will be instrumental to this field of endeavour... *a conversation for which I was not aware of any good answers prior to working on redefining the concept of webizen (not 'netizen' or 'blockheads' or whatever).*

Good to see your post; I will dig into the links. FWIW, it would also be good to create a chat-bot example, and I could use your help to figure out how to do it well. Ideally it could run on Jekyll until I've got a Solid instance up and running, although I'm not sure Solid has an auth method using both WebID-TLS (devices, i.e. 23 Nov 2022 [13]) and WebID-OIDC (users / personas, 21 Jan 2015 [14]) so as to support the intended semantic structures, whilst noting that I have had some confusion about whether FOAF is a protocol or an ontology, etc. Anyhow, I stumbled across writings on http://dig.csail.mit.edu/ via archive.org, which led me to question why WebID only talks about FOAF... but that's all a bit off topic.

Similarly perhaps, Stephen Grossberg [15] (via the Google group on the science of consciousness) has encouraged me to read his book (or what he calls his 'Magnum Opus') [16], which I haven't had time to do yet, so I found and added some video interviews with him to one of my playlists [17].

Yet the 'big shift' in my mind has been this migration away from making 'human self' supportive systems, like a means to form a prosthetic transposure of self, towards the idea of wilfully and intentionally designing 'robots' (AI) not to be 'human', or mouse, or elephant, or dog, or whatever, but rather to design them, with intent, to be the first class of 'artificial species' intentionally designed by mankind... whilst understanding that scientists have, in effect, been engineering plants and animals (e.g. cross-breeding dogs and other animals), and all sorts of things that happen in the world of flora (and also viruses, etc.).
But in this realm it's a different sort of thing, which in turn solves my 'OWL problem' [18]; that ended up being about tool lock-in more than anything else, but the answers were, moreover, absent.

Best wishes,

Timothy Holborn

> [1] https://galactica.org/static/paper.pdf
> [2] https://galactica.org/explore/
> [3] https://towardsdatascience.com/galactica-what-dangerous-ai-looks-like-f31366438ca6
> [4] https://minerva-demo.github.io/#category=Algebra&index=1
> [5] https://huggingface.co/distilbert-base-uncased

[6] https://docs.google.com/spreadsheets/d/1rqYC2E2BDIHBADAT7-9CabawkmYBJpBBf1KJO24D7ig/edit#gid=404000800
[7] https://docs.google.com/presentation/d/1Soo3Rmk0jzEVgj4dl8F9P7RaHEC-cy8auk8N0QSC9fs/edit#slide=id.g19d62f6f1a7_0_93
[8] https://docs.google.com/presentation/d/1Soo3Rmk0jzEVgj4dl8F9P7RaHEC-cy8auk8N0QSC9fs/edit?usp=sharing
[9] https://github.com/WebCivics/webizen.org-dev
[10] https://docs.google.com/spreadsheets/d/1rqYC2E2BDIHBADAT7-9CabawkmYBJpBBf1KJO24D7ig/edit#gid=1503872436
[11] https://www.youtube.com/watch?v=w_8NsPQBdV0
[12] https://drive.google.com/drive/folders/1uwGax8GvZA2jzJ_UFIoYppijZX4vDsoL
[13] http://mediaprophet.org/ux_KB/page4115294.html
[14] http://dev.webcivics.org/ (nb: use dummy data to get past the form)
[15] https://en.wikipedia.org/wiki/Stephen_Grossberg
[16] https://www.amazon.com/Conscious-Mind-Resonant-Brain-Makes/dp/0190070552
[17] https://www.youtube.com/playlist?list=PLCbmz0VSZ_voTpRK9-o5RksERak4kOL40
[18] https://lists.w3.org/Archives/Public/public-cogai/2022Sep/
Received on Wednesday, 23 November 2022 11:38:46 UTC