- From: Timothy Holborn <timothy.holborn@gmail.com>
- Date: Tue, 12 Mar 2024 02:56:38 +1000
- To: Dave Raggett <dsr@w3.org>
- Cc: public-cogai <public-cogai@w3.org>
- Message-ID: <CAM1Sok27SOdJiTwtMUKnbi9x7nSkeOrbja+YGwS33k6NWpE8rw@mail.gmail.com>
https://g.co/gemini/share/f5e773916b42

On Tue, 12 Mar 2024 at 02:46, Dave Raggett <dsr@w3.org> wrote:

> This is a progress report on work I am doing on cognitive agents as collections of cognitive modules. Today’s large language models involve a stack of Transformers wedged between outer layers that deal with embeddings for tokens and positional information on the input side, and predicting the next token on the output side, where tokens are words, characters or some intermediary.
>
> The brain is highly modular and it makes sense to explore a modular approach to artificial neural networks.
>
> The modules all operate on working memory, which I will treat as a vector that holds the latent semantics. This can be extended to a matrix for visual concepts. A vector is pretty flexible in that it can represent a single axis in a basis set, e.g. a given word, or a superposition of states, e.g. locations in a three-dimensional space, or a chunk of name/value pairs or labelled directed edges in a graph.
>
> The richer representations yield noisy results when accessed, necessitating denoising. I don't understand how this works in practice in current language models! However, the effectiveness of large language models, and text-to-image models, is evidence that it works well enough. Transformers lack the expressive power to properly handle checks on parity, matching nested brackets, etc., but given enough layers they manage to do a good enough job on human language and programming scripts.
>
> - *Encoder*
>
>   This takes a sequence of tokens and constructs the latent semantics they imply. This involves self-attention and transformation.
>
> - *Decoder*
>
>   This generates a sequence of tokens from the latent semantics. The decoder updates the latent semantics with positional information as each token is generated. This involves self-attention and transformation.
>
> - *Reasoner*
>
>   This is a feedforward network that takes the latent semantics as its input, and provides an update to the latent semantics as its output. This is equivalent to a production rule engine for rules with a conjunction of conditions and a sequence of actions. Actions can also invoke functions on memory and external modules. Actions thus need to be able to describe which module they apply to, e.g. to trigger the decoder to output some text.
>
> - *Memory*
>
>   This is a vector database that uses a vector as a query, and then updates the latent semantics based upon the best match found in the database. Recall is stochastic based upon similarity and activation levels, where the level decays over time, but is boosted upon access. The module further supports updates to existing vectors, as well as adding and deleting vectors. The query thus needs to be accompanied by the requested operation.
>
> Generative language models learn the semantics in order to predict the next word. Simply learning to regenerate the input text will fail to learn the semantics. However, the generative approach on natural language further requires the model to have lots of everyday knowledge, which necessitates a very large training dataset. Is there another way?
>
> It should be feasible to train a system that integrates the above modules using a relatively modest dataset with restricted language and semantics. The idea is to synthesise a dataset with taxonomic knowledge, basic logic and sets, causal knowledge and temporal relations, as well as simple arithmetic. I am working on how to create the dataset using a script.
>
> p.s. one of the challenges I am seeking help with is a means to collapse a superposition of states to a single state when the vocabulary is not predetermined. This would allow a language model to generate a sequence of words, concurrently with another module that maps these words into characters or phonemes.
>
> Dave Raggett <dsr@w3.org>
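The report treats working memory as a vector that can hold a single basis axis, a superposition of states, or a chunk of name/value pairs, with noisy recall that then needs denoising. One way to make that concrete is a vector-symbolic encoding; the random bipolar vectors, multiplication-based binding and codebook clean-up below are illustrative assumptions, not anything specified in the report.

# Minimal sketch: a "chunk" of name/value pairs as a superposition of bound
# role/filler vectors, with noisy recall that needs a clean-up (denoising)
# step.  Binding here is elementwise multiplication over random bipolar
# vectors; this is one illustrative choice, not the report's scheme.
import numpy as np

D = 4096                                    # dimensionality of the latent vector
rng = np.random.default_rng(0)

def rand_vec():
    return rng.choice([-1.0, 1.0], size=D)  # random bipolar code vector

# codebook of known symbols (roles and fillers)
codebook = {name: rand_vec() for name in
            ["colour", "size", "red", "large", "isa", "animal"]}

def bind(a, b):          # role (*) filler
    return a * b

def bundle(*vs):         # superposition of several bound pairs
    return np.sum(vs, axis=0)

def cleanup(noisy):      # denoise by nearest neighbour in the codebook
    sims = {n: np.dot(noisy, v) / (np.linalg.norm(noisy) * np.linalg.norm(v))
            for n, v in codebook.items()}
    return max(sims, key=sims.get)

# chunk: {isa: animal, colour: red, size: large}
chunk = bundle(bind(codebook["isa"], codebook["animal"]),
               bind(codebook["colour"], codebook["red"]),
               bind(codebook["size"], codebook["large"]))

# querying the chunk for its colour returns a *noisy* version of "red"
# (crosstalk from the other pairs), hence the clean-up step
noisy_filler = bind(chunk, codebook["colour"])
print(cleanup(noisy_filler))                # -> "red"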
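For the Encoder module, a minimal sketch of mapping a token sequence, via embeddings plus self-attention, to a single working-memory vector. Mean-pooling the final layer into one latent vector and the layer sizes are assumptions; the report does not say how the sequence is reduced to the latent semantics.

# Encoder sketch: token + positional embeddings, self-attention layers,
# pooled to one latent vector that stands for the working memory.
import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, vocab_size=1000, d_latent=512, n_layers=2, n_heads=8):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_latent)
        self.pos = nn.Embedding(512, d_latent)          # positional information
        layer = nn.TransformerEncoderLayer(d_latent, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)

    def forward(self, tokens):                          # tokens: (batch, seq)
        pos = torch.arange(tokens.size(1), device=tokens.device)
        x = self.embed(tokens) + self.pos(pos)          # token + positional embeddings
        x = self.encoder(x)                             # self-attention + transformation
        return x.mean(dim=1)                            # pool to one latent vector

# usage: a batch of one ten-token sentence -> a single 512-d latent vector
latent = Encoder()(torch.randint(0, 1000, (1, 10)))
print(latent.shape)                                     # torch.Size([1, 512])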
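For the Reasoner, a sketch of a feedforward network that reads the latent vector, proposes an update to it, and emits an action addressed to another module, e.g. triggering the decoder to produce output. The two-layer MLP, the residual update and the discrete action head are design guesses for illustration only.

# Reasoner sketch: feedforward update over the latent semantics plus an
# action code that names the module the action applies to.
import torch
import torch.nn as nn

MODULES = ["none", "decoder", "memory"]      # hypothetical action targets

class Reasoner(nn.Module):
    def __init__(self, d_latent=512, d_hidden=1024):
        super().__init__()
        self.rule_net = nn.Sequential(       # plays the role of condition matching
            nn.Linear(d_latent, d_hidden),
            nn.ReLU(),
            nn.Linear(d_hidden, d_latent),   # proposed update to the latent vector
        )
        self.action_head = nn.Linear(d_latent, len(MODULES))

    def forward(self, latent):
        new_latent = latent + self.rule_net(latent)     # apply the update
        action_logits = self.action_head(new_latent)
        target = MODULES[action_logits.argmax(dim=-1).item()]
        return new_latent, target            # e.g. "decoder" -> emit some text

# usage
reasoner = Reasoner()
wm = torch.randn(1, 512)                     # working memory (latent semantics)
wm, target = reasoner(wm)
print(target)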
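For the Memory module, a sketch of a small vector store queried with a vector plus a requested operation. Recall is stochastic, weighted by similarity and by an activation level that decays over time and is boosted on access; add, update and delete are also supported. The exponential decay and softmax sampling are assumed details, not the report's specification.

# Memory sketch: similarity- and activation-weighted stochastic recall.
import numpy as np

class VectorMemory:
    def __init__(self, decay=0.99, temperature=0.1, rng=None):
        self.vectors = []        # stored latent vectors
        self.activation = []     # one activation level per vector
        self.decay = decay
        self.temperature = temperature
        self.rng = rng or np.random.default_rng()

    def step(self):
        """Activation decays over time."""
        self.activation = [a * self.decay for a in self.activation]

    def add(self, v):
        self.vectors.append(np.asarray(v, dtype=float))
        self.activation.append(1.0)

    def query(self, q):
        """Stochastic recall: sample an entry by similarity * activation."""
        q = np.asarray(q, dtype=float)
        sims = np.array([np.dot(q, v) / (np.linalg.norm(q) * np.linalg.norm(v))
                         for v in self.vectors])
        scores = sims * np.array(self.activation)
        probs = np.exp(scores / self.temperature)
        probs /= probs.sum()
        i = self.rng.choice(len(self.vectors), p=probs)
        self.activation[i] += 1.0            # access boosts activation
        return self.vectors[i]

    def update(self, q, v):
        """Overwrite the best-matching entry with a new vector."""
        sims = [np.dot(q, w) for w in self.vectors]
        self.vectors[int(np.argmax(sims))] = np.asarray(v, dtype=float)

    def delete(self, q):
        """Remove the best-matching entry and its activation level."""
        sims = [np.dot(q, w) for w in self.vectors]
        i = int(np.argmax(sims))
        del self.vectors[i]
        del self.activation[i]

# usage: add two vectors, let time pass, then query
mem = VectorMemory(rng=np.random.default_rng(0))
mem.add(np.array([1.0, 0.0]))
mem.add(np.array([0.0, 1.0]))
mem.step()
print(mem.query(np.array([0.9, 0.1])))       # usually recalls the first vector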
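On synthesising a modest restricted-language dataset covering taxonomic knowledge, logic and sets, causal and temporal relations, and simple arithmetic, a template-based sketch; the templates and toy facts are invented for illustration and are not the script mentioned in the report.

# Dataset-synthesis sketch: generate restricted-language training sentences
# from a handful of templates over small fact tables.
import random

TAXONOMY = {"dog": "mammal", "cat": "mammal", "sparrow": "bird", "mammal": "animal"}
CAUSES = [("rain", "the ground gets wet"), ("fire", "smoke appears")]

def taxonomic_example():
    kind, parent = random.choice(list(TAXONOMY.items()))
    return f"every {kind} is a {parent}."

def causal_example():
    cause, effect = random.choice(CAUSES)
    return f"{cause} causes {effect}. after {cause}, {effect}."

def arithmetic_example():
    a, b = random.randint(0, 9), random.randint(0, 9)
    return f"{a} plus {b} is {a + b}."

def synthesise(n):
    makers = [taxonomic_example, causal_example, arithmetic_example]
    return [random.choice(makers)() for _ in range(n)]

if __name__ == "__main__":
    for line in synthesise(5):
        print(line)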
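On the p.s. about collapsing a superposition of states to a single state when the vocabulary is not predetermined: that remains the open question, and nothing below answers it, but one naive starting point is to sample among whatever candidate state vectors have been seen so far and to admit a new state when nothing matches well enough. The similarity threshold and softmax sampling are arbitrary placeholders.

# Naive open-vocabulary "collapse": sample an existing candidate state by
# similarity, or register the superposition as a brand-new state.
import numpy as np

def collapse(superposition, candidates, rng, threshold=0.3, temperature=0.1):
    """Return (index, vector); may extend `candidates` with a new state."""
    s = superposition / np.linalg.norm(superposition)
    if candidates:
        sims = np.array([np.dot(s, c) / np.linalg.norm(c) for c in candidates])
        if sims.max() >= threshold:
            probs = np.exp(sims / temperature)
            probs /= probs.sum()
            i = int(rng.choice(len(candidates), p=probs))
            return i, candidates[i]
    candidates.append(s.copy())              # open vocabulary: admit a new state
    return len(candidates) - 1, candidates[-1]

Sampling rather than taking the argmax keeps the collapse stochastic, and the threshold stands in for whatever criterion would decide that a genuinely new word or phoneme has appeared.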
Received on Monday, 11 March 2024 16:57:21 UTC