PyTorch, embeddings and Transformers ...

On a more technical note, I am coming up to speed with the PyTorch library for deep learning. I want to explore neural network architectures that are inspired by what we know about human cognition, given that the feedforward architecture used for large language models is far from how the human brain works.

There are lots of feedback connections in the brain, and I want to see how to apply that idea to artificial neural networks. The phonological loop holds just 2-3 seconds of speech, and evidence from eye movements during reading and other experimental data shows that we process language sequentially, hierarchically and predictively. This is in contrast to large language models, which process many thousands of words in parallel in a strictly feedforward pass.

I therefore want to experiment with neural networks whose text input is limited to just a few words at a time. To provide the context needed for understanding, I will use feedback connections from deeper network layers, with the feedback driven by values retained from the previous step in the sequential processing model.
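
As a rough illustration of what I have in mind, here is a minimal PyTorch sketch. The names (FeedbackNet, step, the feedback tensor, and so on) are purely illustrative, not an existing API, and the layer sizes are arbitrary:

import torch
import torch.nn as nn

class FeedbackNet(nn.Module):
    """Processes a few tokens at a time; deep-layer activations from
    the previous step are fed back as context for the next step."""

    def __init__(self, vocab_size, embed_dim=64, deep_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # The shallow layer sees the current tokens plus fed-back context.
        self.shallow = nn.Linear(embed_dim + deep_dim, deep_dim)
        # The deep layer, whose output is retained as working memory.
        self.deep = nn.Linear(deep_dim, deep_dim)

    def step(self, token_ids, feedback):
        # token_ids: (batch, n_tokens) - just a few words at a time.
        x = self.embed(token_ids).mean(dim=1)        # (batch, embed_dim)
        h = torch.tanh(self.shallow(torch.cat([x, feedback], dim=-1)))
        new_feedback = torch.tanh(self.deep(h))      # retained for next step
        return new_feedback

model = FeedbackNet(vocab_size=10000)
feedback = torch.zeros(1, 128)                       # empty working memory
for chunk in [[12, 7, 99], [3, 41, 8]]:              # a few words per step
    ids = torch.tensor([chunk])
    feedback = model.step(ids, feedback)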

This approach treats the deep layers as working memory. It further supports Type 2 processing, i.e. sequential, deliberative thought, and will allow a network to continue cognition in the absence of text input or output, something useful for robots. I also want to look into how to integrate episodic and encyclopaedic memory to give the agent a sense of the past, present and future. This involves bridging two different neural networks, one focusing on cognition and the other on memory, acting like a journal/diary.
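
Continuing the sketch above, cognition without new text input could be approximated by stepping the model on a neutral placeholder token while journaling the working-memory state. Again, this is only an illustration of the idea; the placeholder id and the journal are assumptions of mine:

# Continue internal processing with no new text: feed a padding token
# (here assumed to be id 0) and keep updating the working memory.
episodic_memory = []                       # journal/diary of past states
pad = torch.zeros(1, 1, dtype=torch.long)  # placeholder "no input" token
for _ in range(5):
    feedback = model.step(pad, feedback)
    episodic_memory.append(feedback.detach().clone())  # record the "moment"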

PyTorch is a well thought out library that makes it easy to work with tensors. However, I still have some way to go before I understand enough about text embeddings and Transformers to proceed with implementing new neural network architectures. Is anyone on this list willing to answer a few questions on PyTorch and LLMs?
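
To show the level I am at: I follow the basic mechanics of an embedding layer, which maps each token id to a learned dense vector (the ids below are made up):

import torch
import torch.nn as nn

embed = nn.Embedding(num_embeddings=10000, embedding_dim=64)
ids = torch.tensor([[12, 7, 99]])   # arbitrary token ids, batch of 1
vectors = embed(ids)
print(vectors.shape)                # torch.Size([1, 3, 64])

What I am less sure about is how these interact with positional encoding and attention inside a Transformer.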

Many thanks,

Dave Raggett <dsr@w3.org>

Received on Wednesday, 8 November 2023 09:46:36 UTC