Re: Vector embeddings for production rules

Hi Matteo,

Send me the details of the typo in the README and I will fix it.

Happy to speak to you in the new year. I am pretty busy in January, so sometime in February would be better.

With respect to the use of vector embeddings for production rules, I’ve found a few references that look very relevant.
Embedding logical queries on knowledge graphs <https://proceedings.neurips.cc/paper_files/paper/2018/file/ef50c335cca9f340bde656363ebd02fd-Paper.pdf>, 2018, Hamilton et al.

This covers logical queries and edge prediction over graph embeddings in low-dimensional spaces. Conjunctive queries are mapped to embeddings and matched against the graph using approximate nearest neighbour search.

Knowledge Graph Embedding: A Survey from the Perspective of Representation Spaces <https://arxiv.org/pdf/2211.03536.pdf>, 2023, Jiahang Cao et al.

This survey explores the advantages of different mathematical spaces in different scenarios and the reasons behind them.

Approximate nearest neighbors: towards removing the curse of dimensionality <https://www.theoryofcomputing.org/articles/v008a014/v008a014.pdf>, 1998, Indyk and Motwani.

The means to compute embeddings for conjunctive queries is just what is needed for indexing simple production rules, with stochastic selection of matching rules via approximate nearest neighbour search. The survey of knowledge graph embeddings also looks useful for episodic and encyclopaedic memory, e.g. using different embeddings for entities and relations.
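To make the rule-indexing idea concrete, here is a minimal Python sketch of what indexing and stochastic rule selection could look like. Plain cosine similarity over a small in-memory matrix stands in for a real approximate nearest neighbour index (e.g. FAISS or Annoy), and a rule is sampled from a softmax over similarities rather than picked greedily. The embedding dimension, rule count and temperature are illustrative assumptions, not design decisions.

    # Illustrative sketch only: index rule-condition embeddings and
    # stochastically select a rule matching the current working memory state.
    import numpy as np

    rng = np.random.default_rng(0)
    DIM = 64        # embedding dimension (assumed)
    N_RULES = 100   # number of indexed production rules (assumed)

    # Stand-ins for embeddings of each rule's condition part.
    rule_embeddings = rng.normal(size=(N_RULES, DIM))
    rule_embeddings /= np.linalg.norm(rule_embeddings, axis=1, keepdims=True)

    def select_rule(wm_embedding, temperature=0.1):
        """Return the index of a stochastically selected matching rule."""
        wm = wm_embedding / np.linalg.norm(wm_embedding)
        scores = rule_embeddings @ wm              # cosine similarity per rule
        probs = np.exp(scores / temperature)
        probs /= probs.sum()                       # softmax over similarities
        return int(rng.choice(N_RULES, p=probs))   # sample rather than argmax

    working_memory = rng.normal(size=DIM)          # embedded working memory state
    print("selected rule:", select_rule(working_memory))

In a full implementation the selected rule's action part would then be applied to update working memory, closing the match-select-act cycle.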

In a cognitive agent we would like to balance intuition and deliberation, and for this it is useful to consider the distinction between implicit memory and explicit memory. The former supports intuition and everyday skills, and is the basis for today’s generative AI, built upon gradient descent and training against large datasets. The latter covers the things you remember individually, e.g. your last birthday. Explicit memory can be likened to a database with read, write and update operations.

The implementation of explicit memory should support machine learning, e.g. induction of generalisations and specialisations across instances, including entities and relations. For plausible reasoning we also need soft metadata as a basis for inferences. Explicit memory should further mimic human memory with respect to the forgetting curve, the spacing effect and spreading activation, as these are important for recalling what matters and ignoring what does not.
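To illustrate what mimicking those effects could involve, here is a small Python sketch loosely based on ACT-R's base-level learning and spreading activation: a chunk's activation decays as a power of the time since each past use (giving a forgetting curve, with spaced repetition accumulating), while chunks associated with the current context receive extra spreading activation. The decay rate, association strengths and chunk names below are placeholder assumptions, not a commitment to ACT-R's exact equations.

    # Illustrative sketch only, loosely following ACT-R-style activation.
    import math

    DECAY = 0.5  # base-level decay rate (ACT-R's conventional default)

    def base_level(use_times, now):
        """Activation from past uses; older uses contribute less."""
        return math.log(sum((now - t) ** -DECAY for t in use_times if t < now))

    def spread(associations, context):
        """Extra activation from associated chunks in the current context."""
        weight = 1.0 / max(len(context), 1)
        return sum(weight * associations.get(c, 0.0) for c in context)

    # A chunk used at t=1 and t=20, recalled at t=30, with one active cue.
    activation = base_level([1.0, 20.0], now=30.0) + spread({"birthday": 1.5}, {"birthday"})
    print(f"activation: {activation:.3f}")

Retrieval would then prefer chunks with higher activation, so rarely used and weakly associated items are effectively forgotten.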

This points to plenty of opportunities for experimentation.

Best regards,
    Dave

> On 22 Dec 2023, at 13:45, Matteo Bianchetti <mttbnchtt@gmail.com> wrote:
> 
> Hi Dave, 
> 
> Thanks so much for the email. I took some time to read through the references and study part of the github repository. I would be very happy to help with the project that you describe.
> 
> On another matter, I tried to fix a typo in the README and push a new version in a branch, but got an error message (permission denied). Maybe we can talk about this too whenever you are free. I am happy to meet almost any time (I am in the ET time zone).
> 
> Thanks very much,
> Matteo
> 
> On Mon, 18 Dec 2023 at 10:44, Dave Raggett <dsr@w3.org> wrote:
>> I’ve been studying ideas for implementing Type 1 & 2 cognition based upon vector embeddings for production rules. This is inspired by the role of production rules in symbolic cognitive architectures such as ACT-R and SOAR, as well as this community group's work on chunks & rules.
>> 
>> Some key papers include:
>> 
>> 1) Neural machine translation by jointly learning to align and translate, 2015, Bahdanau, Cho and Bengio, see: https://arxiv.org/abs/1409.0473
>> 
>> 2) Attention is all you need, 2017, Ashish Vaswani et al., see: https://arxiv.org/abs/1706.03762
>>   
>> 3) Neural Production Systems, March 2022, Aniket Didolkar et al., see: https://arxiv.org/pdf/2103.01937.pdf
>> 
>> A production rule system determines which rules match the current state of working memory, stochastically selects the best matching rule, and applies it to update working memory. Rules include variables as a basis for generalisation. An artificial neural network can be designed to learn rules through reinforcement learning.
>> 
>> The first reference above describes how English can be translated to French using a mechanism to determine the soft alignment of the current word with the preceding and following words. The second reference introduces the Transformer, a model of self-attention that can be executed in parallel, and forms the basis for today’s large language models (e.g. ChatGPT), which statistically predict text continuations for a user-supplied prompt. The third reference extends these ideas to show how attention supports the process of matching rule conditions to working memory.
>> 
>> I am hoping to apply aspects of all three papers in new work on applying production rules to Type 2 cognition, i.e. sequential deliberative cognitive steps as a basis for reasoning. This can be thought of as reimplementing chunks & rules in neural networks. This will exploit feed-backward connections for retained state in combination with the feed-forward connections found in existing language models. I am looking forward to implementing experimental versions of these ideas in PyTorch. 
>> 
>> Any offers of help would of course be very welcome!
>> 
>> p.s. this is part of a roadmap for work including natural language processing and continual learning based upon integrating episodic and encyclopaedic memory.
>> 
>> Best regards,
>> 
>> Dave Raggett <dsr@w3.org>
>> 
>> 
>> 

Dave Raggett <dsr@w3.org>

Received on Friday, 22 December 2023 14:59:21 UTC