Re: ChatGPT, ontologies and SPARQL

Dave,

By "bridging", I meant a consideration of both the cognitive (e.g., PKN) and the neural (artificial neural networks) aspects of to-be-explored systems and willingness to edition each, in particular the cognitive side, based on observations of those resulting systems. Sort of resembling how bridges and tunnels can be built from both sides to middle points.

I should have explained that better in the previous email and the matter is probably one more applicable to my own sketches of a "cognitive API" where I envision providing AI researchers and developers with means of connecting to remote large-scale artificial neural networks, e.g., running on the cloud, and interacting with those systems at a cognitive abstraction level with interfaces like: "ISemanticWorkingMemory" and "IConcept".

Thank you for the feedback on the machine-utilizable chain-of-thought topic. Machine-utilizable chain-of-thought capabilities would be useful for scenarios including verifying and validating the reasoning of large-scale artificial neural networks automatically, utilizing logic-programming systems. I'm picturing developers running batch unit tests on remote artificial neural networks, e.g., after training them.

In my sketches, I think about the "unnamed concepts and relations" which you indicate as each being automatically assigned a "tracking label", perhaps a GUID. I also think about how best to provide developers with "concept-to-text" and "text-to-concept" capabilities. The prior capability, "concept-to-text", would be useful for purposes including displaying and hopefully explaining to developers what a given (or selected) concept, category, relation, etc., was, and the latter capability, "text-to-concept", would, similarly, also be useful. These capabilities could be considered as content transformations, semantics-to-language and language-to-semantics.


Best regards,
Adam

________________________________
From: Dave Raggett <dsr@w3.org>
Sent: Thursday, January 26, 2023 5:40 AM
To: Adam Sobieski <adamsobieski@hotmail.com>
Cc: Adeel <aahmad1811@gmail.com>; public-aikr@w3.org <public-aikr@w3.org>
Subject: Re: ChatGPT, ontologies and SPARQL



On 25 Jan 2023, at 20:21, Adam Sobieski <adamsobieski@hotmail.com> wrote:


Dave,



Thanks for sharing the slideshow and related demos (https://www.w3.org/Data/demos/chunks/reasoning/).

I am interested in how semantics models and systems will emerge from the continued study of various kinds of artificial neural networks (as they scale) and how technologies like PKN might interoperate with these systems. Is "neurosymbolic bridging" a goal of PKN?

Can you explain what you mean by bridging and why you think it is important?

For me, one opportunity is to enrich large language models with domain knowledge, where we use PKN as a structured training input complementing natural language resources. Another is to support introspection and System 2 cognition. We could even seek to wed a large language model with a cognitive database that combines neural and symbolic reasoning. If we want to “mine” large language models to export symbolic knowledge as graphs, I suspect that will be harder and of limited utility.


With respect to detecting and tracking semantics constructs (concepts, attributes, relations, etc.) as occurring in various kinds of artificial neural networks, there are to consider: (1) embedding vectors, (2) activities occurring across populations of neurons, potentially utilizing other neural networks to “decode” them, (3) graph neural networks, and (4) other approaches.

Quite so. My plan is to train a neural network to a) decode latent semantics and b) operate on them via transformations of different kinds. Decoding is analogous to human language translation, i.e. sequence to sequence transformation, so we can build upon the experience gained in training translation systems.


On the topics of embedding-vector-based approaches to analogical reasoning, if you haven’t already, you might enjoy:

Holyoak, Keith J., Nicholas Ichien, and Hongjing Lu. "From semantic vectors to analogical mapping." Current Directions in Psychological Science 31, no. 4 (2022): 355-361.

See also the link in my slides on slide 16 to the PhD by Steven Derby who studied the kinds of knowledge associated with activation vectors at different layers of the networks.


Also, more theoretically, I am interested in: (1) how multi-task learning, transfer learning, and the emergence of abilities, as neural networks scale, might interplay with performance-improving chain of thought approaches, and (2) how machine-utilizable chain of thought (beyond the natural-language variety) could be interoperable with "neurosymbolic bridging" approaches. In my opinion, it will be exciting when artificial neural networks can output machine-readable reasoning.

Given that large language models reason with imprecise and unnamed concepts and relations, it will depend upon what problem you are trying to solve.

In my opinion it will be more valuable to focus on mimicking human memory so that cognitive agents can store and recall what they were thinking, and can learn continuously. When combined with introspection, this enables a sentient dialogue and AGI.

ChatGPT has a static long term memory and a per user session working memory.  It doesn’t have any goals or any memory of what it was doing beyond the current user session. Imagine, that it was able to remember for much longer than the current session, and could further introspect about its goals, and how to solve problems of interest.  We could then improve its knowledge and enable it to work with a theory of mind. This needs to embrace privacy and confidentiality, so what it learns about you in confidence is not shared with anyone else. One way to support that is with a personalised network per user that works in conjunction with a multi-user network. There may be other ways, though.

My hunch is that researchers can explore these ideas and more using modest resources and smaller models, without requiring the huge financial resources available to Google, Facebook and OpenAI. One open question is whether there is anything fundamental about the scale of the models and training sets that would preclude useful systems with smaller models. My guess is no, and it is a matter of improving the network architecture and training algorithms to make more from less training data.  Humans are good at this, we know it is possible, we just need to discover how!




Best regards,

Adam

________________________________
From: Adeel <aahmad1811@gmail.com>
Sent: Wednesday, January 25, 2023 9:15 AM
To: Dave Raggett <dsr@w3.org>
Cc: Adam Sobieski <adamsobieski@hotmail.com>; public-aikr@w3.org <public-aikr@w3.org>
Subject: Re: ChatGPT, ontologies and SPARQL

Hello,

Do you think their choice of GPT was a good choice which has a bias towards the decoder architecture?
What if it was chatT5 a balance of encoder-decoder architecture layers in a LLM with RL. Would make it more flexible.
One could still use IFT (instruction fine tuning), SFT (supervised fine-tuning), RLHF (reinforcement learning from human feedback), and CoT (chain of thought).
Where IFT is a tiny fraction, SFT uses human annotations, CoT improves on the model performance for given tasks.

Thanks,

Adeel



On Wed, 25 Jan 2023 at 14:04, Dave Raggett <dsr@w3.org<mailto:dsr@w3.org>> wrote:
Hi Adam,

You can see more about my proposed research roadmap at:

https://www.w3.org/2023/02-Raggett-Human-like-AI.pdf

Slides 5 and 6 include examples of the kinds of reasoning that ChatGPT can do, showing that it is well beyond what is practical today with RDF and the Semantic Web. There is however a great deal more research needed to evolve from ChatGPT to practical everyday artificial general intelligence. See slide 7 for a summary of what’s needed.

Best regards,
Dave

On 25 Jan 2023, at 06:47, Adam Sobieski <adamsobieski@hotmail.com<mailto:adamsobieski@hotmail.com>> wrote:

Dave,

Thank you and I agree with your points. I’m excited about the near future when scientists will better understand the emergence of abilities in large-scale artificial neural networks. A related topic is that of multi-task learning (https://en.wikipedia.org/wiki/Multi-task_learning).

I'm also excited about artificial neural networks interoperating with external knowledgebases. This interoperation could utilize SPARQL.

With respect to large-scale dialogue systems like ChatGPT, some topics that I previously considered in the contexts of intelligent tutoring systems include dialogue context, user modeling, and A/B testing.

Large-scale systems like ChatGPT are engaging with a large volume of users at an instant, over the course of a day, and with, I imagine, some task redundancy. Large-scale dialogue systems which can measure or infer quality of service and user satisfaction could vary their outputs, e.g., their framings or phrasings, over populations of users to explore whether, which, and why variations result in increased quality of service or user satisfaction.

Beyond providing users with buttons with which to obtain their feedbacks, e.g., a thumbs up button, scientists could also explore more clever means of measuring user comprehension and satisfaction. If a user exits a dialogue quickly after an answer is provided, was their question answered directly and to their satisfaction or, did they, instead, in frustration, seek another system to engage with? If a user responds to an AI system more rapidly in a dialogue, was the AI system’s previous content phrased in a more readable and comprehensible manner for that user and/or was the dialogue more engaging? We can also consider that video-chat-based dialogue systems should have more data to utilize towards ascertaining quality of service and user satisfaction during dialogues.

These techniques would be milestones on the journey to metadialogical capabilities, where dialogue systems could receive feedback about their dialogues in those dialogues.

In addition to the topics of semantic content curation and engineering (which might be otherwise-named subtopics of operating the systems), there are also to consider large-scale AI systems which can perform A/B testing over populations of users to maximize quality of service and user satisfaction. In my opinion, care should be taken to avoid the pitfalls of personalization, e.g., filter bubbles.


Best regards,
Adam

________________________________
From: Dave Raggett <dsr@w3.org<mailto:dsr@w3.org>>
Sent: Tuesday, January 24, 2023 4:18 AM
To: Adam Sobieski <adamsobieski@hotmail.com<mailto:adamsobieski@hotmail.com>>
Cc: Adeel <aahmad1811@gmail.com<mailto:aahmad1811@gmail.com>>; public-aikr@w3.org<mailto:public-aikr@w3.org> <public-aikr@w3.org<mailto:public-aikr@w3.org>>
Subject: Re: ChatGPT, ontologies and SPARQL

Dropping back to AIKR ...


Scaling up language models has been shown to predictably improve performance and sample efficiency on a wide range of downstream tasks. This paper instead discusses an unpredictable phenomenon that we refer to as emergent abilities of large language models. We consider an ability to be emergent if it is not present in smaller models but is present in larger models. Thus, emergent abilities cannot be predicted simply by extrapolating the performance of smaller models. The existence of such emergence raises the question of whether additional scaling could potentially further expand the range of capabilities of language models.

A deeper understanding of how ChatGPT is able to generate its results should allow us to devise smaller and more climate friendly systems.  Practical applications don’t need the vast breadth of knowledge that ChatGPT got from scraping most of the web.

A deeper understanding will also facilitate research on fixing major limitations of large language models, e.g. continuous learning, integration of explicit domain knowledge, metacognition, introspection and better explanations that cite provenance, etc.

Dave Raggett <dsr@w3.org<mailto:dsr@w3.org>>

Dave Raggett <dsr@w3.org<mailto:dsr@w3.org>>

Dave Raggett <dsr@w3.org>

Received on Friday, 27 January 2023 04:26:47 UTC