Hello. This is Paola Di Maio, presenting this work in my capacity as chair of the AIKR CG, the Artificial Intelligence Knowledge Representation Community Group at W3C, where a large part of this work is being done and shared. And Jan Chin, thank you for presenting the paper at the conference in person in Barcelona. This is a pre-recorded talk; this is me, this is my voice, this is my face. The contents of the paper are outlined here. The main topic is knowledge representation, with a focus on knowledge representation learning and the development of a vocabulary, which serves as a metadata set, a type of subject-index metadata. The background is AI, which is enveloping everything and moving very fast. AI is giving us unprecedented capabilities, but it is a field with a lot of open issues, uncertainties, and risk factors. AI is fundamentally rewriting history: today we search AI for facts, and it acts as a filter for everything we know. It is also wiping our individual memory; we are forgetting to remember things, because now we just search. This happened already, a little bit, with search engines and with mighty Google 25 years ago. At the beginning Google was indexing fairly accurately, but now, because of the data explosion, the volume of data being produced every day, much of which is noise, search engines and AI systems cannot distinguish noise from signal. So we have a problem: we are asking AI questions about reality, and AI is doing an excellent job of bringing things up, but at the same time it is presenting results with inherent bias, which is possibly contributing to distortions, and this is a serious concern. And AI is learning from humans, ingesting intelligence, becoming autonomous, and building itself. So we really don't know what AI is becoming.
And this is one of the factors that motivated this work in the first place: we want to understand what AI is becoming and how it is doing things, but how can we do that? So we went back to knowledge representation. Knowledge representation has long been considered part of AI: the explicit representation of facts, rules, and logic, which AI leveraged for reasoning. In the age of machine learning, however, KR has become less relevant, to the point that people were saying KR is simply not relevant to what we are doing today with neural networks. Disagreeing with that basic argument was the starting point for this work seven years ago. And now, after we have hammered away, written a lot of papers, and given quite a lot of talks about it, people are starting to look back at knowledge representation and say, "No, we do need knowledge representation for a number of things, even in machine learning." Nonetheless, since the beginning of the field's development, knowledge representation has not been well understood or well defined in practice. It has been used in a narrow way, selecting a specific knowledge representation technique to achieve specific results in the construction of intelligent systems. But as a field it has been challenging to define, because it is very vast and it is not just one thing. There are papers dating back 30, 40, even 50 years, I don't have the citations in front of me, that already clearly identified these challenges at the time, and the challenge remains today. So knowledge representation as a field is still not defined in practice, it is becoming relevant to machine learning again, and we still do not know exactly how to define it.
The work started with this in mind: we want to be able to say what knowledge representation is, and how it can help us solve the challenges and open issues that machine learning faces today. And we have been very busy since. The challenge for me has been to track the leading edge of where all of this is going, and what I am presenting today is the state of the work as it stands. So we started by trying to map knowledge representation as a domain. What became compelling more recently is that certain mission-critical KR concepts that ensure the reliability of systems were completely missing from AI standards. In particular, truth preservation, a core KR concept, was absent, at the time of writing, from all of the AI standards. Now you will ask me: how did you check the standards? There is an initiative by the Turing Institute called the AI Standards Hub, and it is searchable. The Turing Institute has been much criticized for a number of things, but praise to them for building a searchable hub that allows you to query all AI standards by keyword. At the time, truth preservation did not appear in any of them. So: alarm. Then, of course, there was some double-checking, opening each standard individually and parsing it, to make sure the search engine wasn't simply broken or missing things. So I was confident enough to make the assertion that the concept of truth preservation was absent from AI standards at the time of writing, throughout 2025.
So the lack of certain critical KR concepts in AI standards, such as truth preservation, can be considered a risk factor for AI failure. And if this is true, then the AI standards being developed may not be fit for purpose unless they are integrated with core knowledge representation concepts. There are a number of risks of AI without KR: opaqueness, lack of transparency, and inconsistency, which lead to increased systemic risk and possibly systemic aberration, which is another big topic; if you're interested, you should be able to find the talk I recently gave on it. A number of papers and publications were written leading up to this work, if you're interested in the background. The scope of the work presented here is a map of the knowledge domain called artificial intelligence knowledge representation, with a focus on knowledge representation learning. The scope is to identify a flat-list domain vocabulary. It is not in scope to build a full taxonomy or ontology at this time, although I'm sure that with the right tools and resources we could, and it is not in scope to explain everything about AI, KR, or metadata. For those of you who don't know, or don't have the time to brush up on it: knowledge representation can be considered a process or method for encoding information in machine-readable format, to enable a machine to learn and act intelligently. That is one possible definition, and it uses diverse methods and tools. I should emphasize that knowledge representation learning is derived from KR in general. It supports reasoning, it is vital for explainability, and it can help to decode the hidden layers. This is a new role for KR; I have a slide to explain this better later.
Knowledge representation learning can be defined as a set of methods to encode symbolic knowledge into continuous vector spaces, so that AI systems can reason and make predictions more effectively. In traditional AI, symbolic knowledge was encoded in rule-based, frame-based, or knowledge-based systems. Knowledge representation learning translates that symbolic knowledge, the rules and the logic, into continuous vector spaces, into machine learning constructs, so to speak. It is important for a number of reasons, which I won't enumerate here, but for me the interesting thing is that it connects the more symbolic knowledge representation with machine learning: knowledge representation learning sits somewhere in the middle, and that is very interesting. So we are now mapping the wider knowledge representation domain to see where KRL fits in, and KRL fits in here; this is what we are looking at today. This bigger picture is an attempt to define the knowledge representation domain as a whole. I say AI KR because knowledge representation as a field also exists outside AI; it can be used in a number of other fields, including legal design, and there is a beautiful map of how knowledge representation relates to many fields beyond intelligent systems and computer science. But here we are talking about KR for AI, and we have started defining it in terms of subdomains or subcategories, starting from the upper foundation, the existential level: what does AIKR consist of? We relate the knowledge representation concepts to a top-level ontology, using standard formalisms, and then we look at a number of domains. So we are saying: whatever AI is going to do, it is going to have an upper, foundational, or top-level ontology, an existential level, that defines the highest abstraction.
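To make the "symbolic facts into continuous vector spaces" idea concrete, here is a minimal sketch in the spirit of translation-based embedding models such as TransE. The toy facts, dimensionality, and learning rate are all invented for illustration; real KRL systems train over large knowledge graphs with negative sampling.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 8

# Hypothetical tiny knowledge base of symbolic facts (head, relation, tail).
triples = [("cat", "is_a", "animal"), ("dog", "is_a", "animal")]
entities = {e: rng.normal(size=dim) for t in triples for e in (t[0], t[2])}
relations = {t[1]: rng.normal(size=dim) for t in triples}

def transe_score(h, r, t):
    """TransE plausibility: a true fact should satisfy h + r ≈ t,
    so a smaller distance means a more plausible triple."""
    return np.linalg.norm(entities[h] + relations[r] - entities[t])

def train_step(h, r, t, lr=0.1):
    """One gradient-descent step nudging embeddings toward h + r ≈ t."""
    diff = entities[h] + relations[r] - entities[t]
    entities[h] -= lr * diff
    relations[r] -= lr * diff
    entities[t] += lr * diff

before = transe_score("cat", "is_a", "animal")
for _ in range(50):
    for h, r, t in triples:
        train_step(h, r, t)
after = transe_score("cat", "is_a", "animal")
print(before, after)  # the distance shrinks as the symbolic fact is absorbed
```

The point of the sketch is only the translation: a discrete, symbolic assertion becomes a geometric constraint among vectors, which is the bridge between old-style KR and machine learning that the talk describes.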
And it is going to have an application domain. Reliability engineering has also come in, because one of the biggest AI risks is the lack of reliability: generative AI in particular, which is very smart, is not replicable, and from a systems-reliability point of view that is a problem. So I am defining knowledge representation in terms of reliability engineering elsewhere, and today I am presenting this only very briefly. Why are we doing this? To provide an index for communication and learning of the domain. It can obviously support auditable, robust applications, and it enables metadata-driven discovery and interoperability. I must say that the word metadata, the keyword of interest for this conference, is here: we are going to use the vocabulary as a metadata set for the subject-matter domain of knowledge representation learning. The vocabulary will be used as metadata, and that is listed here as one of its uses. It will also be very interesting to see how we can build automated monitoring, and how we can use the vocabulary for evaluations of LLMs. Method: how do we do it? We identify subdomains and pertinent topics. We identify core resources for each topic, for each subdomain; I am referring to these bubbles, the subdomains. Then we extract key terms and concepts from each resource. We go around and around, a bit creatively, so to speak: we extract concepts and terms, clean them up, keep the relevant ones, and remove the duplicates. Then we refine them via evaluations. This is a general method for constructing a core vocabulary. And this is the slide I was referring to earlier: traditionally, knowledge representation was used to encode logic and semantics in old-fashioned AI, but in machine learning today we can use it to decode the hidden layers.
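The extract-clean-deduplicate step of the method could be sketched as follows. The mini-corpus, stopword list, and frequency threshold are invented stand-ins; the real work uses curated papers and human judgment rather than a simple document-frequency filter.

```python
import re
from collections import Counter

# Hypothetical mini-corpus: stand-in abstracts of KRL papers.
corpus = [
    "Translation-based embedding models map entities into vector spaces.",
    "Knowledge graph embedding supports link prediction over entities.",
    "Vector spaces enable reasoning over knowledge graph triples.",
]
STOPWORDS = {"based", "over", "into", "the", "a", "of", "and"}

def extract_terms(docs, min_count=2):
    """Count candidate terms across the corpus, drop stopwords, and keep
    terms appearing in enough documents (a crude relevance filter)."""
    counts = Counter()
    for doc in docs:
        tokens = set(re.findall(r"[a-z-]+", doc.lower())) - STOPWORDS
        counts.update(tokens)  # one count per document => document frequency
    return sorted(t for t, c in counts.items() if c >= min_count)

vocab = extract_terms(corpus)
print(vocab)  # terms recurring across documents survive; one-offs are dropped
```

Deduplication here falls out of using sets and a single counter; the "keep the relevant ones" step is the part that, as the talk notes, still requires going around and around by hand.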
That, I think, is the most interesting aspect of KR's relevance to machine learning today. Knowledge representation learning sits somewhere here, together with neuro-symbolic AI, another big topic. Within the spectrum of knowledge organization systems, this work stands here, but it is the basis for whatever more structured, higher-order development follows. We started with symbolic logic from old-fashioned AI, and we are arriving at the metadata set. We know all about metadata, but there are different types of metadata, and we are looking at metadata for subject indexing: the vocabulary presented here can be used as a metadata set for subject indexing of the domain of knowledge representation learning. That's the idea. Here is a little focus on truth maintenance systems, already mentioned at the beginning. A truth maintenance system was originally a symbolic AI mechanism for consistency: it tracks dependencies between beliefs and facts, and it revises beliefs when conflicts arise. It is useful for hybrid symbolic-machine learning systems. So although it is rooted in the original symbolic AI truth maintenance systems, it is useful today in machine learning: we cannot do without truth maintenance systems, so to speak, even in machine learning, because they enable the tracking of dependencies, support consistent updates, and help ensure explainability. Nonetheless, the concept was missing from the standards. So, finally, the vocabulary. It is a flat list; definitions will be done later. This is just a list of words, and it is starting as a benchmark definition of the domain. It can be reached here; it should be viewable, though not editable. It is about 100 terms. People ask me what the inclusion criteria are.
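As a rough illustration of the dependency-tracking idea behind truth maintenance systems, here is a minimal sketch. The class, its method names, and the Tweety facts are invented for illustration; real TMS implementations (e.g. justification-based or assumption-based TMS) are considerably richer.

```python
class SimpleTMS:
    """Minimal truth-maintenance sketch: each belief records the beliefs
    that justify it, so retracting one belief propagates to its dependents."""

    def __init__(self):
        self.justifications = {}  # belief -> set of supporting beliefs
        self.believed = set()

    def add(self, belief, supports=()):
        self.justifications[belief] = set(supports)
        if all(s in self.believed for s in supports):
            self.believed.add(belief)

    def retract(self, belief):
        """Remove a belief and, transitively, everything it justified."""
        if belief not in self.believed:
            return
        self.believed.discard(belief)
        for b, deps in self.justifications.items():
            if belief in deps:
                self.retract(b)

tms = SimpleTMS()
tms.add("bird(tweety)")
tms.add("flies(tweety)", supports=["bird(tweety)"])
tms.add("penguin(tweety)")   # a new fact that conflicts with flying
tms.retract("bird(tweety)")  # revision: flies(tweety) loses its support
print(sorted(tms.believed))  # ['penguin(tweety)']
```

This is the mechanism the talk argues machine learning still needs: when one fact changes, everything that depended on it is revised, rather than silently left inconsistent.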
Everything that seemed to be a core concept in KRL was included, looking at a corpus. You ask me how many papers; honestly, I don't remember, I would have to look it up. But there was certainly a very useful page on GitHub that hosted a number of key papers from key conferences, and I went through them all, painstakingly, so the list is grounded in first-class research. At the moment we are just taking the terms out of the corpus and compiling a list. How do we say whether this vocabulary is good or not? We pick a few papers, not randomly but based on what we are looking at, new papers, and we ask: is the core concept in this paper in the vocabulary or not? For example, we found a new paper on quaternions, and I wondered: is "quaternion" in our vocabulary? It wasn't; it was missing. So we added it. This is how the evaluation is currently done. What is a quaternion in this context? Quaternions are used as embeddings for knowledge representation learning, so they are a core concept and should be in the vocabulary. You can then study the whole topic and look at the various models that use quaternions: they represent entities and relations in a hypercomplex space, to model complex relational patterns in knowledge graphs. You wouldn't want to miss that. So this is how the evaluation is done at the moment: finding the papers, and checking that the core terms and concepts in each paper are in the vocab. I'm running out of time. We are also doing evaluations with use cases: looking at specific use cases where knowledge representation learning is used and picking terms from there. From this work, a number of categories is emerging.
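The paper-by-paper coverage check described above could be sketched like this. The vocabulary subset and the paper's key terms are invented stand-ins, chosen to reproduce the quaternion example from the talk.

```python
# Sketch of the coverage evaluation: take the key terms of a new paper,
# check each against the vocabulary, and treat misses as candidate additions.
vocabulary = {"embedding", "knowledge graph", "translation-based", "bilinear"}

# Hypothetical key terms lifted from a new quaternion-embedding paper.
paper_key_terms = {"quaternion", "embedding", "knowledge graph"}

missing = sorted(paper_key_terms - vocabulary)
print(missing)  # ['quaternion'] -> a gap found by the evaluation

vocabulary |= set(missing)  # refine the vocabulary with the missing terms
print("quaternion" in vocabulary)  # True
```

Simple set difference is enough here because the vocabulary is still a flat list; once definitions and structure are added, the check would need matching on synonyms and variants as well.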
We can analyze the vocabulary and create additional layers of abstraction from it. So far, for example, we have identified a number of categories in KRL: translation-based, bilinear, deep neural, geometric, and temporal, which could be used as further structure for the vocabulary in a future iteration. So far we can say that the vocabulary is very useful, because it tells us what KRL consists of and starts indexing the domain. At the same time, it is far from complete, and it is probably even a little bit dirty, a little bit noisy: there are terms in there that may not be purely KRL, or even knowledge representation at all. We need to decide which to keep and which to delete. Of course, this has been done quite coarsely; it is experimental work. Definitions are still not done, further refinement is needed, and the evaluation continues. Next, we are going to continue the evaluation, expand and refine the list, develop unique definitions, create further abstractions and layers of structure, contribute to standards development, we hope, and maybe build an agent to do this work. Wouldn't it be nice if someone could help us build the AI for doing this? This is an open call; I should have some flashing lights on this line. This is the most important and dynamic aspect of this field, the leading edge. To sum up: a standardized vocabulary supports explainability and human learning; it is necessary for developing subject-matter metadata; it bridges the gap between symbolic and statistical AI; and it contributes to safe and auditable AI systems. That is our bottom line. Thank you so much. You can check out the vocab, and you can join: search for the AIKR Community Group at W3C and join. You are very welcome to send questions here or wherever you like. Get in touch. Thank you. Bye.
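As a hedged illustration of how those emerging categories could become a layer over the flat vocabulary, here is one possible shape for it, using well-known knowledge-graph-embedding model names as example terms; the mapping itself is an assumption about the future structure, not part of the current vocabulary.

```python
# Illustrative category layer over the flat KRL vocabulary; the model
# names are common examples of each category from the KGE literature.
categories = {
    "translation-based": {"TransE", "TransH"},
    "bilinear": {"RESCAL", "DistMult", "ComplEx"},
    "deep-neural": {"ConvE", "R-GCN"},
    "geometric": {"RotatE", "QuatE"},
    "temporal": {"TNTComplEx"},
}

def category_of(term):
    """Look a vocabulary term up in the category layer (None if unplaced)."""
    return next((c for c, terms in categories.items() if term in terms), None)

print(category_of("QuatE"))   # geometric
print(category_of("TransE"))  # translation-based
```

Even this thin layer already supports the subject-indexing use: a paper tagged with a term inherits the term's category, giving coarse metadata for free.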