Re: ChatGPT, ontologies and SPARQL

Also, Milton, thank you for your thought-provoking "two-trillion-dollar question" about the nature of AI systems that will be fully explainable, interpretable, transparent, and bias-free. Expanding on your question, what will the nature of the semantic concepts, categories, and relations between them be as they are curated and engineered?

These and related topics should invite some interesting and open-ended social and political discussions. Previous discussions about the interplays between language, politics, and society should be useful when exploring new ground here: the semantic concepts and frames utilized by, and the natural language generated by, new AI systems for large audiences.

I think that there will always be some people who disagree to some extent with how an AI system phrased something, in particular when the topics are contemporary and relevant (which does occur with usage trends). Social media is presently abuzz with people sharing content generated by ChatGPT in response to their questions and interactions.

I wonder whether there are any guiding principles in these regards?


Best regards,
Adam

________________________________
From: Adam Sobieski <adamsobieski@hotmail.com>
Sent: Monday, January 23, 2023 2:15 PM
To: Adeel <aahmad1811@gmail.com>; ProjectParadigm-ICT-Program <metadataportals@yahoo.com>
Cc: Dan Brickley <danbri@danbri.org>; semantic-web@w3.org <semantic-web@w3.org>; public-aikr@w3.org <public-aikr@w3.org>
Subject: Re: ChatGPT, ontologies and SPARQL

I am impressed by ChatGPT and looking forward to learning more about other systems, like Sparrow, which can cite their sources, and perhaps to seeing these systems combined with the capability of learning from interactions with end-users or operations personnel at scale, per: https://blog.allenai.org/towards-teachable-reasoning-systems-dd16659fd9f8 . On these abstract topics, I proposed a related solution, Wikianswers, a while ago: https://meta.wikimedia.org/wiki/Wikianswers .

I, too, am increasingly interested in interpretability, explainability, and transparency with respect to artificial neural networks, in particular with respect to those systems which, moving forward, perform artificial semantic cognition, processing concepts, categories, attributes, features, relations, models, and other semantic constructs. In the article that I hyperlinked to, I express that these capabilities, the capabilities to decode artificial neural networks, will allow the field to move beyond defining success per the Turing test.

The spaces in the middle of transforming content between languages and between modalities are interesting to me, where there are opportunities for systems to be designed with artificial concepts and artificial semantic cognition.

If you haven't already, you might enjoy:


Wei, Jason, Yi Tay, Rishi Bommasani, Colin Raffel, Barret Zoph, Sebastian Borgeaud, Dani Yogatama, Maarten Bosma, Denny Zhou, Donald Metzler, Ed H. Chi, Tatsunori Hashimoto, Oriol Vinyals, Percy Liang, Jeff Dean, William Fedus. "Emergent abilities of large language models." arXiv preprint arXiv:2206.07682 (2022).

Scaling up language models has been shown to predictably improve performance and sample efficiency on a wide range of downstream tasks. This paper instead discusses an unpredictable phenomenon that we refer to as emergent abilities of large language models. We consider an ability to be emergent if it is not present in smaller models but is present in larger models. Thus, emergent abilities cannot be predicted simply by extrapolating the performance of smaller models. The existence of such emergence raises the question of whether additional scaling could potentially further expand the range of capabilities of language models.


As large language models increase in size, more useful abilities are emerging. In addition to the scientific exploration of interpretability, explainability, and transparency, there is the matter of increasing our understanding of the nature of this phenomenon.


Best regards,
Adam

________________________________
From: ProjectParadigm-ICT-Program <metadataportals@yahoo.com>
Sent: Monday, January 23, 2023 10:55 AM
To: Adeel <aahmad1811@gmail.com>
Cc: Dan Brickley <danbri@danbri.org>; Adam Sobieski <adamsobieski@hotmail.com>; semantic-web@w3.org <semantic-web@w3.org>; public-aikr@w3.org <public-aikr@w3.org>
Subject: Re: ChatGPT, ontologies and SPARQL

I was merely using examples to illustrate my points. I am definitely not referring to the use of USA-based NLP or chatbot applications.

And you are right: interpretability, explainability, and bias reduction, as well as transparency, are necessary.

The latter two are the easiest to tackle: large collections of curated data sets from selected domains, for which permission is given for their use, together with clear rules for their utilization, should take care of a lot of problems.

Interpretability and explainability are much harder. Most AI striving to achieve AGI status is predicated upon the idea that the models of emulated human-brain processes (for biologically inspired cognitive architectures) used to build algorithms fairly accurately represent, in formal methods, how humans think and process information; we must therefore ask ourselves whether we can ever achieve total interpretability and explainability.

ChatGPT, which we can clearly call a non-transparent application, and which definitely does not pass the tests of interpretability and explainability, is nevertheless able to perform some pretty impressive feats.

Case in point:
https://www.beckershospitalreview.com/hospital-physician-relationships/a-peek-into-healthcares-future-ai-passes-medical-licensing-exam.html


Any cognitive scientist, cognitive psychologist, neuroscientist, or philosopher will readily tell you that the processes, both mental and neural, that humans use to learn, think, create, and analyze cannot be neatly categorized and put in boxes.

The two-trillion-dollar question (the projected global market value of AI five years from now) remains: if we want AI to be fully explainable, interpretable, and transparent, and ENTIRELY free of bias, what kind of AI will we end up with?

I myself am inclined to pay more attention to the ethics and controllability issues, beyond the other four issues.

Current legislation/regulation for intellectual property protection, data rights and privacy, and now artificial intelligence in the UK and the EU is far from perfect.

Case in point: camera surveillance combined with facial recognition and artificial intelligence is used in many EU countries for national security and law enforcement purposes, by authorities whose mandates and oversight rest on exemptions from said regulations, and with right-to-information instruments so cumbersome or restrictive that civilians can hardly find out whether their rights were unjustifiably violated.

Milton Ponson
GSM: +297 747 8280
PO Box 1154, Oranjestad
Aruba, Dutch Caribbean
Project Paradigm: Bringing the ICT tools for sustainable development to all stakeholders worldwide through collaborative research on applied mathematics, advanced modeling, software and standards development


On Monday, January 23, 2023 at 06:26:19 AM AST, Adeel <aahmad1811@gmail.com> wrote:


Hello,

All the embedding models are non-compliant for use in the EU/UK region, as they lack transparency. You can't really use them in regulated environments without loss of lineage in the data governance, which is necessary for regulatory compliance. They probably need to prioritize interpretability and explainability within their models. But they likely don't care, because platforms like Hugging Face are based in the USA, where such trustworthy-AI efforts are non-existent and not a priority.

1) how the result was produced
2) whether the model was correct in producing such a result based on the implementation
3) training dataset bias - de-biasing, etc.
4) how to resolve the bias laundering effect

Thanks,

Adeel


On Sat, 21 Jan 2023 at 15:09, ProjectParadigm-ICT-Program <metadataportals@yahoo.com> wrote:
Large language model interfaces with knowledge bases are a key ingredient for digital empowerment of all stakeholders in the promotion of sustainable development.

The sad reality is that of the 7,151 living languages (source: http://www.ethnologue.com), fewer than 200 are currently served in NLP applications and AI chatbots.

Even though there are, in principle, linguistic tools available to support digital environments for all languages with populations of, e.g., 1,000 or more speakers, the situation is comparable to the pharmaceutical industry, where only the discovery and development of new drugs for large enough markets is pursued.

So we can actually state that wide-scale deployment of chatbots and similar AI NLP applications will only widen the digital divide. See more about this at https://www.sil.org.

The GLIKI project (https://gliki.wordpress.com) was formulated to help bridge this divide. This project is soon going into execution mode.

It is also instructive to see what the IFLA, International Federation of Library Associations (https://www.ifla.org) is doing in terms of open access to knowledge.

Open access to digital libraries and knowledge repositories is crucial in developing countries, and is currently a totally neglected and ignored subject among Big Internet Tech companies; yet these developing countries provide a huge potential customer base for internet services, in particular for online education, online healthcare, and cloud and edge computing services.

Bilingual access (one international language and a native language) could boost internet services and stimulate local development of NLP and AI applications.

Milton Ponson
GSM: +297 747 8280
PO Box 1154, Oranjestad
Aruba, Dutch Caribbean
Project Paradigm: Bringing the ICT tools for sustainable development to all stakeholders worldwide through collaborative research on applied mathematics, advanced modeling, software and standards development


On Friday, January 20, 2023 at 12:39:15 AM AST, Adam Sobieski <adamsobieski@hotmail.com> wrote:


Hello. I am also thinking about artificial neural networks, dialogue systems, and Semantic Web technologies, which Xavier asked about. I agree with Paola that natural-language interfaces to knowledgebases have been a while coming. Thank you, Dan, for pointing out: https://github.com/jerryjliu/gpt_index .

In the near future, content could move between the semantic working memories of artificial neural networks and external knowledgebases.

I would like to share a hyperlink to an article which I recently wrote. The article discusses artificial neural networks, (cognitive) semantics, and semantic cognition. The article is, however, more about DALL-E 2 than ChatGPT.

In the short article: I envision systems which can bidirectionally transform content pairwise between language, visual imagery, and semantics; I discuss the ideas of artificial concepts, categories, attributes, and relationships; and I show that varieties of multimodal semantics are already enhancing the performance of AI systems with respect to visual question answering, language-related, and vision-related tasks.

The article is available here: https://www.linkedin.com/pulse/artificial-neural-networks-semantic-cognition-adam-sobieski/ .

If you enjoy the article, please do like and share it on LinkedIn! Thank you.


Best regards,
Adam

________________________________
From: Dan Brickley <danbri@danbri.org>
Sent: Thursday, January 19, 2023 12:56 AM
To: Paola Di Maio <paoladimaio10@gmail.com>
Cc: SW-forum <semantic-web@w3.org>; W3C AIKR CG <public-aikr@w3.org>
Subject: Re: ChatGPT, ontologies and SPARQL



On Thu, 19 Jan 2023 at 04:14, Paola Di Maio <paoladimaio10@gmail.com> wrote:
The semantic web has been waiting for natural language interfaces (well, at least I have)
for decades, ideally read and write
if this tool can be used like that then let's see it

@Danbri share results sometime?

Search twitter: chatgpt sparql
… lots of folk experimenting

Dan




On Wed, Jan 18, 2023 at 10:33 PM Dan Brickley <danbri@danbri.org> wrote:


On Wed, 18 Jan 2023 at 14:21, Paola Di Maio <paola.dimaio@gmail.com> wrote:
fyi - would be good to be able to generate sparql queries and interact with ontologies using natural language
thanks to this user for discovering this feature

In general these new LLMs seem to have a weird effect on commentators: their ability to casually and confidently just make up answers tends to distract from their less exciting but potentially transformative ability to kinda-sorta-kinda "understand" questions expressed in natural language.

Dan

ps. folk here might find
https://github.com/jerryjliu/gpt_index worth exploring, as a tool that supports idea of combining LLMs with external data. I have also had promising results using GPT-* etc to generate SPARQL…
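[Editor's note: the "combining LLMs with external data" idea mentioned above boils down to a retrieve-then-prompt loop. A minimal sketch follows, using a toy bag-of-words retriever as a stand-in for real embeddings; this is illustrative only, it is not the gpt_index API, and the actual LLM call is left out.]

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy bag-of-words vector; a stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question, documents, k=1):
    """Rank documents by similarity to the question; keep the top k."""
    q = embed(question)
    return sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def build_prompt(question, documents):
    """Prepend retrieved context so the LLM answers from external data."""
    context = "\n".join(retrieve(question, documents))
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"

docs = [
    "SPARQL is the W3C query language for RDF data.",
    "Bananas are botanically berries.",
]
# Only the relevant document ends up in the prompt's context section.
prompt = build_prompt("What is SPARQL?", docs)
```

In a real pipeline, `embed` would be a learned embedding model and the returned prompt would be sent to the LLM; the retrieval step is what grounds the answer in external data rather than the model's parameters.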



---------- Forwarded message ---------
From: Paola Di Maio <paola.dimaio@gmail.com>
Date: Wed, Jan 18, 2023 at 10:15 PM
Subject: Re: ChatGPT, ontologies and SPARQL
To: Contact - Cogsonomy <contact@cogsonomy.fr>
Cc: Semantic Web <semantic-web@w3.org>


assuming the syntax is good
 (your example validates here http://sparql.org/$/validate/query)
then it looks like a big YES
It would be amazing to have a tool that can translate natural language queries into sparql
and remove a major hurdle to using open ontologies on the web
Please, let's set up a catalogue of examples and evaluate the results
great functionality if it works
thank you


On Tue, Jan 17, 2023 at 10:41 PM Contact - Cogsonomy <contact@cogsonomy.fr> wrote:

Hi,

I asked chatGPT to create a SPARQL query for DBpedia by translating my question expressed in natural language: "Give me all the names of the cities located in France whose number of inhabitants is greater than 10,000".

Its answer was as follows:

here's the SPARQL query to answer the question:

PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbr: <http://dbpedia.org/resource/>
SELECT ?city
WHERE {
  ?city a dbo:City;
        dbo:populationTotal ?population;
        dbo:country dbr:France.
  FILTER (?population > 10000)
}

This query uses the dbo prefix to access the DBpedia ontology and the dbr prefix to access the DBpedia resources. It selects the names of cities (?city) that are instances of dbo:City, have a total population (?population) greater than 10000, and have the dbo:country property with the value dbr:France (which represents France).

We can see on social networks several tests of chatGPT related to ontologies. Do you think we can use chatGPT to query ontologies in natural language (if we give it the structure beforehand, of course)? And if we feed a substantial corpus to the GPT-3 algorithm (or its successor) for training, do you think that it could build the ontology from this corpus? More generally, will our vision of ontologies not be modified by the arrival and rise of this kind of tool?

Best regards,

         Xavier

Received on Tuesday, 24 January 2023 03:31:21 UTC