Re: Semantic / CogAI filesystem?

I've 'ChatGPT'd' (v4) the question: "If LLMs are now defined as stuff
like GPT / Transformer models, what do you / W3C call older LLMs like
BabelNet, Cyc, FrameNet, Framester, Lemon, Linguist List, SUMO, WordNet,
etc.?"

Response is:

The technologies you've mentioned, such as BabelNet, Cyc, FrameNet,
Framester, Lemon, Linguist List, SUMO (Suggested Upper Merged Ontology),
and WordNet, are generally categorized under the umbrella of Natural
Language Processing (NLP) tools and resources, rather than being classified
as Large Language Models (LLMs) like GPT or Transformer models. These older
systems are often referred to by more specific terms based on their
function and structure:

*Lexical Databases or Ontologies:* WordNet is a prime example of a lexical
database, while SUMO is an ontology. These systems provide structured
information about words, concepts, and their relationships.

*Semantic Networks:* BabelNet can be considered a semantic network as it
integrates lexical, conceptual, and ontological information.

*Knowledge Bases or Knowledge Graphs:* Cyc is an example of a knowledge
base, which is a system that stores information in a structured form, often
used for AI and NLP applications.

*Linguistic Frameworks:* FrameNet is a project that catalogs lexical and
semantic information based on frame semantics. It's more of a linguistic
resource than a computational model.

*Linked Data and Ontology Frameworks:* Framester and Lemon (Lexicon Model
for Ontologies) are related to linked data and ontology frameworks. They
provide ways to integrate linguistic information with semantic web
technologies.

*Linguistic Resources or Databases:* The Linguist List falls under the
category of linguistic resources, providing comprehensive references for
linguistic information.

The distinction between these resources and LLMs like GPT or Transformer
models lies in their function and underlying technology. The older systems
are primarily focused on structured representation of linguistic or
ontological information, often relying on manually curated data or
rule-based approaches. In contrast, LLMs use machine learning (especially
deep learning) to generate or understand natural language, relying on
large-scale data and neural network architectures.

In summary, while the W3C (World Wide Web Consortium) and other entities
involved in web standards and technologies don't have a specific collective
term for these older systems, they are typically referred to based on their
specific function (e.g., lexical databases, ontologies, knowledge bases) in
the broader field of NLP and AI.

___________________________________________________________________________________________________________________________________


So, if I want to make a clear distinction between an LLM (which requires an
enormous amount of compute power to operate 'privately') and the means to
create AI resources by updating stuff like BabelNet, Cyc, FrameNet,
Framester, Lemon, Linguist List, SUMO (Suggested Upper Merged Ontology),
and WordNet to support spatio-temporal 'AI datasets' that can be used
locally (offline, although offline does have impacts / implications) with
limited hardware resources, then what should that field of work be called?

Perhaps 'knowledge base(s)'??

I've started a new document for 'definitions':
https://docs.google.com/document/d/1ZEBQ4zs-yb4idpIqR06vvGhOX83G0P6LK3YeD12NwPU/edit?usp=sharing


It would be great if we could work on defining a range of terms that have
express meanings defined via URIs, etc., and that could be shared across
multiple groups / venues.

I expect some should be easier than others...  Web 3.0 and 'human centric
AI' perhaps being amongst the more difficult ones to define atm, imho.

Tim.H.


On Tue, 14 Nov 2023 at 18:35, Timothy Holborn <timothy.holborn@gmail.com>
wrote:

>
>
> FWIW: I've started on:
>
>
> https://docs.google.com/document/d/17VHMpoIXbBT6AXkspPDkHfoTi80GL6OwiAPs9VrHVbY/edit?usp=drivesdk
>
> But it'll be updated a lot over the coming days.  It would be good if
> others also put together their thoughts; then perhaps we could socialise
> considerations, if wanted.
>
> NB also: I really like your slides, the presentation quality is great.
>
> Best.
>
> Tim.h.
>
>
>
>
> On Tue, 14 Nov 2023, 6:28 pm Dave Raggett, <dsr@w3.org> wrote:
>
>> Naming is often referred to as bike shedding …
>>
>
> https://en.wikipedia.org/wiki/Law_of_triviality
>
> https://wikidata.org/wiki/Q169899
>
>
>> We need names, but it often takes up a lot of time better spent on other
>> things.
>>
>
> Yeah, it does take a lot of time.
>
>
>
>> On 13 Nov 2023, at 19:27, Timothy Holborn <timothy.holborn@gmail.com>
>> wrote:
>>
>> Quick questions:
>>
>> If LLMs are now defined as stuff like GPT / Transformer models,
>>
>> what do you / W3C call older LLMs like BabelNet, Cyc, FrameNet,
>> Framester, Lemon, Linguist List, SUMO, WordNet, etc.?
>>
>> Also, is web 3.0 still web 3.0?  It seems to have been adopted in some
>> way, which kinda leaves open the opportunity to define internet 3.0 ;)
>>
>> Whilst the 'web 3.0' stuff isn't so important atm, there is the issue of
>> 'large language models' vs. transformer models, or however the terminology
>> should now be categorised, which also leads to various issues about how
>> the term 'AI' is defined, but starting small....
>>
>> https://www.w3.org/TR/wordnet-rdf/ and all such things seemed to be
>> fairly big jobs.  Whilst I think they need to be updated to support
>> spatio-temporal notations, vector stuff, etc. (resources for logic
>> programming, computational geometry, et al., imho), it seems there's a
>> problem with being clear about terms and their meanings...
>>
>> FWIW: Figure it's probably an important issue to raise in relation to W3C
>> activities generally.
>>
>> cheers,
>>
>> tim.h.
>>
>>
>> On Mon, 13 Nov 2023 at 20:17, Timothy Holborn <timothy.holborn@gmail.com>
>> wrote:
>>
>>> K.
>>>
>>> I'll follow-up when I've put together some thoughts, following more
>>> research.
>>>
>>> Cheers.
>>>
>>> Tim.h.
>>>
>>> On Mon, 13 Nov 2023, 8:11 pm Dave Raggett, <dsr@w3.org> wrote:
>>>
>>>>
>>>>
>>>> On 13 Nov 2023, at 10:37, Timothy Holborn <timothy.holborn@gmail.com>
>>>> wrote:
>>>>
>>>> Hi Dave,
>>>>
>>>> I wondered if you had any thoughts on the potential application of your
>>>> CogAI work to create a semantic/CogAI filesystem?
>>>>
>>>>
>>>> Yes, and in conjunction with large language models.  I see the
>>>> combination (something I refer to as cognitive databases) as an
>>>> evolutionary replacement for today’s graph databases, and paving the way to
>>>> zero-code applications. I see this as enabling collaborative knowledge
>>>> engineering (see my talk [1]) where the computer and human play
>>>> complementary roles.  The human can ask the computer to perform an analysis
>>>> of the data, leaving the details to the computer to figure out for itself.
>>>>
>>>> The huge hype around generative AI has drawn attention away from the
>>>> limitations of the current approaches to large language models.  I see many
>>>> opportunities for evolving artificial neural networks inspired by what we
>>>> know about human cognition. This will enable computers to become very
>>>> effective partners for human-machine collaborative work.
>>>>
>>>> Neural networks and vector spaces are very powerful for representing
>>>> complex statistical relationships, as a generalisation of symbolic
>>>> knowledge that can be learned and queried by machine. This changes the
>>>> conception of databases considerably. The impact of this will be dramatic.
>>>>
>>>> [1] http://www.w3.org/2023/10/10-Raggett-AI.pdf
>>>>
>>>>
>>>> I was looking at the Solid code again recently, and was thinking about
>>>> the implementation structure.
>>>>
>>>> I haven't done much more thinking about it yet, but thought I'd ask you
>>>> just in case.
>>>>
>>>> Whilst my plans for human centric AI stuff, particularly my thoughts on
>>>> my own implementation, may not be based on Solid, I do want
>>>> compatibility / backwards compatibility...
>>>>
>>>> It seemed to me that there's an ability to make distinctions between
>>>> different agents using multiple "pods" via methods that assume domain
>>>> ownership.
>>>>
>>>> Without getting into it further, the basic thought was about semantic
>>>> file systems, which then led to wondering about CogAI file systems...
>>>> How might that work?
>>>>
>>>> Tim.H.
>>>>
>>>>
>>>>
>>>>
>>>> Dave Raggett <dsr@w3.org>
>>>>
>>>>
>>>>
>>>>
>> Dave Raggett <dsr@w3.org>
>>
>>
>>
>>

Received on Tuesday, 14 November 2023 09:49:32 UTC