Re: SemWeb + LLMs etc, here or a new group? from Danny Ayers on 2023-09-26 (semantic-web@w3.org from September 2023)

From: Danny Ayers <danny.ayers@gmail.com>
Date: Tue, 26 Sep 2023 14:59:19 +0200
To: Melvin Carvalho <melvincarvalho@gmail.com>
Cc: Semantic Web <semantic-web@w3.org>, Dan Brickley <danbri@danbri.org>
Message-ID: <CAM=Pv=TMoU=VpmEqYRWdOZARhAdJ8m5Br8Yy5FLpYyUXrCtdAA@mail.gmail.com>
Hiya Melvin. Yeah, I think you're right about the practical use cases being
hard to pin down. But I suspect this is an occasion when the tech appears
first, its utility only later, praxis or whatever.

Look at the way folks in the dev community/industry at large are scurrying
around looking for applications of GPT they can monetize. I've lost count
of the number of '#1 AI Coding Assistant' apps I've seen.

(Incidentally I heard on the radio about a new book suggesting that
Neanderthals may have been a lot more creative than us: few of their
stone tools share a pattern, unlike our clunky copied efforts of the same
period. Their thinking patterns might have been useful now.)

On the web tech side, I reckon you hit the nail on the head in mentioning
the follow-your-nose protocol. As with the web already, that's where the
real value comes from, specific formats etc. being at best secondary. I
just found out that Claude can gobble fairly sizeable quantities of docs,
100k context window or somesuch?
That's going to seem nothing in a few years.

But a value point of using typed links etc is that it can provide a faster
route to more relevant info. So (hands beginning to wave) although you can
make the machines smarter with sheer bulk of data, RDF & co offer a leaner,
more efficient kind of discovery. A turbo button, if you will.
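
A minimal sketch of that turbo button, under loud assumptions: the WEB
dict stands in for real HTTP lookups, and the example.org URLs, the link
types chosen and the follow_nose function are all made up for
illustration, not taken from any library.

```python
from collections import deque

# Toy stand-in for the web: each URL maps to a list of (link_type, target)
# pairs. All URLs and link types are hypothetical; a real client would do
# an HTTP GET and parse RDF instead of reading a dict.
WEB = {
    "http://example.org/alice": [
        ("foaf:knows", "http://example.org/bob"),
        ("rdfs:seeAlso", "http://example.org/alice/papers"),
        ("untyped", "http://example.org/ads"),
    ],
    "http://example.org/bob": [
        ("foaf:knows", "http://example.org/carol"),
        ("untyped", "http://example.org/cookies"),
    ],
    "http://example.org/carol": [],
    "http://example.org/alice/papers": [],
    "http://example.org/ads": [("untyped", "http://example.org/more-ads")],
    "http://example.org/cookies": [],
    "http://example.org/more-ads": [],
}


def follow_nose(start, wanted=None):
    """Breadth-first walk from `start`, returning URLs in visiting order.

    If `wanted` is a set of link types, only links of those types are
    followed; if it is None, every link is followed (the bulk approach).
    """
    seen = [start]
    queue = deque([start])
    while queue:
        url = queue.popleft()
        for link_type, target in WEB.get(url, []):
            if wanted is not None and link_type not in wanted:
                continue  # typed links let us skip irrelevant documents
            if target not in seen:
                seen.append(target)
                queue.append(target)
    return seen


everything = follow_nose("http://example.org/alice")
people = follow_nose("http://example.org/alice", wanted={"foaf:knows"})
print(len(everything), "documents fetched following every link")
print(len(people), "documents fetched following only foaf:knows")
```

In this toy graph the typed walk touches three documents where the
untyped one touches all seven. The numbers mean nothing in themselves,
but the shape of the saving is the point.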

From a practical point of view, at this point in time a volume-based
approach is almost certainly going to produce good results faster (I am not
an analyst, but that also seems to be where most of the funding is being
directed: both MS & Amazon appear to be tying their efforts to their
existing Big Data/big cloud systems).

But we (broad circular hand gesture) have experience of how the web
does/can operate, and have a lot of proven tools (formalizations, specs,
all the way down to runnable code) in this domain.
TL;DR - just need to glue it all together.
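
For a rough idea of what that glue could look like, here is a sketch of
the lookup described in the quoted message below: take a triple, look up
the definition of its property, and turn it into a sentence an LLM can
read. The SCHEMA and LABELS dicts stand in for dereferencing URIs over
HTTP, the example.org vocabulary is invented, and verbalize and
build_prompt are hypothetical names, not LlamaIndex or OpenAI API calls.

```python
# Toy stand-in for what dereferencing a property URI might return.
# A real version would fetch the vocabulary over HTTP and read rdfs:label
# and rdfs:comment with an RDF library; here it is hard-coded.
SCHEMA = {
    "http://example.org/vocab#knows": {
        "label": "knows",
        "comment": "A person known by this person.",
    },
}

# Hypothetical human-readable labels for resources in the data graph.
LABELS = {
    "http://example.org/alice": "Alice",
    "http://example.org/bob": "Bob",
}


def local_name(uri):
    """Fallback label: the part of the URI after the last '#' or '/'."""
    return uri.rstrip("/#").replace("#", "/").rsplit("/", 1)[-1]


def verbalize(subject, predicate, obj):
    """Turn one triple into a plain sentence plus the property definition."""
    s = LABELS.get(subject, local_name(subject))
    o = LABELS.get(obj, local_name(obj))
    definition = SCHEMA.get(predicate, {})
    p = definition.get("label", local_name(predicate))
    sentence = f"{s} {p} {o}."
    comment = definition.get("comment")
    if comment:
        sentence += f" ('{p}' means: {comment})"
    return sentence


def build_prompt(question, triples):
    """Prepend verbalized triples to a question as retrieved context."""
    context = "\n".join(verbalize(*t) for t in triples)
    return f"Context:\n{context}\n\nQuestion: {question}"


triples = [
    (
        "http://example.org/alice",
        "http://example.org/vocab#knows",
        "http://example.org/bob",
    )
]
prompt = build_prompt("Who does Alice know?", triples)
print(prompt)
```

The resulting prompt text is what would be handed to the model. How the
triples get retrieved in the first place is left out entirely.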

Cheers,
Danny.


On Tue, 26 Sept 2023, 04:04 Melvin Carvalho, <melvincarvalho@gmail.com>
wrote:

>
>
> On Tue, 26 Sep 2023 at 3:01, Danny Ayers <danny.ayers@gmail.com>
> wrote:
>
>> Something big & new has arrived, but it is at a tangent to regular
>> business, so maybe a new Community Group/list or whatever should be
>> considered. I don't know what folks think about boundaries. Let me tell you
>> my story...
>>
>> I'm asking because I've been playing with it a bit, one specific angle,
>> using LlamaIndex to use Graph Retrieval Augmented Generation against
>> OpenAI's GPT API. Sorry, I haven't links at hand, but the papers on RAG and
>> Graph RAG, Graph of Thoughts are on arXiv. I have very naive code that runs
>> at:
>>
>> https://github.com/danja/llama_index/blob/main/docs/examples/graph_stores/graph-rag-sparql-mini.py
>>
>> (Isn't ego great? Found my own thing immediately).
>>
>> My immediate conclusions are that conceptually it's a no-brainer to
>> attach such systems to Linked Data (naturally a moderate pain in practice).
>> LLMs expect words, so words are what you give them. A RAG setup has RDF
>> graph URLs as pointers: it looks them up (HTTP, HTTP), chases the schema
>> definition, pulls in the definition of the property or class, and then
>> has a sentence comparable to the texts the model has been trained on.
>> It's very floppy, but I believe potentially useful.
>>
>> (I spent a long time bugged by this - surely we can just give URIs to the
>> LLM as some kind of first class token? I still haven't a clue, but for now
>> there are easier ways in).
>>
>> I accidentally came up with a TED Talk-style analogy that might work
>> for the big picture. For something unrelated I typed in 'warp start' when I
>> meant 'yarn start'. How I giggled! But yeah, the Web (very strongly
>> including Linked Data, as much OWLishness as you want) is a clear Warp,
>> where the AI bits can fill it out with Weft of information fabric.
>> (Apologies to Tim re. book-naming).
>>
>> So yeah, in a rambly way, do you see why I think another group is
>> something to bear in mind? Personally I'm happy either way, as long as the
>> W3C tries to keep their eye on the ball. Blockchain Web3, maybe not so
>> much. But LLMs, I'd say in scope, here or somewhere parallel.
>>
>
> LLMs are useful.  Perhaps early versions of Enquire were ahead of their
> time.
>
> However, they can equally use plain text, JSON, 1-5 star linked data, and
> RDF.
>
> If we were to call RDF 5* linked data, in what use cases would that give
> you an advantage over, say 1-4.5* linked data?
>
> Perhaps when full follow-your-nose capabilities are added it may yield
> some interesting results.
>
> But I have yet to figure out use cases for RDF that *significantly*
> outperform text analysis, or a website with schema.org sprinkles.
>
>
>>
>> Cheers,
>> Danny.
>>
>> --
>> ----
>>
>> https://hyperdata.it <http://hyperdata.it/danja>
>>
>>
Received on Tuesday, 26 September 2023 12:59:39 UTC