RE: Semantics and Embedding Vectors from Adam Sobieski on 2022-10-12 (semantic-web@w3.org from October 2022)

From: Adam Sobieski <adamsobieski@hotmail.com>
Date: Wed, 12 Oct 2022 02:50:04 +0000
To: Christian Chiarcos <christian.chiarcos@web.de>
CC: "semantic-web@w3.org" <semantic-web@w3.org>
Message-ID: <PH8P223MB0675C9AB47F82E3A0A4CA91AC5229@PH8P223MB0675.NAMP223.PROD.OUTLOOK.COM>

Christian,
All,

With respect to SPARQL and its support for first-class vectors, while exploring the SPARQL 1.2 Community Group’s GitHub issues, I also found:

Issue #130: Easier addition of support for custom datatypes to SPARQL endpoints (https://github.com/w3c/sparql-12/issues/130).

Participants in this issue’s discussion mentioned “Scientific SPARQL”, or “SciSPARQL”, as having support for matrices and tensors. This extended version of SPARQL is described in:

Andrejev, Andrej, and Tore Risch. "Scientific SPARQL: semantic web queries over scientific data." In 2012 IEEE 28th International Conference on Data Engineering Workshops, pp. 5-10. IEEE, 2012. (https://ieeexplore.ieee.org/document/6313648)

Abstract: “We define an extended version of the Semantic Web query language SPARQL called Scientific SPARQL, SciSPARQL. It is targeted mainly at scientific computing and laboratory data management. SciSPARQL includes expressions, numeric multi-dimensional array operations, user-defined functions, aggregate functions, and function views. A prototype system translates SciSPARQL to a Datalog dialect which is extensible by external functions implemented in a regular programming language. The system automatically recognizes collections in RDF Turtle statements that represent numerical multi-dimensional arrays in order to represent them with a special native data type. A back-end relational database provides persistent storage.”


Best regards,
Adam

From: Adam Sobieski
Sent: Tuesday, October 11, 2022 12:12 PM
To: Christian Chiarcos <christian.chiarcos@web.de>
Cc: semantic-web@w3.org
Subject: RE: Semantics and Embedding Vectors

Christian,
All,

Thank you for the information about FrAC. It appears that, with it, we can anchor concepts/items to static and contextualized embedding vectors.

Perhaps, someday, we’ll be able to do mathematics, e.g., dot products and lengths, with first-class vectors in SPARQL.
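
To make the operations concrete, here is a minimal sketch in plain Python (not SPARQL) of the dot product and length mentioned above, plus the cosine similarity that follows from them. The function names are hypothetical, and the vectors are assumed to have already been parsed into lists of floats:

```python
import math

def dot(u, v):
    """Dot product of two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    return sum(a * b for a, b in zip(u, v))

def length(v):
    """Euclidean length (L2 norm) of a vector."""
    return math.sqrt(dot(v, v))

def cosine_similarity(u, v):
    """Dot product normalized by both lengths; undefined for zero vectors."""
    return dot(u, v) / (length(u) * length(v))

u = [3.0, 4.0]
v = [4.0, 3.0]
print(dot(u, v))     # 24.0
print(length(u))     # 5.0
print(cosine_similarity(u, v))
```

A SPARQL engine with first-class vectors would presumably expose operations like these as built-in or extension functions over vector-valued literals.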

Today, technologies pertaining to extensible expressions and functions in SPARQL include: http://ns.inria.fr/sparql-extension/ .


Best regards,
Adam

From: Christian Chiarcos <christian.chiarcos@web.de>
Sent: Monday, October 10, 2022 11:45 AM
To: Carlos Bobed <cbobed@unizar.es>
Cc: Chris Harding <chris@lacibus.net>; adamsobieski@hotmail.com; semantic-web@w3.org
Subject: Re: Semantics and Embedding Vectors

As for the definition of the embedding space, this is currently only indirectly covered by FrAC, in that it should be provided via frac:corpus (data source) and in human-readable form in dc:description (construction method).

On Mon, 10 Oct 2022 at 17:43, Christian Chiarcos <christian.chiarcos@web.de> wrote:
But ... it's really interesting, if you give me a concept / item, a
defined (and shared/accessible) embedding space, and the context of the
item for that meaning; that would be a very good anchor to grasp the
definition of the item.

Yes. This is exactly what frac:Attestation gives you, the context of the observable (= the item for that meaning), so contextualized embeddings are (in SPARQL) something like:

?observable frac:attestation [ frac:locus your:string_url_for_context; rdf:value "your ... context ... string" ; frac:attestationEmbedding [ rdf:value ?contextualized_embedding  ] ] .

And static embeddings are

?observable frac:embedding [ rdf:value ?static_embedding  ] .

(Plus metadata).
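
As a rough sketch of how the two graph patterns above might be wrapped into complete SELECT queries, the following Python builds the query strings. The frac: namespace IRI, the function names, and the variable names are assumptions for illustration and should be checked against the FrAC specification; frac:locus is left out for brevity:

```python
# Assumption: namespace IRI for the frac: prefix; verify against the FrAC spec.
FRAC_NS = "http://www.w3.org/ns/lemon/frac#"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

PREFIXES = f"PREFIX frac: <{FRAC_NS}>\nPREFIX rdf: <{RDF_NS}>\n"

def static_embedding_query():
    """Query for observables together with their static embeddings."""
    return PREFIXES + (
        "SELECT ?observable ?embedding WHERE {\n"
        "  ?observable frac:embedding [ rdf:value ?embedding ] .\n"
        "}"
    )

def contextualized_embedding_query():
    """Query for observables, their attested context strings, and the
    contextualized embeddings attached to each attestation."""
    return PREFIXES + (
        "SELECT ?observable ?context ?embedding WHERE {\n"
        "  ?observable frac:attestation [\n"
        "    rdf:value ?context ;\n"
        "    frac:attestationEmbedding [ rdf:value ?embedding ]\n"
        "  ] .\n"
        "}"
    )

print(static_embedding_query())
print(contextualized_embedding_query())
```

The resulting strings could be sent to any SPARQL endpoint that holds FrAC data; how the embedding literal itself is serialized depends on the dataset.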

Best,
Christian

Received on Wednesday, 12 October 2022 02:50:18 UTC