Semantics and Embedding Vectors

Semantic Web Interest Group,

Embedding vectors can represent many things: words [1], sentences [2], paragraphs, documents, percepts, concepts, multimedia data, users, and so forth.

A few months ago, I started a discussion on GitHub about formal ontologies for describing these vectors and their models [3]. There, I also suggested that MIME types could be created for these vectors, e.g., "embedding/gpt-3" or "vector/gpt-3".
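
To make the media-type idea more concrete, a vector resource might be described roughly as follows in Turtle. This is only a sketch: the ex: terms, the "vector/gpt-3" type, and the dimensionality shown are hypothetical placeholders, not existing vocabulary.

    @prefix ex:      <http://example.org/embeddings#> .
    @prefix dcterms: <http://purl.org/dc/terms/> .
    @prefix xsd:     <http://www.w3.org/2001/XMLSchema#> .

    <http://example.org/vectors/123>
        a ex:EmbeddingVector ;                    # hypothetical class
        ex:model ex:gpt-3 ;                       # hypothetical model resource
        ex:dimensionality "12288"^^xsd:integer ;  # hypothetical property
        dcterms:format "vector/gpt-3" .           # the proposed media type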

For discussion and brainstorming, I would like to share some ideas with the group.

Firstly, we can envision machine-utilizable lexicons that, for each sense of each lexeme, include, refer to, or hyperlink to embedding vectors.
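
A rough Turtle sketch of this first idea, using the W3C OntoLex-Lemon vocabulary for the lexical side; the ex:embedding property and the vector URIs are assumptions, not existing terms:

    @prefix :        <http://example.org/lexicon#> .
    @prefix ontolex: <http://www.w3.org/ns/lemon/ontolex#> .
    @prefix ex:      <http://example.org/embeddings#> .

    :bank a ontolex:LexicalEntry ;
        ontolex:sense :bank_sense1 , :bank_sense2 .

    :bank_sense1 a ontolex:LexicalSense ;  # "financial institution" sense
        ex:embedding <http://example.org/vectors/bank-1> .

    :bank_sense2 a ontolex:LexicalSense ;  # "edge of a river" sense
        ex:embedding <http://example.org/vectors/bank-2> .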

Secondly, we can envision that metadata for scholarly and scientific publications might, one day, include sets of embedding vectors, e.g., each representing a topic or category from a scholarly or scientific domain. Alternatively, these publications might include sets of URIs or text strings from controlled vocabularies, each URI or term related elsewhere to embedding vectors.
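
That second idea might look roughly as follows, where dcterms:subject and skos:Concept are existing vocabulary but, again, the ex:embedding link is hypothetical:

    @prefix dcterms: <http://purl.org/dc/terms/> .
    @prefix skos:    <http://www.w3.org/2004/02/skos/core#> .
    @prefix ex:      <http://example.org/embeddings#> .

    <http://example.org/articles/42>
        dcterms:subject <http://example.org/topics/machine-translation> .

    <http://example.org/topics/machine-translation>
        a skos:Concept ;
        skos:prefLabel "Machine translation"@en ;
        ex:embedding <http://example.org/vectors/mt-topic> .  # hypothetical link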

Is there any interest here in formal ontologies that describe embedding vectors and their models? Do any such ontologies already exist? Any thoughts on these topics?


Best regards,
Adam Sobieski

[1] https://en.wikipedia.org/wiki/Word_embedding
[2] https://en.wikipedia.org/wiki/Sentence_embedding
[3] https://github.com/onnx/onnx/discussions/4318
