Executable Atomic Units for Multi‑Language KR (K3D case study)

Milton, Paola, Dave, semantic‑web all,

(cross‑posted to public-aikr@w3.org and semantic-web@w3.org)

Thank you, Paola, for sharing the DCMI 2025 talk and the screen 
recording. Watching them together makes the AI‑KR trajectory quite 
clear: mapping the KR domain (upper ontologies, KR languages/formalisms, 
KRL, reliability, AI safety) and building vocabularies that can act as 
subject‑index metadata over that conceptual space.

I fully agree that we need a shared language for KR and KRL, and that 
concepts like truth maintenance/preservation and reliability engineering 
have been under‑represented in AI safety standards. Where my work (K3D) 
comes at this from a different angle is that I’ve been operating one 
layer under vocabularies—what I sometimes call the “meaning‑sphere” 
versus the “word‑sphere”:

The vocabularies you’re building curate words and terms for KR and KRL.
K3D works on the layer beneath them: explicit, executable atomic units 
of form and meaning. As I shared in my last note, we’ve just completed 
a small training run in K3D to construct such atomic units explicitly, 
without relying on natural‑language tokenization as the primary 
substrate.

Cross‑modality is compositional: we store visual and mathematical 
programs in the same atomic unit and retrieve that composite object, 
rather than projecting everything into a single token embedding space.

There is *no natural‑language tokenization step in the LLM sense*; *form 
and meaning live in separate, well‑defined program domains* (visual RPN 
and execution RPN), *with natural language sitting on top rather than 
being the primary representation*. *Fusion happens via the 3D contract 
(the “star”)*, not by collapsing everything into one NL vector space.
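As a rough illustration of the idea (not K3D's actual data model), an 
atomic unit can be sketched as one record holding two postfix (RPN) 
programs, one for form and one for meaning. The class name AtomicUnit, 
the field names, the LINE opcode, and the tiny arithmetic evaluator are 
all hypothetical, chosen only to make the pattern concrete:

```python
from dataclasses import dataclass

# Hypothetical sketch: names and opcodes are illustrative assumptions,
# not K3D's real format.


@dataclass(frozen=True)
class AtomicUnit:
    symbol: str           # the character or sign this unit stands for
    visual_rpn: tuple     # postfix program describing form (kept opaque here)
    execution_rpn: tuple  # postfix program describing meaning


def run_rpn(program, stack=()):
    """Evaluate a postfix arithmetic program: numbers push, operators pop two."""
    ops = {
        "+": lambda a, b: a + b,
        "-": lambda a, b: a - b,
        "*": lambda a, b: a * b,
    }
    st = list(stack)
    for tok in program:
        if tok in ops:
            b = st.pop()
            a = st.pop()
            st.append(ops[tok](a, b))
        else:
            st.append(float(tok))
    return st


plus = AtomicUnit(
    symbol="+",
    # Two strokes of a plus sign, written as made-up drawing instructions.
    visual_rpn=("0.5", "0", "0.5", "1", "LINE", "0", "0.5", "1", "0.5", "LINE"),
    execution_rpn=("+",),
)

# Retrieval returns the whole composite, so form and meaning stay together.
result = run_rpn(plus.execution_rpn, stack=(2.0, 3.0))
```

Here the visual program is only stored, not executed; the point is that 
both programs travel in one object instead of being flattened into a 
single embedding vector.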

For those who asked earlier for state‑of‑the‑art context: this line of 
work sits in the same space as current neuro‑symbolic / KR literature 
(e.g. methodological frameworks for symbolic/NSI reasoning and 
verification, surveys on neuro‑symbolic knowledge integration), the 
trustworthiness/terminology baselines in ISO/IEC 22989:2022, and recent 
Green AI work on lifecycle efficiency and carbon impact.

The atomic‑unit construction above is just one concrete instantiation of 
those ideas for a small visual/math domain of discourse.

Looking ahead, this atomic‑unit validation is just a first step. The 
same construction scales naturally from ASCII+math to the full Unicode 
character set across multiple scripts (Latin, CJK, Arabic, Devanagari, 
indigenous scripts, etc.), with script‑specific visual RPN families but 
the same set‑theoretic pattern for atomic units.
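To make the "same pattern, script‑specific family" idea concrete, here 
is a minimal sketch under loud assumptions: the record layout and the 
function names script_family and make_atom are hypothetical, and the 
script guess uses the first word of the Unicode character name only 
because Python's standard library does not expose the Unicode Script 
property:

```python
import unicodedata

# Hypothetical sketch: the family lookup and record layout are illustrative
# assumptions, not K3D's real implementation.


def script_family(ch):
    """Rough script guess from the first word of the Unicode character name."""
    name = unicodedata.name(ch, "")
    return name.split(" ")[0] if name else "UNKNOWN"


def make_atom(ch):
    """Build the same record shape for any script; only the family differs."""
    return {
        "codepoint": f"U+{ord(ch):04X}",
        "visual_family": script_family(ch),  # would select a script-specific form program set
        "visual_rpn": (),                    # placeholder for the form program
        "execution_rpn": (),                 # placeholder for the meaning program
    }


# One Latin, one Arabic, one CJK, and one Devanagari character.
atoms = {ch: make_atom(ch) for ch in "aب中अ"}
```

Every atom has identical keys regardless of script, which is the 
structural point; a real system would need the proper Script property 
and actual form and meaning programs.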

The goal is true multi‑language KR at the character level, including the 
“invisible giants” of low‑resource and indigenous languages that 
tokenization‑based LLMs systematically underserve: each writing system 
gets explicit, executable atoms for form and meaning, rather than being 
squeezed through an English‑centric tokenizer.

*On the openness point*: in a public Web context, *any open 
work*—whether vocabularies, diagrams, or code—*will be reused by others*.

That’s the *nature of the medium*, *especially in a world where our 
email and collaboration tools are themselves mining data in the 
background*.

*My way* of dealing with this *is to lean into clear licensing and 
attribution rather than tight access control*: K3D is Apache‑2.0 for 
code, CC‑BY‑4.0 for docs, and designed to run locally on mid‑range 
hardware.

The long‑term aim is for K3D to be to the spatial / neuro‑symbolic web 
what Apache was to the hypertext web: a concrete, inspectable 
implementation that anyone can run, study, and build on, whether or not 
they ever talk to me.

To make it easier to explore the intersection between AI‑KR, Semantic 
Web work, ISO terminology, Green AI, and K3D, I’ve assembled a public 
NotebookLM workspace that aggregates:

AI‑KR wiki pages, reports, and mail threads (via the public list), KR / 
NSI / Green AI papers and blog posts, ISO/IEC 22989 commentary, and the 
K3D technical repo link.

You can find it here:

https://notebooklm.google.com/notebook/80d00386-4b7d-4893-ae84-1c5f90c223de

*NotebookLM auto‑builds* mind maps, quizzes, a video overview, an 
audio/podcast‑style overview, and summary reports, *and offers a central 
chat where you can ask a Gemini model questions grounded in the 
collected sources (RAG)*.

It’s just a shared research aid, not any kind of “official” document.

If Paola, Carl, or anyone from AI‑KR or semantic‑web chairs would like 
editor rights (e.g., to add vocab drafts, slides, or additional 
references), I’m happy to add you; that way, even if you don’t have time 
to read everything manually, you can still query and explore the 
landscape in one place.

From my side, I see these efforts as complementary:

AI‑KR vocabularies and concept maps live in the “word‑sphere”, helping 
people talk precisely about KR, KRL, and AI safety.

K3D lives in the “meaning‑sphere”, where those words line up with 
explicit, executable structures and domains of discourse that humans and 
machines can both navigate and inspect—locally, on real hardware, today.

If the CGs ultimately decide that this implementation layer is out of 
scope, that’s perfectly fine; K3D will continue as an open case study.

But given the shared focus on knowledge representation, learning, 
reliability, and Web semantics, I hope the atomic‑unit work and the 
broader spatial KR architecture can serve as one concrete example of how 
vocabulary‑level work (AI‑KR, semantic‑web) can be grounded in running 
systems that are explainable, efficient, and accessible by design.

Best regards,
Daniel


PS: Paola, I forgot to include the AI‑generated subtitles and the 
line‑by‑line text files extracted from them. You'll find them attached: 
SRT (video subtitles with timestamps and text), TXT (text only, no 
timestamps), and DOCX (similar to the SRT).

Received on Wednesday, 19 November 2025 19:15:24 UTC