- From: Daniel Campos Ramos <danielcamposramos.68@gmail.com>
- Date: Wed, 19 Nov 2025 14:23:13 -0300
- To: public-aikr@w3.org, semantic-web@w3.org
- Message-ID: <c2fda3a5-5ea4-4b0e-ba41-e614b99c610c@gmail.com>
Milton, Dave, all,

Following Milton’s point that “the core elements must be well defined” and that this is missing from the blue bubbles, we’ve just completed a small training run in K3D to construct such atomic elements explicitly, without tokenization.

Very briefly, for a restricted domain (ASCII + math glyphs), we define:

- F = executable visual RPN programs (form space)
- M = execution/semantic RPN programs (meaning space)
- E = procedural embedding space ℝ^D

and build atomic units as:

    A = (c, f, m, e) with c ∈ Σ, f ∈ F, m ∈ M, e ∈ E

In the current run we implemented:

- 148 atomic units total
- 72 dual‑program “stars” where each character has both:
  - a visual RPN program that actually renders the glyph on GPU, and
  - an execution RPN/bytecode program (e.g. e as Euler’s number, + as ADD, ^ as POW)

Cross‑modality here is compositional: we store visual and mathematical programs in the same atomic unit and retrieve that composite object, rather than projecting everything into a single token embedding space. There is no natural‑language tokenization step in the LLM sense; form and meaning live in separate, well‑defined program domains (visual RPN and execution RPN), with natural language sitting on top rather than being the primary representation. Fusion happens via the 3D contract (the star), not by collapsing everything into one natural‑language vector space.

A short write‑up of this proof‑of‑concept, including the set‑theoretic definitions, metrics (148 units, 72 dual‑program, ~2 minutes training, ~2.2KB per unit), and example stars for e, +, and ^, is here:

https://github.com/danielcamposramos/Knowledge3D/blob/main/TEMP/W3C_AIKR_ATOMIC_UNITS_PROOF_NOV19.md

For those who previously asked for state‑of‑the‑art context: this line of work is consistent with current neuro‑symbolic and KR literature, e.g. methodological frameworks for symbolic/NSI reasoning and verification, recent surveys of neuro‑symbolic knowledge integration, and the trustworthiness/terminology baselines in ISO/IEC 22989:2022, as well as recent Green AI work on efficiency and lifecycle impact. The proof‑of‑concept above is just one concrete instantiation of those ideas for a small visual/math domain of discourse.

Looking ahead, this atomic‑unit validation is just the first step. The same construction A = (c, f, m, e) scales naturally from ASCII+math to full Unicode: Phase 3 on our side is to extend Ω_implemented from 148 units to the full character set *across multiple scripts (Latin, CJK, Arabic, Devanagari, indigenous scripts, etc.), with script‑specific visual RPN families but the same set‑theoretic pattern for atomic units*. The goal is to *support true multi‑language KR at the character level*, including the “invisible giants” of low‑resource and indigenous languages that current tokenization‑based LLMs systematically underserve: *each writing system gets explicit, executable atoms for form and meaning*, rather than being squeezed through an English‑centric tokenizer.

To make it easier to explore these connections, I’ve also assembled a public NotebookLM workspace that aggregates the main public AI‑KR web sources (AI‑KR wiki and reports, related KR/NSI papers, ISO/IEC 22989 material, Green AI work, StratML and the K3D repo link):

https://notebooklm.google.com/notebook/80d00386-4b7d-4893-ae84-1c5f90c223de

The notebook on the group work includes automatically generated mind maps, quizzes, a video overview, an audio/podcast‑style overview, summary reports, and a central chat window where you can discuss the collected sources with a Gemini model. It’s intended purely as a shared research aid, not an official document. If Paola, Carl, or any other CG participant would like editor access to extend or correct it (e.g., by adding more vocab drafts or references), I’m happy to add you.
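
To make the construction A = (c, f, m, e) described earlier a little more concrete, here is a minimal Python sketch. It is not the K3D implementation: the class name AtomicUnit, the helper functions, the opcode name E, the GLYPH_* labels and the 4‑dimensional placeholder embeddings are all hypothetical, and the visual program is only a stand‑in label rather than something that renders on GPU. Only the opcode names ADD and POW are taken from the description above.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Tuple, Union

Token = Union[float, str]


@dataclass(frozen=True)
class AtomicUnit:
    """One atomic unit A = (c, f, m, e); the field layout is illustrative only."""
    c: str                 # character, c in Sigma
    f: Tuple[str, ...]     # visual RPN program (placeholder labels, nothing is rendered)
    m: Tuple[str, ...]     # execution RPN program
    e: Tuple[float, ...]   # embedding in R^D (placeholder zeros, D = 4 assumed)

    def is_dual_program(self) -> bool:
        # A "star" carries both a visual and an execution program.
        return bool(self.f) and bool(self.m)


# Hypothetical opcode tables for the execution domain M.
CONSTANTS: Dict[str, float] = {"E": math.e}
BINARY_OPS = {"ADD": lambda a, b: a + b, "POW": lambda a, b: a ** b}

ZERO = (0.0, 0.0, 0.0, 0.0)
UNITS: Dict[str, AtomicUnit] = {
    "e": AtomicUnit("e", ("GLYPH_E",), ("E",), ZERO),
    "+": AtomicUnit("+", ("GLYPH_PLUS",), ("ADD",), ZERO),
    "^": AtomicUnit("^", ("GLYPH_CARET",), ("POW",), ZERO),
}


def compile_postfix(symbols: List[Token]) -> List[Token]:
    """Expand a postfix sequence of characters and numbers into an execution program."""
    program: List[Token] = []
    for s in symbols:
        if isinstance(s, (int, float)):
            program.append(float(s))
        else:
            program.extend(UNITS[s].m)  # look up the unit, reuse its m program
    return program


def run_rpn(program: List[Token]) -> float:
    """Evaluate an execution RPN program on a simple value stack."""
    stack: List[float] = []
    for tok in program:
        if isinstance(tok, (int, float)):
            stack.append(float(tok))
        elif tok in CONSTANTS:
            stack.append(CONSTANTS[tok])
        elif tok in BINARY_OPS:
            if len(stack) < 2:
                raise ValueError(f"stack underflow at {tok!r}")
            b = stack.pop()
            a = stack.pop()
            stack.append(BINARY_OPS[tok](a, b))
        else:
            raise ValueError(f"unknown opcode {tok!r}")
    if len(stack) != 1:
        raise ValueError("program did not reduce to a single value")
    return stack[0]


# "e 2 ^" in postfix: push Euler's number, push 2, apply POW.
result = run_rpn(compile_postfix(["e", 2, "^"]))
```

In this sketch the lookup returns the whole composite unit, so the form program f and the meaning program m stay in separate domains and are only joined by the unit that holds them.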

This is still very early and deliberately narrow in scope (one small domain of discourse), but I hope it’s a useful concrete example in the space you’re both describing:

- Milton’s requirement for constructible atomic elements and domains of discourse;
- Dave’s emphasis on structured but not purely formal KR, where plausible reasoning layers (PKN‑style) can sit on top of explicit, machine‑readable foundations.

If anyone is interested in the implementation details, I’m happy to take that to a separate thread or offline so we don’t overload this one.

Best regards,
Daniel

Received on Wednesday, 19 November 2025 17:23:21 UTC