Kristals v3 spec (zip) — verifiable offline knowledge packs (Wikidata/RDF evolution) + v4 upgrade before integration

Hello,

I’m sending the *Kristals v3 spec bundle* (attached .zip). Kristals are a
practical evolution of the *RDF/Wikibase/Wikidata* model into a modern
distribution artifact: *verifiable, content-addressed knowledge packs* that
run offline and stay reproducible across toolchains.

What Kristals are

A *Kristal* is a compiled knowledge unit (not a document, not free text).
Each release produces:

   - *Kristal Exchange* — canonical, auditable “source of truth” for
     validated statements (Wikibase-shaped: QIDs/PIDs, typed values,
     qualifiers, references/evidence)
   - *Kristal Runtime Pack* — derived, *offline-executable* indexed form
     (predictable constrained queries; no SPARQL endpoint, no network
     dependency, no LLM dependency)

Pipeline boundary (strict):
*Claim-IR (schema proposals with uncertainty + evidence) → resolution →
deterministic validation (“no compile on fail”) → Exchange → Runtime Pack →
deterministic rendering (no new facts).*
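
The boundary above can be sketched minimally in Python. This is illustrative only: the names `Claim`, `validate`, and `compile_exchange`, and the specific checks, are assumptions for the sketch, not identifiers from the spec.

```python
# Toy sketch of the strict pipeline boundary: Claim-IR -> deterministic
# validation ("no compile on fail") -> Exchange. Names and checks are
# illustrative, not normative.
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    subject: str       # e.g. a QID
    predicate: str     # e.g. a PID
    value: object
    confidence: float  # uncertainty is first-class
    evidence: tuple    # references backing the claim


def validate(claim: Claim) -> list[str]:
    """Deterministic checks; returns a list of violations (empty = valid)."""
    errors = []
    if not claim.evidence:
        errors.append("missing evidence")
    if not (0.0 <= claim.confidence <= 1.0):
        errors.append("confidence out of range")
    return errors


def compile_exchange(claims: list[Claim]) -> list[Claim]:
    """Fail closed: any invalid claim aborts compilation of the whole unit."""
    problems = [(c.subject, errs) for c in claims if (errs := validate(c))]
    if problems:
        raise ValueError(f"no compile on fail: {problems}")
    # Deterministic ordering so the Exchange artifact is reproducible.
    return sorted(claims, key=lambda c: (c.subject, c.predicate))
```
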

Why this matters (AI + systems)

Kristals are designed as a high-signal substrate for AI and data systems:

   - strict schema boundaries (no “free text becomes truth”)
   - evidence + uncertainty are first-class
   - stable IDs enable dataset versioning and reproducible experiments
   - offline packs enable low-latency retrieval and edge deployments

What v3 locks in

   - *Normative canonical JSON*: JCS (RFC 8785) for portable content
     addressing
   - *Fail-closed integrity*: declared hashes/signatures must verify or
     consumers hard-fail
   - *Reproducible runtime packs*: portable, recorded policy selections
     (ordering, row-groups, bitmap conventions, membership filters)
   - Optional profiles: *JSON-LD / RDF exports*, *RDFC integrity* (limits +
     CI gating), *PROV-O/nanopubs*, *SHACL/ShEx*, *TPF-like pagination*
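
A rough sketch of the first two points. Caveat: Python's `json.dumps` with sorted keys and compact separators coincides with JCS (RFC 8785) only for a restricted input subset (ASCII keys, integers, no tricky float serialization); a real implementation should use a dedicated JCS library. Function names here are illustrative.

```python
# Hedged sketch: content addressing over canonical JSON, with fail-closed
# verification. json.dumps approximates JCS only for simple inputs.
import hashlib
import json


def canonical_bytes(obj) -> bytes:
    return json.dumps(obj, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False).encode("utf-8")


def content_address(obj) -> str:
    return "sha256:" + hashlib.sha256(canonical_bytes(obj)).hexdigest()


def verify_or_fail(obj, declared: str) -> None:
    """Fail-closed: a hash mismatch is a hard error, never a warning."""
    actual = content_address(obj)
    if actual != declared:
        raise ValueError(f"integrity failure: {actual} != {declared}")


stmt = {"qid": "Q42", "pid": "P31", "value": "Q5"}
addr = content_address(stmt)
verify_or_fail(stmt, addr)  # passes; any mutation of stmt would hard-fail
```
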

Where this is integrated first (production)

Kristals are being integrated across *Konnaxion × Orgo × Architect ×
SenTient*:

   - *Orgo* orchestrates ingest → extract → resolve → validate → publish;
     audits + distribution status
   - *SenTient* reconciles surfaces → ranked QIDs/PIDs; normalizes values;
     preserves ambiguity
   - *Konnaxion* distributes Runtime Packs for offline search/navigation and
     low-bandwidth UX
   - *Architect* renders deterministic multilingual text from validated
     knowledge with full traceability

What I want from you (v4 upgrade before I freeze production)

I’m collecting technical review and support to upgrade this into *v4*
before integration is frozen.

I want direct judgment on:

   1. Is the *normative core* tight and unambiguous enough for
      interoperability?
   2. Do the reproducibility rules avoid “rebuildable but incomparable”
      packs?
   3. Is the offline query surface (TPF-like pagination profile) correctly
      scoped and stable?
   4. Are there non-obvious pitfalls in canonicalization/hashing/signing and
      deterministic Parquet/index construction?

------------------------------

Recipient-specific notes

Daniel Lemire

Your work on Roaring bitmaps and membership structures maps directly onto
runtime-pack indexing. I want your judgment on portable defaults, which
parameters must be recorded for reproducibility, and comparability across
implementations.
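
For comparability, the intent is that every index policy parameter is recorded in the manifest and covered by a digest. A small sketch under assumed parameter names (`run_containers`, `binary-fuse`, `bits_per_key` are placeholders, not spec values):

```python
# Illustrative sketch: record all runtime-pack index policy parameters and
# digest them, so two independent builds are comparable iff they honored
# the same recorded policy. Parameter names are assumptions.
import hashlib
import json

index_policy = {
    "bitmap": {"format": "roaring", "run_containers": True},
    "membership_filter": {"kind": "binary-fuse", "bits_per_key": 9},
    "ordering": ["subject", "predicate", "object"],
}


def policy_digest(policy: dict) -> str:
    blob = json.dumps(policy, sort_keys=True,
                      separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()
```
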

Ruben Verborgh

I’m implementing a constrained offline query surface inspired by TPF
(cursor paging, stable ordering, cache-friendly responses). I want your
assessment of cursor semantics and the boundaries needed to keep it
composable without drifting into SPARQL semantics.
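
The semantics I have in mind can be sketched roughly as follows (assumptions: triples in one fixed total order; an opaque cursor encoding a position in that order; `None` as the wildcard). This is a toy model, not the pack format.

```python
# Sketch of TPF-style constrained paging: pattern matching with wildcards
# over a stably ordered triple list, opaque base64 cursor = resume offset.
import base64
import json

TRIPLES = sorted([
    ("Q42", "P31", "Q5"),
    ("Q42", "P106", "Q36180"),
    ("Q1", "P31", "Q36906466"),
])  # stable total order => reproducible, cache-friendly pages


def page(pattern, cursor=None, limit=2):
    """Match an (s, p, o) pattern (None = wildcard); return (page, cursor)."""
    start = json.loads(base64.urlsafe_b64decode(cursor)) if cursor else 0
    matches = [t for t in TRIPLES
               if all(q is None or q == v for q, v in zip(pattern, t))]
    chunk = matches[start:start + limit]
    nxt = (base64.urlsafe_b64encode(json.dumps(start + limit).encode()).decode()
           if start + limit < len(matches) else None)
    return chunk, nxt
```
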

HDT authors (Fernández / Martínez-Prieto / Gutiérrez / Polleres / Arias)

This targets the same objective as HDT—compact, distributable, queryable
RDF-class knowledge—while adding a reproducible “pack + manifest” layer and
offline execution constraints. I want your critique on where this should
converge with HDT ideas vs where divergence is correct.

W3C lists (JSON-LD / SHACL / RCH / PROV)

I’m using these specs as explicit profiles (not core requirements). I want
feedback on profile boundaries (what is covered/hashed), conformance
language, and practical resource limits—especially for RDFC.

Google / Microsoft research routing

Please route this to the relevant teams (knowledge graphs, data management,
verifiable data, offline/edge search). I’m seeking technical review and v4
upgrade input.

Wikimedia Research / contacts

This is an operational evolution of Wikidata-class knowledge distribution:
verified, portable, offline packs. I want feedback on ecosystem fit,
model/interop concerns, and what would make this useful at scale.

Apache Parquet / Arrow lists

I’m constraining to a small enumerated set of output policies for
deterministic, comparable packs. I want concrete guidance on determinism
pitfalls (ordering/stats/encoding) and what must be recorded to make
rebuilds and verification rigorous.
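
The shape of what I mean by an enumerated, recorded policy (writer options and field names below are illustrative; a real build would pass the equivalent fixed options to the Parquet writer rather than partition rows by hand):

```python
# Hedged sketch: make the output policy explicit, derive row ordering and
# row-group boundaries from it deterministically, and record a digest of
# policy + content so a rebuild is verifiable, not just "close enough".
import hashlib
import json

POLICY = {
    "sort_by": ["subject", "predicate"],
    "row_group_size": 2,
    "stats": "none",              # avoid writer-version-dependent statistics
    "dictionary_encoding": True,  # illustrative option name
}

rows = [{"subject": "Q42", "predicate": "P31"},
        {"subject": "Q1", "predicate": "P31"},
        {"subject": "Q42", "predicate": "P106"}]

ordered = sorted(rows, key=lambda r: [r[k] for k in POLICY["sort_by"]])
groups = [ordered[i:i + POLICY["row_group_size"]]
          for i in range(0, len(ordered), POLICY["row_group_size"])]

digest = hashlib.sha256(json.dumps([POLICY, groups],
                        sort_keys=True).encode("utf-8")).hexdigest()
```
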

Truly,
Réjean McCormick

Socio-Technical Architect
kOA
okido.wiki
https://github.com/Rejean-McCormick?tab=repositories

Received on Tuesday, 13 January 2026 11:23:42 UTC