Re: Blockchain Commons Known Values Registry: Compact Integer Identifiers for Ontological Concepts

On Tue, 3 Feb 2026 at 23:51, Christopher Allen <
ChristopherA@lifewithalacrity.com> wrote:

> *TL;DR:* I'm seeking CCG community input on a compact identifier registry
> we've developed at Blockchain Commons. Our Known Values Registry
> (BCR-2023-002) maps ontological concepts — predicates, classes, properties
> — to 64-bit integers, providing a compact binary representation while
> preserving semantic meaning.
>
> We've already mapped several vocabularies this community uses (RDF, RDFS,
> Dublin Core, FOAF, SKOS, Verifiable Credentials, Schema.org), and we're
> developing new schemas for areas like principal authority, signature
> context, and peer endorsements.
>
> *Three questions for the community:*
>
> - Are there other ontologies or vocabularies CCG uses that we should
> prioritize mapping?
>
> - Would schemas for principal authority (who directed vs who performed),
> signature context (the capacity in which someone signs), and peer
> endorsements be useful for VC implementations?
>
> - Is anyone working on similar compact-identifier approaches?
>
> Here's the detail on what we've built:
>
> *The Known Values Registry*
>
> BCR-2023-002 defines a namespace of 64-bit unsigned integers representing
> ontological concepts — relationships, classes, properties, and enumerated
> values. Each integer maps to a canonical name and equivalent URIs.
>
>
> https://github.com/BlockchainCommons/Research/blob/master/papers/bcr-2023-002-known-value.md
>
> We needed compact binary representation and deterministic encoding, but
> the registry itself is independent of any particular encoding. While we
> serialize these as CBOR (#6.40000) for use with Gordian Envelope, the
> codepoint-to-concept mappings stand alone and can be used in any format or
> protocol.
>
> For example, rdf:type (codepoint 1) encodes as:
>
>     CBOR diagnostic: 40000(1)
>     Bytes: d9 9c 40 01  (4 bytes)
>
> Compare that to the 47-byte URI
> "http://www.w3.org/1999/02/22-rdf-syntax-ns#type". For documents with many
> predicates, this adds up.
>
> *What's Already Mapped*
>
> We've assigned codepoints for several vocabularies this community uses:
>
> - RDF (2000-2049): 21 entries
> - RDFS (2050-2099): 15 entries
> - OWL 2 (2100-2199): 75 entries
> - Dublin Core Elements (2200-2299): 15 entries
> - Dublin Core Terms (2300-2499): 89 entries
> - FOAF (2500-2699): 75 entries
> - SKOS (2700-2799): 32 entries
> - Solid (2800-2899): 33 entries
> - W3C Verifiable Credentials (2900-2999): 28 entries
> - GS1 Web Vocabulary (3000-3999): 609 entries
> - Schema.org (10000-19999): 2450 entries
>
> These are 1:1 mappings — the Known Value codepoints reference the
> canonical URIs from each ontology.
>
> *Emerging Schemas*
>
> We're also developing predicates for areas where we haven't found existing
> schemas to leverage. These are currently in community review (see the
> current PRs in the repository):
>
> - Principal Authority — predicates for expressing who directed a work vs
> who performed it (e.g., human holds principalAuthority over AI-generated
> content)
>
> - Signature Context — the capacity in which someone signs (e.g., CFO signs
> onBehalfOf their corporation, not personally)
>
> - Fair Witness — neutral third-party observation attestations (e.g.,
> notary attesting they observed a signature ceremony)
>
> - Peer Endorsement — skill and collaboration endorsements distinct from
> formal credentials (e.g., colleague endorsing another's security expertise
> based on project work)
>
> - CreativeWork Roles — contribution roles mapped to CRediT with ONIX, MARC
> (e.g., distinguishing Author from Editor from Reviewer on a collaborative
> work)
>
> I'm planning to make these available as schemas useable with JSON-LD and
> other formats at https://assertions.info for those working outside
> CBOR/Envelope contexts, if the W3C CCG community finds them useful.
>
> *Community Registry*
>
> Codepoints 100,000+ are open for community registration via automated
> GitHub workflow — submit a JSON file, validation runs, and upon merge you
> have registered codepoints. No gatekeeping beyond schema conformance and
> uniqueness checks.
>
> *Resources*
>
> Full registry with JSON exports:
>
>
> https://github.com/BlockchainCommons/Research/tree/master/known-value-assignments
>
> We also presented on Known Values at our January Gordian Community meeting:
>
> Video:
> https://youtu.be/FiLNhx9BOuk?t=2658 (Known Value discussion starts at 44:18)
>
> Transcript:
>
> https://developer.blockchaincommons.com/meetings/2026-01-gordian/transcript/#known-values-discussion
>

Mapping URIs to integers saves a few hundred bytes in a typical document.
That is far less than general-purpose compression delivers for free, and
compression does not require every implementation to stay synchronized
with a centrally managed registry. When systems inevitably drift, integer
codepoints fail silently (the same number means different things to
different parties), whereas URIs fail loudly. The proposed ontology work on
principal authority and peer endorsement may have merit, but bundling it
with a bespoke compression mechanism couples two unrelated design decisions
and makes both harder to evaluate on their own terms.
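
For anyone who wants to check the byte counts in the quoted example, here is
a minimal Python sketch. It assumes only the standard CBOR head encoding
(RFC 8949) and the tag number 40000 given in the quoted message. The helper
names cbor_head and encode_known_value are mine, for illustration only, and
are not part of any Blockchain Commons library.

```python
import struct

KNOWN_VALUE_TAG = 40000  # CBOR tag number taken from the quoted message
RDF_TYPE_URI = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def cbor_head(major: int, value: int) -> bytes:
    """Encode a CBOR head (major type plus unsigned argument), shortest form."""
    if not 0 <= value < 1 << 64:
        raise ValueError("argument must fit in 64 unsigned bits")
    if value < 24:
        return bytes([(major << 5) | value])
    if value < 1 << 8:
        return bytes([(major << 5) | 24, value])
    if value < 1 << 16:
        return bytes([(major << 5) | 25]) + struct.pack(">H", value)
    if value < 1 << 32:
        return bytes([(major << 5) | 26]) + struct.pack(">I", value)
    return bytes([(major << 5) | 27]) + struct.pack(">Q", value)


def encode_known_value(codepoint: int) -> bytes:
    """Hypothetical helper: tag 40000 (major type 6) wrapping an unsigned int."""
    return cbor_head(6, KNOWN_VALUE_TAG) + cbor_head(0, codepoint)


encoded = encode_known_value(1)
print(encoded.hex(" "))                          # d9 9c 40 01
print(len(encoded))                              # 4
print(len(RDF_TYPE_URI.encode("utf-8")))         # 47
```

So the saving in this example is 43 bytes per predicate occurrence, before
any general-purpose compression is applied to either form.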

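To make the silent-versus-loud point concrete, here is a small hypothetical
sketch. The two registry snapshots, the codepoint 100001, the example.org
URIs, and the function names are all invented for illustration. Only the
predicate names principalAuthority and onBehalfOf come from the proposal.

```python
# Two hypothetical registry snapshots that have drifted apart: the same
# integer has been assigned to different concepts in each copy.
REGISTRY_A = {1: "rdf:type", 100001: "principalAuthority"}
REGISTRY_B = {1: "rdf:type", 100001: "onBehalfOf"}

# Hypothetical set of URI terms that one implementation understands.
KNOWN_URIS = {
    "http://www.w3.org/1999/02/22-rdf-syntax-ns#type",
    "https://example.org/vocab#principalAuthority",
}


def resolve_codepoint(registry: dict, codepoint: int) -> str:
    """Look up an integer codepoint; succeeds whenever the number is assigned."""
    return registry[codepoint]


def resolve_uri(known: set, uri: str) -> str:
    """Accept a URI term only if this implementation recognizes it."""
    if uri not in known:
        raise KeyError(f"unknown term: {uri}")
    return uri


# Silent failure: both lookups succeed but disagree about the meaning.
print(resolve_codepoint(REGISTRY_A, 100001))  # principalAuthority
print(resolve_codepoint(REGISTRY_B, 100001))  # onBehalfOf

# Loud failure: an unrecognized URI is rejected instead of being misread.
try:
    resolve_uri(KNOWN_URIS, "https://example.org/vocab#onBehalfOf")
except KeyError as err:
    print("rejected:", err)
```

The integer lookup has no way to notice the disagreement, because the number
carries no information about which registry snapshot it was minted against.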

>
>
> -- Christopher Allen
>    Blockchain Commons
>

Received on Tuesday, 3 February 2026 23:33:56 UTC