Re: Blockchain Commons Known Values Registry: Compact Integer Identifiers for Ontological Concepts

Hi,
CBOR-LD [1] uses shared dictionaries to improve compression ratios. These
dictionaries can be generated from JSON-LD contexts or provided externally,
and this information is encoded in the final CBOR-LD output.

Why maintain a fixed registry for compact identifiers? Are these compact
identifiers, which represent ontological terms, intended to be used
standalone, or embedded within a larger representation? In the latter case,
a fixed registry may be unnecessary; a dereferenceable "context" that maps
terms to integers could replace it.
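
To make the context-based option concrete, here is a minimal sketch in Python. It is not the CBOR-LD algorithm, only an illustration of the idea: the context URL, the integer assignments, and all function names are made up, and the `resolve` function stands in for an HTTP fetch.

```python
from typing import Callable, Dict

# Hypothetical context document. In practice it would be fetched from
# CONTEXT_URL; the URL and the integer assignments are placeholders.
CONTEXT_URL = "https://example.org/contexts/v1"
CONTEXT: Dict[str, int] = {
    "http://www.w3.org/1999/02/22-rdf-syntax-ns#type": 1,
    "http://purl.org/dc/terms/title": 2,
    "http://xmlns.com/foaf/0.1/name": 3,
}


def compress(doc: dict, context_url: str, context: Dict[str, int]) -> dict:
    """Replace known term URIs with integers and record which context was used."""
    body = {context.get(term, term): value for term, value in doc.items()}
    return {"context": context_url, "body": body}


def expand(packed: dict, resolve: Callable[[str], Dict[str, int]]) -> dict:
    """Dereference the recorded context and map integers back to term URIs."""
    context = resolve(packed["context"])
    reverse = {code: term for term, code in context.items()}
    body = {}
    for key, value in packed["body"].items():
        # An integer missing from the context raises KeyError instead of
        # being silently mapped to something else.
        body[reverse[key] if isinstance(key, int) else key] = value
    return body


def resolve(url: str) -> Dict[str, int]:
    """Stand-in for fetching the context document over HTTP."""
    if url != CONTEXT_URL:
        raise LookupError(f"unknown context: {url}")
    return CONTEXT


doc = {
    "http://purl.org/dc/terms/title": "Known Values",
    "https://example.org/custom#note": "unmapped terms pass through",
}
packed = compress(doc, CONTEXT_URL, CONTEXT)
restored = expand(packed, resolve)
```

Because the packed form names its context, a receiver needs no prior agreement on a registry, only the ability to dereference that one URL.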

Best regards,
Filip
https://www.linkedin.com/in/filipkolarik/

[1] https://github.com/filip26/iridium-cbor-ld



On Wed, Feb 4, 2026 at 12:36 AM Melvin Carvalho <melvincarvalho@gmail.com>
wrote:

>
>
> On Tue, Feb 3, 2026 at 23:51 Christopher Allen <
> ChristopherA@lifewithalacrity.com> wrote:
>
>> *TL;DR:* I'm seeking CCG community input on a compact identifier
>> registry we've developed at Blockchain Commons. Our Known Values Registry
>> (BCR-2023-002) maps ontological concepts — predicates, classes, properties
>> — to 64-bit integers, providing a compact binary representation while
>> preserving semantic meaning.
>>
>> We've already mapped several vocabularies this community uses (RDF, RDFS,
>> Dublin Core, FOAF, SKOS, Verifiable Credentials, Schema.org), and we're
>> developing new schemas for areas like principal authority, signature
>> context, and peer endorsements.
>>
>> *Three questions for the community:*
>>
>> - Are there other ontologies or vocabularies CCG uses that we should
>> prioritize mapping?
>>
>> - Would schemas for principal authority (who directed vs who performed),
>> signature context (the capacity in which someone signs), and peer
>> endorsements be useful for VC implementations?
>>
>> - Is anyone working on similar compact-identifier approaches?
>>
>> Here's the detail on what we've built:
>>
>> *The Known Values Registry*
>>
>> BCR-2023-002 defines a namespace of 64-bit unsigned integers representing
>> ontological concepts — relationships, classes, properties, and enumerated
>> values. Each integer maps to a canonical name and equivalent URIs.
>>
>>
>> https://github.com/BlockchainCommons/Research/blob/master/papers/bcr-2023-002-known-value.md
>>
>> We needed compact binary representation and deterministic encoding, but
>> the registry itself is independent of any particular encoding. While we
>> serialize these as CBOR (#6.40000) for use with Gordian Envelope, the
>> codepoint-to-concept mappings stand alone and can be used in any format or
>> protocol.
>>
>> For example, rdf:type (codepoint 1) encodes as:
>>
>>     CBOR diagnostic: 40000(1)
>>     Bytes: d9 9c 40 01  (4 bytes)
>>
>> Compare that to the 47-byte URI
>> "http://www.w3.org/1999/02/22-rdf-syntax-ns#type". For documents with
>> many predicates, this adds up.
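
The byte counts quoted here can be reproduced with a few lines of Python. This is a minimal sketch that relies only on the generic CBOR head encoding from RFC 8949 and on tag 40000 as given in the message; the function names are mine, not from BCR-2023-002.

```python
import struct

KNOWN_VALUE_TAG = 40000  # tag number given in the quoted message
RDF_TYPE_URI = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def encode_head(major_type: int, value: int) -> bytes:
    """Encode a CBOR head (major type plus unsigned argument) in shortest form."""
    if not 0 <= value < 1 << 64:
        raise ValueError("argument must fit in 64 bits")
    initial = major_type << 5
    if value < 24:
        return bytes([initial | value])
    if value < 1 << 8:
        return bytes([initial | 24, value])
    if value < 1 << 16:
        return bytes([initial | 25]) + struct.pack(">H", value)
    if value < 1 << 32:
        return bytes([initial | 26]) + struct.pack(">I", value)
    return bytes([initial | 27]) + struct.pack(">Q", value)


def encode_known_value(codepoint: int) -> bytes:
    """Tag (major type 6) followed by an unsigned integer (major type 0)."""
    return encode_head(6, KNOWN_VALUE_TAG) + encode_head(0, codepoint)


print(encode_known_value(1).hex(" "))     # d9 9c 40 01
print(len(RDF_TYPE_URI.encode("utf-8")))  # 47
```

Note that the size grows with the codepoint: a value in the 10000-19999 range needs a two-byte argument, so it encodes in 6 bytes rather than 4.
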
>>
>> *What's Already Mapped*
>>
>> We've assigned codepoints for several vocabularies this community uses:
>>
>> - RDF (2000-2049): 21 entries
>> - RDFS (2050-2099): 15 entries
>> - OWL 2 (2100-2199): 75 entries
>> - Dublin Core Elements (2200-2299): 15 entries
>> - Dublin Core Terms (2300-2499): 89 entries
>> - FOAF (2500-2699): 75 entries
>> - SKOS (2700-2799): 32 entries
>> - Solid (2800-2899): 33 entries
>> - W3C Verifiable Credentials (2900-2999): 28 entries
>> - GS1 Web Vocabulary (3000-3999): 609 entries
>> - Schema.org (10000-19999): 2450 entries
>>
>> These are 1:1 mappings — the Known Value codepoints reference the
>> canonical URIs from each ontology.
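
For anyone wanting to check which vocabulary a codepoint belongs to, the ranges listed above can be turned into a small lookup. A minimal sketch: the range boundaries are copied from the list, while the table layout and function name are my own and not part of the registry tooling.

```python
from typing import Optional

# (first, last, vocabulary) ranges copied from the quoted list.
RANGES = [
    (2000, 2049, "RDF"),
    (2050, 2099, "RDFS"),
    (2100, 2199, "OWL 2"),
    (2200, 2299, "Dublin Core Elements"),
    (2300, 2499, "Dublin Core Terms"),
    (2500, 2699, "FOAF"),
    (2700, 2799, "SKOS"),
    (2800, 2899, "Solid"),
    (2900, 2999, "W3C Verifiable Credentials"),
    (3000, 3999, "GS1 Web Vocabulary"),
    (10000, 19999, "Schema.org"),
]


def vocabulary_for(codepoint: int) -> Optional[str]:
    """Return the vocabulary whose range contains the codepoint, or None."""
    for first, last, name in RANGES:
        if first <= codepoint <= last:
            return name
    return None


print(vocabulary_for(2050))   # RDFS
print(vocabulary_for(12345))  # Schema.org
print(vocabulary_for(5000))   # None
```
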
>>
>> *Emerging Schemas*
>>
>> We're also developing predicates for areas where we haven't found
>> existing schemas to leverage. These are currently in community review (see
>> the current PRs in the repository):
>>
>> - Principal Authority — predicates for expressing who directed a work vs
>> who performed it (e.g., human holds principalAuthority over AI-generated
>> content)
>>
>> - Signature Context — the capacity in which someone signs (e.g., CFO
>> signs onBehalfOf their corporation, not personally)
>>
>> - Fair Witness — neutral third-party observation attestations (e.g.,
>> notary attesting they observed a signature ceremony)
>>
>> - Peer Endorsement — skill and collaboration endorsements distinct from
>> formal credentials (e.g., colleague endorsing another's security expertise
>> based on project work)
>>
>> - CreativeWork Roles — contribution roles mapped to CRediT with ONIX,
>> MARC (e.g., distinguishing Author from Editor from Reviewer on a
>> collaborative work)
>>
>> I'm planning to make these available as schemas useable with JSON-LD and
>> other formats at https://assertions.info for those working outside
>> CBOR/Envelope contexts, if the W3C CCG community finds them useful.
>>
>> *Community Registry*
>>
>> Codepoints 100,000+ are open for community registration via automated
>> GitHub workflow — submit a JSON file, validation runs, and upon merge you
>> have registered codepoints. No gatekeeping beyond schema conformance and
>> uniqueness checks.
>>
>> *Resources*
>>
>> Full registry with JSON exports:
>>
>>
>> https://github.com/BlockchainCommons/Research/tree/master/known-value-assignments
>>
>> We also presented on Known Values at our January Gordian Community
>> meeting:
>>
>> Video:
>> https://youtu.be/FiLNhx9BOuk?t=2658 (Known Value discussion starts at
>> 44:18)
>>
>> Transcript:
>>
>> https://developer.blockchaincommons.com/meetings/2026-01-gordian/transcript/#known-values-discussion
>>
>
> Mapping URIs to integers saves a few hundred bytes for typical documents,
> far less than general-purpose compression delivers for free, without
> requiring all implementations to remain synchronized against a centrally
> managed registry. When systems inevitably drift, integer codepoints fail
> silently (the same number meaning different things), whereas URIs fail
> loudly. The proposed ontology work on principal authority and peer
> endorsement may have merit, but bundling it with a bespoke compression
> mechanism couples two unrelated design decisions and makes both harder to
> evaluate on their own terms.
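
The silent-versus-loud failure argument can be shown with a toy example. This is a sketch under invented assumptions: both registry snapshots, the codepoint, and the example.org URIs are hypothetical and do not come from the Known Values registry.

```python
# Two snapshots of a registry that have drifted apart (hypothetical data).
SENDER_REGISTRY = {100000: "https://example.org/vocab#endorses"}
RECEIVER_REGISTRY = {100000: "https://example.org/vocab#revokes"}


def decode_codepoint(registry: dict, codepoint: int) -> str:
    """Look up an integer codepoint; drift is invisible at this layer."""
    return registry[codepoint]


def accept_uri(known_uris: set, uri: str) -> str:
    """Accept a URI only if the receiver recognises it."""
    if uri not in known_uris:
        raise LookupError(f"unknown term: {uri}")
    return uri


# Integer on the wire: the receiver decodes without error, but to a
# different meaning than the sender intended.
sent_meaning = decode_codepoint(SENDER_REGISTRY, 100000)
received_meaning = decode_codepoint(RECEIVER_REGISTRY, 100000)
print(sent_meaning == received_meaning)  # False

# URI on the wire: the mismatch surfaces as an explicit error.
try:
    accept_uri(set(RECEIVER_REGISTRY.values()), sent_meaning)
except LookupError as err:
    print(err)  # unknown term: https://example.org/vocab#endorses
```
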
>
>
>>
>>
>> -- Christopher Allen
>>    Blockchain Commons
>>
>

Received on Tuesday, 3 February 2026 23:52:36 UTC