- From: Filip Kolarik <filip26@gmail.com>
- Date: Wed, 4 Feb 2026 00:52:20 +0100
- To: Melvin Carvalho <melvincarvalho@gmail.com>
- Cc: Christopher Allen <ChristopherA@lifewithalacrity.com>, Credentials Community Group <public-credentials@w3.org>, Wolf McNally <wolf@wolfmcnally.com>, Shannon Appelcline <shannon.appelcline@gmail.com>
- Message-ID: <CADRK2_O6SAwhK4m+iuTc3vfkLvsW-K0OBcURaYgxR-fEfn2GqA@mail.gmail.com>
Hi,

CBOR-LD [1] uses shared dictionaries to improve compression ratios. These dictionaries can be generated from JSON-LD contexts or provided externally, and this information is encoded in the final CBOR-LD output.

Why maintain a fixed registry for compact identifiers? Are these compact identifiers, which represent ontological terms, intended to be used standalone, or embedded within a larger representation? In the second case, a fixed registry may be unnecessary; it could be replaced with a dereferenceable "context" that maps terms to integers.

Best regards,
Filip
https://www.linkedin.com/in/filipkolarik/

[1] https://github.com/filip26/iridium-cbor-ld

On Wed, Feb 4, 2026 at 12:36 AM Melvin Carvalho <melvincarvalho@gmail.com> wrote:

>
> On Tue, Feb 3, 2026 at 23:51, Christopher Allen <ChristopherA@lifewithalacrity.com> wrote:
>
>> *TL;DR:* I'm seeking CCG community input on a compact identifier
>> registry we've developed at Blockchain Commons. Our Known Values Registry
>> (BCR-2023-002) maps ontological concepts — predicates, classes, properties
>> — to 64-bit integers, providing a compact binary representation while
>> preserving semantic meaning.
>>
>> We've already mapped several vocabularies this community uses (RDF, RDFS,
>> Dublin Core, FOAF, SKOS, Verifiable Credentials, Schema.org), and we're
>> developing new schemas for areas like principal authority, signature
>> context, and peer endorsements.
>>
>> *Three questions for the community:*
>>
>> - Are there other ontologies or vocabularies CCG uses that we should
>> prioritize mapping?
>>
>> - Would schemas for principal authority (who directed vs who performed),
>> signature context (the capacity in which someone signs), and peer
>> endorsements be useful for VC implementations?
>>
>> - Is anyone working on similar compact-identifier approaches?
>>
>> Here's the detail on what we've built:
>>
>> *The Known Values Registry*
>>
>> BCR-2023-002 defines a namespace of 64-bit unsigned integers representing
>> ontological concepts — relationships, classes, properties, and enumerated
>> values. Each integer maps to a canonical name and equivalent URIs.
>>
>> https://github.com/BlockchainCommons/Research/blob/master/papers/bcr-2023-002-known-value.md
>>
>> We needed compact binary representation and deterministic encoding, but
>> the registry itself is independent of any particular encoding. While we
>> serialize these as CBOR (#6.40000) for use with Gordian Envelope, the
>> codepoint-to-concept mappings stand alone and can be used in any format or
>> protocol.
>>
>> For example, rdf:type (codepoint 1) encodes as:
>>
>> CBOR diagnostic: 40000(1)
>> Bytes: d9 9c 40 01 (4 bytes)
>>
>> Compare that to the 47-byte URI
>> "http://www.w3.org/1999/02/22-rdf-syntax-ns#type". For documents with
>> many predicates, this adds up.
>>
>> *What's Already Mapped*
>>
>> We've assigned codepoints for several vocabularies this community uses:
>>
>> - RDF (2000-2049): 21 entries
>> - RDFS (2050-2099): 15 entries
>> - OWL 2 (2100-2199): 75 entries
>> - Dublin Core Elements (2200-2299): 15 entries
>> - Dublin Core Terms (2300-2499): 89 entries
>> - FOAF (2500-2699): 75 entries
>> - SKOS (2700-2799): 32 entries
>> - Solid (2800-2899): 33 entries
>> - W3C Verifiable Credentials (2900-2999): 28 entries
>> - GS1 Web Vocabulary (3000-3999): 609 entries
>> - Schema.org (10000-19999): 2450 entries
>>
>> These are 1:1 mappings — the Known Value codepoints reference the
>> canonical URIs from each ontology.
>>
>> *Emerging Schemas*
>>
>> We're also developing predicates for areas where we haven't found
>> existing schemas to leverage.
>> These are currently in community review (see
>> the current PRs in the repository):
>>
>> - Principal Authority — predicates for expressing who directed a work vs
>> who performed it (e.g., human holds principalAuthority over AI-generated
>> content)
>>
>> - Signature Context — the capacity in which someone signs (e.g., CFO
>> signs onBehalfOf their corporation, not personally)
>>
>> - Fair Witness — neutral third-party observation attestations (e.g.,
>> notary attesting they observed a signature ceremony)
>>
>> - Peer Endorsement — skill and collaboration endorsements distinct from
>> formal credentials (e.g., colleague endorsing another's security expertise
>> based on project work)
>>
>> - CreativeWork Roles — contribution roles mapped to CRediT with ONIX,
>> MARC (e.g., distinguishing Author from Editor from Reviewer on a
>> collaborative work)
>>
>> I'm planning to make these available as schemas useable with JSON-LD and
>> other formats at https://assertions.info for those working outside
>> CBOR/Envelope contexts, if the W3C CCG community finds them useful.
>>
>> *Community Registry*
>>
>> Codepoints 100,000+ are open for community registration via automated
>> GitHub workflow — submit a JSON file, validation runs, and upon merge you
>> have registered codepoints. No gatekeeping beyond schema conformance and
>> uniqueness checks.
>>
>> *Resources*
>>
>> Full registry with JSON exports:
>>
>> https://github.com/BlockchainCommons/Research/tree/master/known-value-assignments
>>
>> We also presented on Known Values at our January Gordian Community
>> meeting:
>>
>> Video:
>> https://youtu.be/FiLNhx9BOuk?t=2658 (Known Value discussion starts
>> at 44:18)
>>
>> Transcript:
>>
>> https://developer.blockchaincommons.com/meetings/2026-01-gordian/transcript/#known-values-discussion
>>
>
> Mapping URIs to integers saves a few hundred bytes for typical documents,
> far less than general-purpose compression delivers for free, without
> requiring all implementations to remain synchronized against a centrally
> managed registry. When systems inevitably drift, integer codepoints fail
> silently (the same number meaning different things), whereas URIs fail
> loudly. The proposed ontology work on principal authority and peer
> endorsement may have merit, but bundling it with a bespoke compression
> mechanism couples two unrelated design decisions and makes both harder to
> evaluate on their own terms.
>
>>
>> -- Christopher Allen
>> Blockchain Commons
>>
>
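
For reference, the byte counts quoted in the thread (4 bytes for 40000(1) versus 47 bytes for the rdf:type URI) can be checked with a short sketch. The Python below hand-encodes a CBOR tag followed by an unsigned integer using the shortest-form head rules of RFC 8949. The helper names cbor_head and encode_known_value are hypothetical, introduced here only for illustration; they are not taken from any Blockchain Commons or CBOR-LD library, and the sketch covers only tags and unsigned integers.

```python
KNOWN_VALUE_TAG = 40000  # CBOR tag number quoted in the thread (#6.40000)
RDF_TYPE_URI = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def cbor_head(major_type: int, value: int) -> bytes:
    """Encode a CBOR head (major type plus unsigned argument), shortest form."""
    if not 0 <= major_type <= 7:
        raise ValueError("major type must be in 0..7")
    if value < 0:
        raise ValueError("argument must be non-negative")
    prefix = major_type << 5
    if value < 24:
        # Small arguments are packed directly into the initial byte.
        return bytes([prefix | value])
    if value < 2**8:
        return bytes([prefix | 24, value])
    if value < 2**16:
        return bytes([prefix | 25]) + value.to_bytes(2, "big")
    if value < 2**32:
        return bytes([prefix | 26]) + value.to_bytes(4, "big")
    if value < 2**64:
        return bytes([prefix | 27]) + value.to_bytes(8, "big")
    raise ValueError("argument does not fit in 64 bits")


def encode_known_value(codepoint: int) -> bytes:
    """Hypothetical helper: tag 40000 (major type 6) wrapping an unsigned int (major type 0)."""
    return cbor_head(6, KNOWN_VALUE_TAG) + cbor_head(0, codepoint)


encoded = encode_known_value(1)
print(encoded.hex(" "))                         # d9 9c 40 01
print(len(encoded))                             # 4
print(len(RDF_TYPE_URI.encode("utf-8")))        # 47
```

Note that the per-term overhead grows with the codepoint: under the same rules, a codepoint in the Schema.org range (10000-19999) needs a 3-byte integer, so the tagged value takes 6 bytes rather than 4.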
Received on Tuesday, 3 February 2026 23:52:36 UTC