Re: CBOR-LD musings

I’ve been thinking about the use of pre-computed dictionaries in a variety of cases recently, whether in traditional compression scenarios (such as Brotli) or specialized ones like CBOR-LD. While there is no question that such an approach can significantly improve compression, it also makes the output data less future-proof, unless you expect that the dictionary never changes (or only changes in a compatible manner).

I work on formats/technologies that are designed to be around for anywhere from decades to hundreds of years (or more), including PDF and C2PA. That means that it needs to be possible for files produced today to be decoded far into the future, which means that the dictionary has to be “fixed” for any given version and can’t differ from implementation to implementation. It also means that any future dictionary has to be backwards compatible with all previous versions; that is, you can’t change the order/value of any dictionary element, you can only add.
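
The append-only constraint above can be sketched as a small check. This is a minimal illustration, assuming a dictionary is an ordered list whose positions serve as the codes; the function names and the sample entries are hypothetical and are not taken from CBOR-LD, Brotli, PDF, or C2PA.

```python
from typing import List, Sequence


def is_backwards_compatible(old: Sequence[str], new: Sequence[str]) -> bool:
    """Return True if `new` only appends entries to `old`.

    An entry's code is its position in the list, so every existing
    entry must keep exactly the same index in the newer version.
    """
    return len(new) >= len(old) and list(new[:len(old)]) == list(old)


def decode(codes: Sequence[int], dictionary: Sequence[str]) -> List[str]:
    """Map integer codes back to their dictionary entries."""
    return [dictionary[c] for c in codes]


# Hypothetical dictionary versions; the entries are illustrative only.
V1 = ["@context", "type", "issuer"]
V2_APPENDED = V1 + ["validFrom"]                            # only adds: compatible
V2_REORDERED = ["type", "@context", "issuer", "validFrom"]  # moves existing entries
```

With these assumptions, a file encoded against `V1` decodes identically under `V2_APPENDED`, whereas `V2_REORDERED` silently changes the meaning of codes already written into old files.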

I hope that these considerations will be taken into account as part of the evaluation process for technology choices.

Leonard

From: Filip Kolarik <filip26@gmail.com>
Date: Tuesday, November 28, 2023 at 8:05 AM
To: Manu Sporny <msporny@digitalbazaar.com>
Cc: Gregg Kellogg <gregg@greggkellogg.net>, JSON for Linking Data Community Group <public-linked-json@w3.org>, JSON-LD Working Group <public-json-ld-wg@w3.org>
Subject: Re: CBOR-LD musings


Hi Gregg,
I've implemented CBOR-LD in Java in a configurable way, including a configuration for compatibility with DB's implementation.

The algorithm is quite wild, trying to squeeze out most of the JSON-LD syntactic sugar and deliver minimal compressed output. I guess the main motivation at the beginning was to get a small footprint that can be encoded as a QR code.

Basically, the algorithm tries to compress terms, types, and values. See the CBOR-LD DB configuration here: <https://github.com/filip26/iridium-cbor-ld/blob/main/src/main/java/com/apicatalog/cborld/db/DbConfig.java>.
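
To give a rough feel for the term-compression part only, here is a toy sketch. It is not the CBOR-LD algorithm and not the API of any implementation mentioned in this thread: the term table, the codes, and the function names are all hypothetical, and it only replaces map keys, leaving types and values alone.

```python
from typing import Any, Dict

# Hypothetical term dictionary; a real one would be derived from the
# JSON-LD contexts rather than written by hand.
TERMS: Dict[str, int] = {"@context": 0, "type": 1, "issuer": 2, "credentialSubject": 3}
REVERSE: Dict[int, str] = {code: term for term, code in TERMS.items()}


def compress(node: Any) -> Any:
    """Recursively replace known map keys with integer codes."""
    if isinstance(node, dict):
        return {TERMS.get(k, k): compress(v) for k, v in node.items()}
    if isinstance(node, list):
        return [compress(v) for v in node]
    return node  # scalar values are left untouched in this sketch


def decompress(node: Any) -> Any:
    """Inverse of compress(): map integer keys back to their terms."""
    if isinstance(node, dict):
        return {REVERSE.get(k, k): decompress(v) for k, v in node.items()}
    if isinstance(node, list):
        return [decompress(v) for v in node]
    return node


doc = {
    "type": "VerifiableCredential",
    "issuer": "did:example:123",
    "credentialSubject": {"type": "Person", "name": "Alice"},
}
compressed = compress(doc)
```

Keys missing from the table (such as `name` here) pass through unchanged, and decoding only works if both sides hold the same table.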

I would be happy to share my experience with you. By the way, the currently proposed algorithm could be improved even further.

The only thing I'm not sure about is whether to make it part of the JSON-LD algorithms. It's quite a complex implementation that could be generalized to use different algorithms and different strategies.

Best,
Filip


On Tue, Nov 28, 2023 at 1:05 AM Manu Sporny <msporny@digitalbazaar.com> wrote:
On Mon, Nov 27, 2023 at 6:29 PM Gregg Kellogg <gregg@greggkellogg.net> wrote:
> It would be great to have a discussion on DB’s CBOR-LD spec to understand what it is trying to do.

Sure, and happy to have that discussion (and share the production
deployments we have today for CBOR-LD). The spec is in a fairly
outdated state due to all the other spec work we've been doing that is
under more immediate time pressures. Your commentary on what's missing
today is accurate and we'd like to fix that (and haven't been able to
given all the other time pressures and customer demands).

That said, the CBOR-LD implementation here is in production usage (for
the TruAge program, which is also used by the State of California in
their DMV app):

https://github.com/digitalbazaar/cborld


... and we have a very long and outstanding set of issues to update
the CBOR-LD spec to bring it in line with the CBOR-LD implementation we
have and then transfer ownership over to the JSON-LD WG (if there is
consensus to take the work on).

-- manu

--
Manu Sporny - https://www.linkedin.com/in/manusporny/

Founder/CEO - Digital Bazaar, Inc.
https://www.digitalbazaar.com/

Received on Tuesday, 28 November 2023 14:53:03 UTC