Re: CBOR-LD musings from Filip Kolarik on 2023-11-28 (public-linked-json@w3.org from November 2023)

From: Filip Kolarik <filip26@gmail.com>
Date: Tue, 28 Nov 2023 16:35:16 +0100
To: Leonard Rosenthol <lrosenth@adobe.com>
Cc: Manu Sporny <msporny@digitalbazaar.com>, Gregg Kellogg <gregg@greggkellogg.net>, JSON for Linking Data Community Group <public-linked-json@w3.org>, JSON-LD Working Group <public-json-ld-wg@w3.org>
Message-ID: <CADRK2_M+hTnWhceipV8BxhpLDhpYBGbtGdpNH3pT2dOqyV2K8w@mail.gmail.com>

On Tue, Nov 28, 2023 at 3:52 PM Leonard Rosenthol <lrosenth@adobe.com>
wrote:
*... cut off ...*

> I hope that these are considerations that would be considered as part of
> the evaluation process for technology choices.
>
>
The current algorithm creates a dictionary from the contexts (in processing
order) that are applied to, i.e. effective for, the document being
compressed, and sorts the terms. It is therefore immune to new terms being
added to a context and to changes in the order of the terms.
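
To illustrate the order-independence, here is a minimal Python sketch, not the
actual CBOR-LD code. It assumes each context is a plain term-to-IRI mapping and
that terms are sorted within each context; the function name, the start value
and the step are hypothetical and chosen only for this example.

```python
def build_term_dictionary(contexts, start=0, step=1):
    """Assign integer codes to terms from contexts given in processing order.

    `start` and `step` are hypothetical parameters, not values from any spec.
    """
    dictionary = {}
    next_code = start
    for context in contexts:
        # Sorting makes the codes independent of the key order in a context.
        for term in sorted(context):
            if term not in dictionary:
                dictionary[term] = next_code
                next_code += step
    return dictionary


# Two contexts with the same terms declared in a different order.
ctx_a = {"name": "https://schema.org/name", "age": "https://example.org/age"}
ctx_b = {"age": "https://example.org/age", "name": "https://schema.org/name"}

assert build_term_dictionary([ctx_a]) == build_term_dictionary([ctx_b])
```

Both calls yield the same term-to-code mapping, so reordering the terms inside
a context does not change the encoded output in this sketch.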

A referenced context is expected to be immutable. That's a design choice
that might work for those of us who do not plan for centuries ;) but there
could be more than one strategy for creating a dictionary, to suit
different needs.

On Tue, Nov 28, 2023 at 3:52 PM Leonard Rosenthol <lrosenth@adobe.com>
wrote:

> I’ve been thinking about pre-computed dictionary use in a variety of cases
> recently – whether it be traditional compression scenarios (such as with
> Brotli) or specialized like CBOR-LD.  While there is no question that such
> an approach can significantly improve compression, it also makes the output
> data “not as future proof”, unless you expect that the dictionary never
> changes (or changes in a compatible manner).
>
>
>
> I work on formats/technologies that are designed to be around for anywhere
> from decades to hundreds of years (or more) – including PDF and C2PA.  That
> means that it needs to be possible for files produced today to be decoded
> way into the future, which means that the dictionary has to be “fixed” with
> any given version and can’t differ from implementation from
> implementation.  It also means that any future dictionary has to be
> backwards compatible with all previous versions – aka you can’t change the
> order/value of any dictionary element, you can only add.
>
>
>
> I hope that these are considerations that would be considered as part of
> the evaluation process for technology choices.
>
>
>
> Leonard
>
>
>
> *From: *Filip Kolarik <filip26@gmail.com>
> *Date: *Tuesday, November 28, 2023 at 8:05 AM
> *To: *Manu Sporny <msporny@digitalbazaar.com>
> *Cc: *Gregg Kellogg <gregg@greggkellogg.net>, JSON for Linking Data
> Community Group <public-linked-json@w3.org>, JSON-LD Working Group <
> public-json-ld-wg@w3.org>
> *Subject: *Re: CBOR-LD musings
>
>
>
> Hi Gregg,
>
> I've implemented CBOR-LD in Java in a configurable way. DB's
> implementation compatibility configuration included.
>
>
>
> The algorithm is quite wild trying to squeeze out most of the JSON-LD
> syntactic sugar, and deliver minimal compressed output. I guess the main
> motivation at the beginning was to get a small footprint that can be
> encoded as QR code.
>
>
>
> Basically the algorithm tries to compress terms, types, and values. See
> here CBOR-LD DB Configuration
> <https://github.com/filip26/iridium-cbor-ld/blob/main/src/main/java/com/apicatalog/cborld/db/DbConfig.java>
> .
>
>
>
> I would be happy to share my experience with you. By the way, the currently
> proposed algorithm can even be improved.
>
>
>
> The only thing I'm not sure about is to make it part of the JSON-LD
> algorithms. It's quite a complex implementation that could be generalized
> to use different algorithms, different strategies.
>
>
>
> Best,
>
> Filip
>
> On Tue, Nov 28, 2023 at 1:05 AM Manu Sporny <msporny@digitalbazaar.com>
> wrote:
>
> On Mon, Nov 27, 2023 at 6:29 PM Gregg Kellogg <gregg@greggkellogg.net>
> wrote:
> > It would be great to have a discussion on DB’s CBOR-LD spec to
> understand what it is trying to do.
>
> Sure, and happy to have that discussion (and share the production
> deployments we have today for CBOR-LD). The spec is in a fairly
> outdated state due to all the other spec work we've been doing that is
> under more immediate time pressures. Your commentary on what's missing
> today is accurate and we'd like to fix that (and haven't been able to
> given all the other time pressures and customer demands).
>
> That said, the CBOR-LD implementation here is in production usage (for
> the TruAge program, which is also used by the State of California in
> their DMV app):
>
> https://github.com/digitalbazaar/cborld
>
> ... and we have a very long and outstanding set of issues to update
> the CBOR-LD spec to bring it inline w/ the CBOR-LD implementation we
> have and then transfer ownership over to the JSON-LD WG (if there is
> consensus to take the work on).
>
> -- manu
>
> --
> Manu Sporny - https://www.linkedin.com/in/manusporny/
> Founder/CEO - Digital Bazaar, Inc.
> https://www.digitalbazaar.com/
>
>

Received on Tuesday, 28 November 2023 15:35:35 UTC