Re: Introducing CBOR-LD... from Manu Sporny on 2020-07-24 (public-credentials@w3.org from July 2020)

From: Manu Sporny <msporny@digitalbazaar.com>
Date: Fri, 24 Jul 2020 17:02:24 -0400
To: "public-credentials@w3.org" <public-credentials@w3.org>
Message-ID: <2a0ed89d-abdc-6efa-8035-f7ed1bf899ab@digitalbazaar.com>

On 7/24/20 11:52 AM, Leonard Rosenthol wrote:
> It's not just specific schemas but also the order of the schemas, any
> other keys you add, plus additional "techniques" you add.

Yes, that's correct, and the current algorithm takes that into account.

> And then you encode something
> using that dictionary - how does something downstream know that you
> are using the 1.2 version of the context?  

There's a disconnect between your mental model and the way CBOR-LD works
currently, and I'm trying to deduce where that disconnect is...

To directly answer your question above... the 1.2 version would be in
the context. It would literally be:

"@context": "https://example.com/myvocab/v1.1"

and then you'd see it switch to:

"@context": "https://example.com/myvocab/v1.2"

That's how you know someone changed from v1.1 to v1.2. It would be in
the input data. JSON-LD Context files can be, and are strongly urged to
be, versioned.
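
As a rough illustration of that point, here is a minimal Python sketch of
how a downstream consumer could notice the switch just by looking at the
input data. The helper name context_urls is made up for this example and
is not part of any CBOR-LD or JSON-LD library; the URLs are the
example.com ones from above.

```python
from __future__ import annotations


def context_urls(doc: dict) -> list[str]:
    """Return the context URLs referenced by a JSON-LD document.

    Hypothetical helper: only string entries are returned; inline
    (embedded) context objects are skipped.
    """
    ctx = doc.get("@context", [])
    entries = ctx if isinstance(ctx, list) else [ctx]
    return [entry for entry in entries if isinstance(entry, str)]


old = {"@context": "https://example.com/myvocab/v1.1"}
new = {"@context": "https://example.com/myvocab/v1.2"}

# The version change is visible in the data itself: the URLs differ.
changed = context_urls(old) != context_urls(new)
```

Nothing outside the document is needed to spot the change, because the
versioned URL travels with the data.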

> This is a good example of how to reduce size by switching from a
> string representation to binary.  I assume we will find more of those
> cases over time.   *BUT* a decoder needs to understand this encoding
> approach - but again, how would they recognize something new?

... because there is a version byte on the CBOR Tag for CBOR-LD. At
present, the defined versions are 0x00 (uncompressed CBOR) and 0x01
(CBOR-LD compression v1).

If we create new global decoders, we would bump the byte value up by one
to 0x02. The set of default codecs in play is always identified by the
CBOR-LD compression algorithm byte.

See Section 3.3 Compressed CBOR-LD Buffer Algorithm, Step #2:

https://digitalbazaar.github.io/cbor-ld-spec/#compressed-cbor-ld-buffer-algorithm
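
To make the dispatch concrete, here is a hedged Python sketch of how a
decoder might read that byte and refuse anything it does not recognize.
The byte layout is an assumption for illustration only: a CBOR tag with
a 16-bit tag number (initial byte 0xD9) whose high byte marks CBOR-LD
and whose low byte is the version. The constant CBOR_LD_TAG_HIGH_BYTE
and the function names are hypothetical; check the spec for the actual
tag values.

```python
# Assumed for illustration: high byte of the 16-bit tag number that
# marks a CBOR-LD payload. Not taken from the spec text quoted here.
CBOR_LD_TAG_HIGH_BYTE = 0x05

# The two versions named in this thread.
KNOWN_VERSIONS = {
    0x00: "uncompressed CBOR",
    0x01: "CBOR-LD compression v1",
}


def compression_version(buf: bytes) -> int:
    """Return the version byte from a tagged buffer (assumed layout)."""
    # 0xD9 = CBOR major type 6 (tag) followed by a 16-bit tag number.
    if len(buf) < 3 or buf[0] != 0xD9:
        raise ValueError("expected a CBOR tag with a 16-bit tag number")
    if buf[1] != CBOR_LD_TAG_HIGH_BYTE:
        raise ValueError("not a CBOR-LD tag")
    return buf[2]


def describe(buf: bytes) -> str:
    """Name the codec set for a buffer, rejecting unknown versions."""
    version = compression_version(buf)
    if version not in KNOWN_VERSIONS:
        raise ValueError(f"unknown CBOR-LD version 0x{version:02x}")
    return KNOWN_VERSIONS[version]
```

An older decoder handed a 0x02 payload would fail loudly at this step
instead of misreading the data, which is the point of the version byte.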

> At a minimum, we need a way to encode the version of the CBOR-SC
> algorithm that is used to encode a given data set.   That would go a
> *long way* to resolving my concerns.

Good! We already do that here:

https://digitalbazaar.github.io/cbor-ld-spec/#compressed-cbor-ld-buffer-algorithm

Yes, the risks you outline are not entirely eliminated, but they are
lowered to the point that the risk is manageable, IMHO.

If we take Dave Longley's advice of requiring JSON-LD Contexts
registered in the "CBOR-LD: Known Contexts" registry to also be
associated with a cryptographic hash value for each context, which I
think is a good requirement for the global registry, then have we
acceptably de-risked both of your concerns, Leonard?
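
For illustration, a registry check along those lines might look like the
Python sketch below. Everything here is an assumption: the choice of
SHA-256, the shape of the registry (a dict from context URL to hex
digest), and the name verify_context. The digest is computed from a
made-up context body so that no hash value is invented.

```python
import hashlib

# Made-up context body and URL, for illustration only.
EXAMPLE_URL = "https://example.com/myvocab/v1.2"
EXAMPLE_CONTEXT = b'{"@context": {"name": "https://example.com/myvocab#name"}}'

# Hypothetical registry: context URL -> SHA-256 hex digest of its bytes.
KNOWN_CONTEXTS = {
    EXAMPLE_URL: hashlib.sha256(EXAMPLE_CONTEXT).hexdigest(),
}


def verify_context(url: str, body: bytes) -> bool:
    """Return True only if url is registered and body matches its hash."""
    expected = KNOWN_CONTEXTS.get(url)
    if expected is None:
        return False  # unregistered contexts are never trusted
    return hashlib.sha256(body).hexdigest() == expected
```

With a check like this, a context that changes without its registry
entry being updated would be rejected rather than silently used to
build a different dictionary.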

-- manu

-- 
Manu Sporny - https://www.linkedin.com/in/manusporny/
Founder/CEO - Digital Bazaar, Inc.
blog: Veres One Decentralized Identifier Blockchain Launches
https://tinyurl.com/veres-one-launches

Received on Friday, 24 July 2020 21:02:39 UTC