Re: Introducing CBOR-LD...

It's not just the specific schemas but also the order of those schemas, any other keys you add, plus any additional "techniques" you layer on.

Using your presentation as a guide:
Slide 11: 

In that case you have picked a single schema, found all the terms, and given each a unique value (say, 1-10).  Now (not shown on the slide, but...), I assume that you then pick another schema and start allocating values for it in the dictionary (e.g., 11-20), and so on.   At some point the credentials schema is updated (1.1 -> 1.2) - but you can't update the existing entries in the dictionary, only add the new ones to the end (e.g., 100-105).  Then you encode something using that dictionary - how does anything downstream know that you are using the 1.2 version of the context?  The data would simply have a 100 in it - and without that entry in the decoder's dictionary, it's not decodable.
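To make the concern concrete, here is a minimal sketch of the allocation scheme described above. The function names (`build_term_table`) and the exact ID assignment are my own illustration, not anything from the CBOR-LD spec - the point is only that IDs depend on which context versions the builder saw, and in what order:

```python
def build_term_table(contexts):
    """Assign sequential integer IDs to the terms of each context, in order."""
    table = {}
    next_id = 1
    for ctx in contexts:          # order matters: reordering contexts shifts IDs
        for term in ctx["terms"]:
            table[next_id] = term
            next_id += 1
    return table

# Hypothetical context versions: 1.2 appends a new term to the end.
v1_1 = [{"terms": ["id", "type", "issuer"]}]
v1_2 = [{"terms": ["id", "type", "issuer", "evidence"]}]

# Encoder built its table from the 1.2 context...
enc_table = {term: i for i, term in build_term_table(v1_2).items()}
encoded = enc_table["evidence"]       # -> 4, only meaningful against v1.2

# ...but a decoder still holding the 1.1 table has no entry for it.
dec_table = build_term_table(v1_1)
print(dec_table.get(encoded))         # -> None: undecodable
```

Nothing in the encoded value itself says which table produced it - which is exactly the gap.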

Slide 14:

This is a good example of how to reduce size by switching from a string representation to a binary one.  I assume we will find more such cases over time.   *BUT* a decoder needs to understand this encoding approach - and again, how would it recognize something new?
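For example, here is one plausible substitution of the kind slide 14 shows - a date-time string replaced by packed epoch seconds. This is purely illustrative (the field and packing format are my own choices); a decoder that doesn't know the rule just sees four opaque bytes where it expected a string:

```python
import struct
from datetime import datetime, timezone

date_str = "2020-07-24T11:19:00Z"                  # 20 bytes as text
dt = datetime(2020, 7, 24, 11, 19, tzinfo=timezone.utc)
date_bin = struct.pack(">I", int(dt.timestamp()))  # 4 bytes as epoch seconds

print(len(date_str.encode()), len(date_bin))       # 20 vs 4
```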

At a minimum, we need a way to encode the version of the CBOR-SC algorithm used to encode a given data set.   That would go a *long way* toward resolving my concerns.
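A minimal sketch of that ask, with an invented header layout (the magic bytes and two-byte version field are hypothetical, not from any CBOR-LD draft): prefix the payload with the algorithm version so a decoder can select the right dictionary - or refuse cleanly - instead of misreading the bytes.

```python
import struct

MAGIC = b"CBLD"  # hypothetical marker bytes, for illustration only

def wrap(version, payload):
    """Prefix a payload with a magic marker and a big-endian version number."""
    return MAGIC + struct.pack(">H", version) + payload

def unwrap(data, supported=frozenset({1})):
    """Return the payload, or fail loudly on unknown formats/versions."""
    if data[:4] != MAGIC:
        raise ValueError("not a versioned payload")
    version = struct.unpack(">H", data[4:6])[0]
    if version not in supported:
        raise ValueError(f"unsupported algorithm version {version}")
    return data[6:]

blob = wrap(1, b"\xa1\x01\x02")
assert unwrap(blob) == b"\xa1\x01\x02"   # known version: decodes
```

With this, a decoder handed a version-2 blob fails with "unsupported algorithm version 2" rather than silently producing garbage.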


On 7/24/20, 11:19 AM, "Manu Sporny" <> wrote:

    On 7/24/20 11:00 AM, Leonard Rosenthol wrote:
    > However, the main use case that you present in the presentation is
    > QRCodes - which exist as a mechanism to move from digital to analog
    > (and back).   The analog world is long lived - even if not
    > necessarily archival - and the data needs to be retrievable.  And
    > that can't happen w/o knowing the right (version of the) dictionary
    > to use.

    ... which is why we strongly suggest that all production contexts should
    be versioned, frozen, and cryptographically hashed. There is a general
    mitigation for your concern. :)

    To be clear, this issue is well known in the JSON-LD ecosystem and that
    ecosystem has thrived (deployed on tens of millions of domains) in spite
    of the danger. That community has learned how to manage constantly
    evolving vocabularies, and how to lock vocabularies down (VCs).

    There are solutions to the problem you outline, cryptographically
    hashing URLs is one thing we explored, but that bloats the size of the
    CBOR-LD bytes. Like any technology, CBOR-LD is a series of difficult
    design trade-offs.

    Just like we made the conscious decision in JSON-LD to be able to
    reference external JSON-LD Context files (which people insisted was
    madness and unworkable when we did it... and still do), we make the same
    conscious decision now (because it worked out pretty well for JSON-LD,
    and it's not clear how doing the same thing in CBOR-LD would be any
    different).

    If we wanted to eliminate the risk you highlighted, we wouldn't be able
    to solve the most pressing use cases.

    -- manu

    Manu Sporny

    Founder/CEO - Digital Bazaar, Inc.
    blog: Veres One Decentralized Identifier Blockchain Launches

Received on Friday, 24 July 2020 15:52:47 UTC