- From: Orie Steele <orie@transmute.industries>
- Date: Fri, 24 Jul 2020 11:55:49 -0500
- To: Leonard Rosenthol <lrosenth@adobe.com>
- Cc: Manu Sporny <msporny@digitalbazaar.com>, "public-credentials@w3.org" <public-credentials@w3.org>
- Message-ID: <CAN8C-_J3JEJ4GzS_UivoDQ-trcrkUP6akM-f_fb+u2JS7HXrWQ@mail.gmail.com>
Sorry I am late to the CBOR-LD party! I'm very excited to have a semantic linked data format that is also usable in a compact binary representation, with bi-directional transformation out of the box.

I have been playing with CBOR on the weekends, and I have a repo here:

https://github.com/transmute-industries/decentralized-cbor/blob/master/src/__fixtures__/outputs/table.csv

The repo compares JSON, JSON-LD, CBOR, DAG_CBOR and ZLIB_URDNA2015_CBOR (another approach to a compressed linked data format in CBOR). I am eager to add tests for CBOR-LD.

Both DAG_CBOR and CBOR-LD have some benefits over CBOR, ZLIB_URDNA2015_CBOR and JSON: both are linked data formats where the linked data aspect is preserved at the binary level. ZLIB_URDNA2015_CBOR is just a compressed JSON-LD object encoded as CBOR, so you cannot leverage its internal semantics, in much the same way you cannot leverage the internal semantics of "Pure JSON" and "Pure CBOR".

However, ZLIB_URDNA2015_CBOR is MUCH smaller than DAG_CBOR / "Pure CBOR" that was built from "Pure JSON", and CBOR-LD is MUCH smaller than ZLIB_URDNA2015_CBOR.

Backing up for a second, one way to think about why CBOR-LD is awesome is to consider how all software that processes data has some opinion about that data. Sometimes these opinions are encoded in schema validation of incoming data (using tools like JSON Schema or Protobuf). If you consider that changes to data on the wire would cause the software to explode, you can see why agreeing to a common context is similar to agreeing to a data schema.

And by relying on an existing context to build a compressed binary representation of a semantic object, we can leverage these "common dictionaries / vocabularies" not just for semantic disambiguation, but also for compression.

Obviously the IoT space has been waiting for something like this for a long time...
- https://www.w3.org/WoT/
- https://github.com/Azure/opendigitaltwins-dtdl/blob/master/DTDL/v2/dtdlv2.md

We are now able to convert all these ontologies and semantic vocabularies into compact, interoperable, binary representations for industries that have already committed to the semantic web:

https://github.com/semantalytics/awesome-semantic-web#ontologies

I'm not sure of the potential internal representation benefits for services like https://developers.google.com/knowledge-graph but obviously, a small IoT device that only speaks CBOR-LD would not need to crack out a JSON parser, and all the attack surface associated with it, just to talk to the knowledge graph service.

OS

On Fri, Jul 24, 2020 at 10:54 AM Leonard Rosenthol <lrosenth@adobe.com> wrote:

> It's not just specific schemas but also the order of the schemas, any
> other keys you add, plus additional "techniques" you add.
>
> Using your presentation as a guide:
> Slide 11:
>
> In that case you have picked a single schema, found all the items, and
> given the unique value (let's say 1-10.). Now (not shown on the slide,
> but...), I assume that you then pick another schema and start allocating
> values for it in the dictionary (eg. 11-20), and so on. At some point the
> credentials schema is updated (1.1->1.2) - but you can't update the
> existing entries in the dictionary and just add the new ones to the end
> (eg. 100-105). And then you encode something using that dictionary - how
> does something downstream know that you are using the 1.2 version of the
> context? It would simply have a 100 in there - but w/o that in the
> dictionary, it's not decodable.
>
>
> Slide 14:
>
> This is a good example of how to reduce size by switching from a string
> representation to binary. I assume we will find more of those cases over
> time. *BUT* a decoder needs to understand this encoding approach - but
> again, how would they recognize something new?
>
>
> At a minimum, we need a way to encode the version of the CBOR-SC algorithm
> that is used to encode a given data set. That would go a *long way* to
> resolving my concerns.
>
> Leonard
>
> On 7/24/20, 11:19 AM, "Manu Sporny" <msporny@digitalbazaar.com> wrote:
>
> On 7/24/20 11:00 AM, Leonard Rosenthol wrote:
> > However, the main use case that you present in the presentation is
> > QRCodes - which exist as a mechanism to move from digital to analog
> > (and back). The analog world is long lived - even if not
> > necessarily archival - and the data needs to be retrievable. And
> > that can't happen w/o knowing the right (version of the) dictionary
> > to use.
>
> ... which is why we strongly suggest that all production contexts should
> be versioned, frozen, and cryptographically hashed. There is a general
> mitigation for your concern. :)
>
> To be clear, this issue is well known in the JSON-LD ecosystem and that
> ecosystem has thrived (deployed on tens of millions of domains) in spite
> of the danger. That community has learned how to manage constantly
> evolving vocabularies (schema.org), and how to lock vocabularies down
> (VCs).
>
> There are solutions to the problem you outline, cryptographically
> hashing URLs is one thing we explored, but that bloats the size of the
> CBOR-LD bytes. Like any technology, CBOR-LD is a series of difficult
> design trade-offs.
>
> Just like we made the conscious decision in JSON-LD to be able to
> reference external JSON-LD Context files (which people insisted was
> madness and unworkable when we did it... and still do), we make the same
> conscious decision now (because it worked out pretty well for JSON-LD,
> and it's not clear how doing the same thing in CBOR-LD would be any
> different).
>
> If we wanted to eliminate the risk you highlighted, we wouldn't be able
> to solve the most pressing use cases.
>
> -- manu
>
> --
> Manu Sporny - https://www.linkedin.com/in/manusporny/
> Founder/CEO - Digital Bazaar, Inc.
> blog: Veres One Decentralized Identifier Blockchain Launches
> https://tinyurl.com/veres-one-launches

--
*ORIE STEELE*
Chief Technical Officer
www.transmute.industries
<https://www.transmute.industries>
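
The dictionary-versioning concern quoted in this thread (a value like 100 that an older dictionary cannot resolve) can be sketched in a few lines of Python. This is a toy illustration of the shared-dictionary idea only, not the CBOR-LD algorithm: the helper functions, the term names, and the code ranges are all hypothetical and chosen just to mirror the "1.1 -> 1.2" example.

```python
# Toy illustration only: this is NOT the CBOR-LD algorithm, just the
# shared-dictionary idea. All helper names, terms and codes are hypothetical.

def build_dictionary(terms, start=1):
    """Assign consecutive integer codes to context terms, in order."""
    return {term: code for code, term in enumerate(terms, start=start)}

def compress(doc, dictionary):
    """Replace known string keys with their integer codes."""
    return {dictionary.get(key, key): value for key, value in doc.items()}

def decompress(doc, dictionary):
    """Map integer codes back to terms; fail on codes we do not know."""
    reverse = {code: term for term, code in dictionary.items()}
    out = {}
    for key, value in doc.items():
        if isinstance(key, int):
            if key not in reverse:
                raise KeyError(f"code {key} is not in this dictionary version")
            key = reverse[key]
        out[key] = value
    return out

# Hypothetical "1.1" context terms get codes 1-3.
v1_1 = build_dictionary(["issuer", "issuanceDate", "credentialSubject"])
# A hypothetical "1.2" context keeps the old codes and appends a new term at 100.
v1_2 = {**v1_1, **build_dictionary(["evidence"], start=100)}

doc = {"issuer": "did:example:123", "evidence": "photo"}
encoded = compress(doc, v1_2)

# Round trip works when encoder and decoder share the same dictionary version.
assert decompress(encoded, v1_2) == doc

# A decoder that only has the older dictionary cannot resolve code 100.
try:
    decompress(encoded, v1_1)
except KeyError as err:
    print("decode failed:", err)
```

Nothing in the encoded form says which dictionary version produced it, which is why the thread discusses versioned, frozen contexts and an explicit algorithm version marker.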
Received on Friday, 24 July 2020 16:56:18 UTC