- From: Christopher Allen <ChristopherA@lifewithalacrity.com>
- Date: Fri, 24 Jul 2020 15:58:03 -0700
- To: Manu Sporny <msporny@digitalbazaar.com>
- Cc: "public-credentials@w3.org" <public-credentials@w3.org>, Wolf McNally <wolf@wolfmcnally.com>
- Message-ID: <CACrqygAZTLUKLpf0kcg4q4scqfkKnLb7znDa7rjG-oCr1aWAPw@mail.gmail.com>
On Fri, Jul 24, 2020 at 2:31 PM Manu Sporny <msporny@digitalbazaar.com> wrote:

> On 7/24/20 12:55 PM, Orie Steele wrote:
> > The repo compares, JSON, JSON-LD, CBOR, DAG_CBOR and ZLIB_URDNA2015_CBOR
> > ( another approach at compressed linked data format in CBOR)... I am
> > eager to add tests for CBOR-LD.
>
> I'm eager to see the results as well, Orie... I'm wondering if you'd be
> willing to expand your comparison table to the types on slide 6?
>
> https://docs.google.com/presentation/d/1ksh-gUdjJJwDpdleasvs9aRXEmeRvqhkVWqeitx5ZAE/edit#slide=id.g866980c4a6_0_14
>
> It might be useful to see how different types of data encodings that are
> commonly used fare. For example, it's useful to understand that because
> base64-encoded JWTs use 6-bit encoding that things that could have
> normally been LZ compressed cannot be compressed because of the
> bit-carrying nature of base64's 6-bit encoding.

We did some research on data encoding issues, with some tables at:

https://github.com/BlockchainCommons/Research/blob/master/papers/bcr-2020-003-uri-binary-compatibility.md

It turns out to be a lot more complicated when you add QR to the equation. Despite the "compression" of base64url, running it through a QR was actually less efficient than hexadecimal! This is because the QR encoder treats base64url text as binary and then expands it once again into the QR encoding format, introducing a significant increase in size. In addition, QR does not try to compress binary data internally.

Carefully selecting either the BC32 character set, or the ByteWords encoding character set with its other benefits, allowed us to leverage the QR standard's internal compression. By carefully separating the transfer encoding scheme (to optimize for QR) from the binary encoding scheme (CBOR), we were able to get a significantly larger amount of data into a single QR.
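As a rough illustration of the character-set point, here is a minimal Python sketch. It assumes the standard QR segment costs (10 bits per 3 digits in numeric mode, 11 bits per 2 characters in alphanumeric mode, 8 bits per byte in byte mode) and ignores mode indicators, length fields, and error correction. RFC 4648 base32 with padding stripped stands in for an uppercase 5-bit alphabet such as BC32; it is not the BC32 alphabet itself, and the helper names `qr_mode` and `qr_data_bits` are hypothetical.

```python
import base64

# The 45 characters that QR alphanumeric mode can represent.
QR_ALPHANUMERIC = set("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:")
DIGITS = set("0123456789")


def qr_mode(text: str) -> str:
    """Return the most compact single QR mode able to hold `text`."""
    if text and all(c in DIGITS for c in text):
        return "numeric"
    if all(c in QR_ALPHANUMERIC for c in text):
        return "alphanumeric"
    return "byte"


def qr_data_bits(text: str) -> int:
    """Data bits used by `text` in one segment (headers excluded)."""
    n = len(text)
    mode = qr_mode(text)
    if mode == "numeric":
        # 10 bits per 3 digits; 7 bits for 2 left over, 4 bits for 1.
        return 10 * (n // 3) + {0: 0, 1: 4, 2: 7}[n % 3]
    if mode == "alphanumeric":
        # 11 bits per pair of characters; 6 bits for a trailing single.
        return 11 * (n // 2) + 6 * (n % 2)
    return 8 * len(text.encode("utf-8"))


payload = bytes(range(32))  # arbitrary 32-byte example payload

# base64url mixes upper and lower case, so it falls back to byte mode.
b64 = base64.urlsafe_b64encode(payload).rstrip(b"=").decode("ascii")
# Uppercase base32 (padding stripped) fits entirely in alphanumeric mode.
b32 = base64.b32encode(payload).rstrip(b"=").decode("ascii")

for name, text in (("base64url", b64), ("base32 (uppercase)", b32)):
    print(f"{name}: {len(text)} chars, mode={qr_mode(text)}, "
          f"{qr_data_bits(text)} data bits")
```

Under these assumptions the base32 string is longer in characters but takes fewer data bits in the QR, because every character costs 5.5 bits instead of 8.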
The problem with many other approaches is that they try to do the transfer encoding scheme, the binary encoding scheme, the self-describing encoding scheme, and the error-detection encoding scheme all in the same layer. For instance, we found Digital Bazaar's fountain encoding scheme for QRs to be at the wrong layer, so we proposed one that we believe provably works better with more devices and smaller QR code frames, but does not have any cost at the binary level.

What I'd like to see is how we might be able to combine these efforts, or at least cooperate.

-- Christopher Allen
Received on Friday, 24 July 2020 22:58:54 UTC