
Re: VC API: handling large documents client to server

From: Anders Rundgren <anders.rundgren.net@gmail.com>
Date: Thu, 10 Feb 2022 21:15:44 +0100
Message-ID: <b86044a8-099d-82cf-094f-7562430db1b2@gmail.com>
To: Nikos Fotiou <fotiou@aueb.gr>, 'Orie Steele' <orie@transmute.industries>, 'Julien Fraichot' <Julien.Fraichot@hyland.com>
Cc: 'Mike Prorock' <mprorock@mesur.io>, 'Manu Sporny' <msporny@digitalbazaar.com>, 'W3C Credentials CG' <public-credentials@w3.org>

On 2022-02-10 20:30, Nikos Fotiou wrote:
> Hi Orie,
> 
> Can you please provide some more information about why “base64url encoding is a major problem for signatures over complex data”? At least to me, it is not obvious.

If you want to maintain the original message structure after signing as well as keeping binary data "as is", deterministic CBOR may be an option:

https://test.webpki.org/csf-lab/home
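
To make the size concern concrete, here is a rough Python sketch (not taken from any VC library) of what happens when binary data is base64-encoded into JSON and the JSON is then base64url-encoded again, as a JWS-style envelope does with its payload. The 30,000-byte "image", the field names and the record list are made-up stand-ins, and the exact byte counts will vary with real data.

```python
import base64
import gzip
import json
import random


def b64url(data: bytes) -> bytes:
    """Unpadded base64url, the encoding a JWS applies to its payload."""
    return base64.urlsafe_b64encode(data).rstrip(b"=")


# Hypothetical stand-in for an embedded image: 30,000 pseudo-random bytes.
rng = random.Random(0)
image = bytes(rng.getrandbits(8) for _ in range(30_000))

# Step 1: the image is base64-encoded so that it can live inside JSON.
credential = json.dumps(
    {"image": base64.b64encode(image).decode("ascii")}
).encode("utf-8")

# Step 2: the whole JSON document is base64url-encoded again for signing.
jws_payload = b64url(credential)

# Each encoding step costs about 4/3, so two steps cost about 16/9.
double_ratio = len(jws_payload) / len(image)

# Text-like JSON compresses well, but less so once it is base64url-encoded.
records = json.dumps(
    [{"id": i, "name": f"item-{i * i}", "status": "active"} for i in range(500)]
).encode("utf-8")
plain_gz = len(gzip.compress(records))
encoded_gz = len(gzip.compress(b64url(records)))

print(f"raw image bytes:          {len(image)}")
print(f"inside JSON (base64):     {len(credential)}")
print(f"JWS payload (base64url):  {len(jws_payload)}")
print(f"double-encoding ratio:    {double_ratio:.2f}")
print(f"gzip(JSON):               {plain_gz}")
print(f"gzip(base64url(JSON)):    {encoded_gz}")
```

A format that carries byte strings natively, such as CBOR, avoids both encoding steps for the binary part.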

Thanx,
Anders

> 
> Thanks,
> 
> Nikos
> 
> *From:*Orie Steele <orie@transmute.industries>
> *Sent:* Thursday, February 10, 2022 6:37 PM
> *To:* Julien Fraichot <Julien.Fraichot@hyland.com>
> *Cc:* Mike Prorock <mprorock@mesur.io>; Manu Sporny <msporny@digitalbazaar.com>; W3C Credentials CG <public-credentials@w3.org>
> *Subject:* Re: [EXTERNAL] [jfraichot@learningmachine.com] Re: VC API: handling large documents client to server
> 
> base64url encoding is a major problem for signatures over complex data... when there is no applied compression.
> 
> This needs to be fixed in the next version of the VC Data Model, if VC-JWT is to be useful for large credentials.
> 
> Luckily there is https://www.iana.org/assignments/jose/jose.xhtml#web-encryption-compression-algorithms
> 
> Regards,
> 
> OS
> 
> On Thu, Feb 10, 2022 at 10:00 AM Julien Fraichot <Julien.Fraichot@hyland.com> wrote:
> 
>     Thanks Manu for the great write up,
> 
>     It’s true that Blockcerts historically has made heavy use of base64 content as part of the data, and it was a partial problem since we didn’t have much transport. Now that we are considering the problem it’s great to hear what could be the industry standard, so I will try out the different things you suggest.
> 
>     *From: *Mike Prorock <mprorock@mesur.io>
>     *Date: *Thursday, 10 February 2022 at 15:49
>     *To: *Manu Sporny <msporny@digitalbazaar.com>
>     *Cc: *W3C Credentials CG <public-credentials@w3.org>
>     *Subject: *[EXTERNAL] [jfraichot@learningmachine.com] Re: VC API: handling large documents client to server
> 
>     +1 manu - great notes - I will note from our side that we utilize hashlink (and related approaches) when dealing with larger binary data that needs to be referenced in a credential - e.g. image of a product, etc.
> 
> 
>     Mike Prorock
> 
>     CTO, Founder
> 
>     https://mesur.io/
> 
>     On Thu, Feb 10, 2022 at 9:39 AM Manu Sporny <msporny@digitalbazaar.com> wrote:
> 
>         On 2/10/22 3:01 AM, Julien Fraichot wrote:
>          > Basically when calling a verification API from a (browser) client, there
>          > might be times where the documents could be quite large (few MBs). I am
>          > wondering if there are some strategies to reduce the payload that would
>          > also be standard when dealing with a VC API complying service?
> 
>         If you have a VC that is several MB in size, I would expect it to struggle in
>         the ecosystem. Yes, they are legal, in the same way that attaching a 250MB
>         slide deck to an email is legal -- while the SMTP protocol allows for it, most
>         mail servers will reject the message as too large.
> 
>         Typically, these large VCs happen because people are embedding base-encoded
>         images directly into a VC. Instead, VC creators should consider modelling
>         their data differently -- e.g., use a hashlink, or some other way of creating
>         a cryptographic hyperlink:
> 
>         https://datatracker.ietf.org/doc/html/draft-sporny-hashlink
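
A minimal sketch of the cryptographic-hyperlink idea in Python, assuming the basic form described in the hashlink draft as I read it: "hl:" followed by a multibase (base58btc, prefix "z") encoding of a sha2-256 multihash. The optional metadata part of a hashlink (URLs, content type) is left out, and the function names and sample bytes are my own, not from any library.

```python
import hashlib

B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58btc(data: bytes) -> str:
    """Encode bytes with the Bitcoin base58 alphabet."""
    n = int.from_bytes(data, "big")
    out = ""
    while n > 0:
        n, rem = divmod(n, 58)
        out = B58_ALPHABET[rem] + out
    # Each leading zero byte is written as a leading '1'.
    pad = len(data) - len(data.lstrip(b"\x00"))
    return "1" * pad + out


def hashlink(content: bytes) -> str:
    """Build a metadata-free hashlink for the given content (assumed form)."""
    digest = hashlib.sha256(content).digest()
    # Multihash header: 0x12 = sha2-256, 0x20 = digest length of 32 bytes.
    multihash = bytes([0x12, 0x20]) + digest
    # 'z' is the multibase prefix for base58btc.
    return "hl:z" + base58btc(multihash)


def matches(content: bytes, link: str) -> bool:
    """Check that fetched content is what the signed link committed to."""
    return hashlink(content) == link


# Hypothetical image bytes kept outside the credential.
image = b"\x89PNG placeholder bytes for a product photo"
link = hashlink(image)

# The credential carries only the short link, which the signature covers.
credential_subject = {"name": "Example product", "image": link}
print(credential_subject["image"])
print(matches(image, link))
```

The signed credential stays small, and a verifier who fetches the image separately can still detect any change to it by recomputing the link.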
> 
>          > I am researching gzipping on the client and tried more exotic approaches to
>          > no avail, so I’d be willing to hear what the people have thought on the
>          > matter.
> 
>         You get gzip compression during transmission (more or less) for free these
>         days, but that's not really going to save you given that you're probably
>         base-encoding the raw binary data. Quite counter-intuitively, gzipping
>         base-encoded binary can leave the result larger than the raw binary
>         instead of reducing it:
> 
>         https://stackoverflow.com/questions/38124361/why-does-base64-encoded-data-compress-so-poorly
> 
>         This is another reason the JOSE/JWT stack, when used with VCs, harms wire-level
>         protocols -- everything is base64 encoded, and thus it effectively destroys
>         any ability to compress data on the wire.
> 
>         Typical solutions to this problem require that you put the binary data outside
>         of the VC, if at all possible. This works well for common static images such
>         as logos. It is also possible to split the VC into two VCs... one with the
>         machine-readable data from the issuer (with a digital signature) and one with
>         the image data from any source (without a digital signature, since, if
>         hashlinked, the signature will verify the validity of the image data). That
>         latter approach can be more privacy preserving AND more complex than many
>         might feel is necessary.
> 
>         Selective disclosure schemes (such as BBS+) are another way to deliver a
>         subset of the information to a verifier without having to send the image
>         payload data.
> 
>         I expect this to be an active area of innovation for the next few years with a
>         few proposals on standard design patterns that all industries could use. This
>         problem appears most often with identification cards that have biometric
>         images embedded in them.
> 
>         -- manu
> 
>         -- 
>         Manu Sporny - https://www.linkedin.com/in/manusporny/
>         Founder/CEO - Digital Bazaar, Inc.
>         News: Digital Bazaar Announces New Case Studies (2021)
>         https://www.digitalbazaar.com/
> 
> 
> -- 
> 
> *ORIE STEELE*
> 
> Chief Technical Officer
> 
> www.transmute.industries
> 
Received on Thursday, 10 February 2022 20:16:02 UTC

This archive was generated by hypermail 2.4.0 : Thursday, 24 March 2022 20:25:28 UTC