
Re: using PDFs as a VC container...

From: Leonard Rosenthol <lrosenth@adobe.com>
Date: Wed, 21 Oct 2020 13:03:51 +0000
To: Kostas Karasavvas <kkarasavvas@gmail.com>
CC: Credentials Community Group <public-credentials@w3.org>
Message-ID: <CAB71C3A-7949-4F58-A333-E86FA77664D6@adobe.com>
Thanks – that helps!

The iText library that you are using is fully capable of manipulating XMP and doing embedding and other techniques that I showed in my presentation.  I haven’t looked at the state of Python PDF libraries in a while so I don’t know offhand which (if any) are XMP capable – though given that it has been part of PDF for almost 20 years and is now mandated for PDF 2.0 – I would hope they do!

The approach you are using for hashing & verifying (with the post-hash addition and then removal) won’t work long term.  It happens to work with the tools that you are using – but it’s actually not a guarantee from a file format perspective.   The standard approach for embedding a signature into the thing you are signing (not just PDF but other formats) is a two pass approach involving the use of a “hole” in the file.  This is how PDF signatures have worked for 20+ years now.   Here is a good paper about it - https://www.adobe.com/devnet-docs/etk_deprecated/tools/DigSig/Acrobat_DigitalSignatures_in_PDF.pdf
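
To illustrate the two-pass idea in general terms, here is a minimal Python sketch. It is not PDF code and does not use iText: the container layout, the MARKER and SIG_LEN values, and all function names are assumptions made for illustration, and an HMAC stands in for a real signature. In an actual PDF the location of the hole is recorded in the file itself (the signature's byte ranges), whereas here the offset is simply passed around.

```python
import hashlib
import hmac
from typing import Tuple

SIG_LEN = 64              # hex length of a SHA-256 HMAC; the hole is sized up front
MARKER = b"/Contents <"   # hypothetical label preceding the hole


def prepare(content: bytes) -> Tuple[bytes, int]:
    """Pass 1: write the file with a zero-filled hole reserved for the signature."""
    data = content + b"\n" + MARKER + b"0" * SIG_LEN + b">\n"
    hole_start = len(content) + 1 + len(MARKER)
    return data, hole_start


def digest_outside_hole(data: bytes, hole_start: int) -> bytes:
    """Hash every byte of the file except the hole itself."""
    h = hashlib.sha256()
    h.update(data[:hole_start])
    h.update(data[hole_start + SIG_LEN:])
    return h.digest()


def sign(data: bytes, hole_start: int, key: bytes) -> bytes:
    """Pass 2: compute the signature and write it into the hole (file length unchanged)."""
    digest = digest_outside_hole(data, hole_start)
    sig = hmac.new(key, digest, hashlib.sha256).hexdigest().encode("ascii")
    return data[:hole_start] + sig + data[hole_start + SIG_LEN:]


def verify(data: bytes, hole_start: int, key: bytes) -> bool:
    """Recompute the digest over the same byte ranges and compare with the hole's content."""
    sig = data[hole_start:hole_start + SIG_LEN]
    digest = digest_outside_hole(data, hole_start)
    expected = hmac.new(key, digest, hashlib.sha256).hexdigest().encode("ascii")
    return hmac.compare_digest(sig, expected)
```

Because the signature only overwrites bytes that were excluded from the digest, nothing has to be removed or re-serialized before verifying; the verifier hashes the file exactly as it sits on disk.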


And yes, I agree that we need to come up with a standard approach for where/how to embed the VCs in the PDFs, how the signing & verification process works, etc.   Looking forward to it.

Leonard

From: Kostas Karasavvas <kkarasavvas@gmail.com>
Date: Wednesday, October 21, 2020 at 3:26 AM
To: Leonard Rosenthol <lrosenth@adobe.com>
Cc: Credentials Community Group <public-credentials@w3.org>
Subject: Re: using PDFs as a VC container...

Hi Leonard and thank you for the comments!

Please see inline.

On Tue, Oct 20, 2020 at 9:13 PM Leonard Rosenthol <lrosenth@adobe.com> wrote:
Very interesting, Kostas!   Thanks for sharing.

I went to the repos below to see what you’ve done.  I peeked inside the PDFs and saw how you are storing data – while it works, it is definitely not a good approach.

This is something that we would like to investigate further. Ideally, it would be improved as part of the VC-compatibility update that we want.

Your knowledge would be very helpful on this. What approach would you recommend?

Are there any Python and JS libraries that can help insert/remove the metadata with the recommended approach?

 What I didn’t see though is the actual source code for the hashing and embedding – because I am curious about the order of operations of those two operations since you couldn’t modify after hashing – yet that is what I see happening…


Check https://github.com/verifiable-pdfs/blockchain-certificates/blob/5159a38d320280ab46092ab56450028435aaf6cb/blockchain_certificates/issue_certificates.py#L78

The documents are hashed and then their merkle root is constructed. The proof is inserted into the PDF at the end. During validation the proof is removed first, and then the document is hashed again to get the original hash that was anchored into the blockchain.
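
As a rough sketch of that order of operations (not the code in the repository linked above), the flow might look like the following. The dict-based document model, the serialize helper, and the PROOF_KEY and "target_hash" names are all hypothetical stand-ins; a real PDF would be rewritten by a PDF library, and a full validator would also check the Merkle proof against the anchored root.

```python
import hashlib
import json

PROOF_KEY = "chainpoint_proof"  # hypothetical metadata key holding the proof


def serialize(doc: dict) -> bytes:
    """Deterministic serialization: the same fields always give the same bytes."""
    return json.dumps(doc, sort_keys=True, separators=(",", ":")).encode("utf-8")


def doc_hash(doc: dict) -> str:
    return hashlib.sha256(serialize(doc)).hexdigest()


def issue(doc: dict, make_proof) -> dict:
    """Hash the document first, then insert the proof into a copy of it."""
    h = doc_hash(doc)
    issued = dict(doc)
    issued[PROOF_KEY] = make_proof(h)
    return issued


def validate(issued: dict) -> bool:
    """Remove the proof, re-serialize, re-hash, and compare with the hash in the proof."""
    proof = issued[PROOF_KEY]
    stripped = {k: v for k, v in issued.items() if k != PROOF_KEY}
    return doc_hash(stripped) == proof["target_hash"]
```

The check only succeeds because serialize is deterministic: removing the proof and writing the document out again reproduces the original bytes exactly. With a PDF, that property depends on how the library writes the file.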

We had thought of several alternative approaches but this seemed to be the best one.

I see a lot of common ground. If it is of interest to you I would be happy to discuss further or even collaborate on this if it makes sense.

Thanks again for the feedback!
Kostas


Thanks,
Leonard

From: Kostas Karasavvas <kkarasavvas@gmail.com>
Date: Tuesday, October 20, 2020 at 11:42 AM
To: Credentials Community Group <public-credentials@w3.org>
Subject: using PDFs as a VC container...
Resent-From: <public-credentials@w3.org>
Resent-Date: Tuesday, October 20, 2020 at 11:40 AM


Hi all!

I would like to introduce the work that I have been involved with. I have been following the list/group for a while now and want to try to find the time to be more active   ... that, plus Leonard's presentation was a great opportunity/introduction :-)

TL;DR:
We have been using PDFs as a credentials' container and anchoring the document hashes on the blockchain. A blockchain proof (chainpointv2) is added in the PDF metadata which results in self-contained and self-verifiable PDF documents; the PDF is the only thing needed to view the credential and validate the credential*. There are open source validators that are trivial to use and/or host.

The longer version:
The University of Nicosia has been anchoring/issuing credentials (PDFs) for some of their courses on the Bitcoin blockchain since 2014. Back then the process was more manual/ad hoc. In 2016 the solution was redesigned to (more or less) what we have today:

- several PDFs are hashed and merklized, and the Merkle root is anchored in the blockchain (Bitcoin is used, but the approach is blockchain agnostic)
- Merkle proofs, txid, etc. are stored in the respective PDF's metadata (in JSON format)
- PDFs can be disseminated/stored as they always were
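
The hashing and merklizing step in the list above can be sketched roughly as follows. This is a generic Merkle tree in Python, not the project's implementation and not the chainpointv2 proof format: the (side, sibling) proof layout, the function names, and the choice to duplicate the last node on odd-sized levels are assumptions made for illustration.

```python
import hashlib
from typing import List, Tuple

Proof = List[Tuple[str, bytes]]  # (side of the sibling, sibling hash) per tree level


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root_and_proofs(leaves: List[bytes]) -> Tuple[bytes, List[Proof]]:
    """Return the Merkle root and, for each leaf, the sibling path up to the root."""
    if not leaves:
        raise ValueError("at least one leaf is required")
    level = list(leaves)
    positions = list(range(len(leaves)))   # where each original leaf sits in `level`
    proofs: List[Proof] = [[] for _ in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])        # assumption: duplicate the last node
        for leaf_i, pos in enumerate(positions):
            sib = pos ^ 1                  # index of the neighbour in the same pair
            side = "left" if sib < pos else "right"
            proofs[leaf_i].append((side, level[sib]))
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        positions = [p // 2 for p in positions]
    return level[0], proofs


def verify_proof(leaf: bytes, proof: Proof, root: bytes) -> bool:
    """Fold the sibling hashes into the leaf and compare the result with the root."""
    node = leaf
    for side, sibling in proof:
        node = h(sibling + node) if side == "left" else h(node + sibling)
    return node == root
```

In this sketch only the root would need to be anchored in a transaction; each document carries its own proof, so a validator can recompute the root from a single PDF's hash without seeing the other documents in the batch.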

Several companies already use PDFs so our solution fits into their workflows seamlessly. They had PDFs, they are anchored, and they still have PDFs**.

A meta-protocol was designed to encode the data stored in the blockchain in a way that allows on-chain issuing/validation and revocation of the credentials. The process is described in detail in: https://ieeexplore.ieee.org/document/8525400
(I can share if requested... not sure if the above is accessible to all)

An open source implementation (with tools) can be found at: https://github.com/verifiable-pdfs/blockchain-certificates
Validators at: https://github.com/verifiable-pdfs/validator-widget  and  https://github.com/verifiable-pdfs/blockchain-certificates-validation

Since 2017, all of the credentials and diplomas issued by the University have been issued using this platform. In 2019 the solution was commercialized (Block.co) to make it easier for potential clients (abstracting blockchain infrastructure, etc.). Several more clients and use cases have come up beyond educational credentials.

We looked into VC-compatibility in the past, but that work was frozen. Now we are again in the process of preparing/designing for it. We already use JSON, so adopting the VC format would be straightforward in itself.

However, I seem to be missing some parts, like how to formally define revocation mechanisms or how to formally include the blockchain proof. Is there a place where we can define our issuing / revocation mechanism so it can be reused/interoperable?

It would also be of great interest to see how we can properly use XMP, which Leonard mentioned, in a more standardized way in our solution.

I would be happy to discuss further with anyone interested and thank you all for your time reading this.

Regards,
Kostas

* Not unlike VCs other than the fact that the 'presentation'-layer is the verifiable container itself.
** The solution can apply to any file that supports custom metadata.

--
Konstantinos A. Karasavvas
Software Architect, Blockchain Engineer, Researcher, Educator
https://twitter.com/kkarasavvas



Received on Wednesday, 21 October 2020 13:04:09 UTC
