Re: Putting pdf data into VCs from Kristina Yasuda on 2020-11-03 (public-credentials@w3.org from November 2020)

From: Kristina Yasuda <Kristina.Yasuda@microsoft.com>
Date: Tue, 3 Nov 2020 01:39:39 +0000
To: Adrian Gropper <agropper@healthurl.com>, Kostas Karasavvas <kkarasavvas@gmail.com>
CC: "W3C Credentials CG (Public List)" <public-credentials@w3.org>, Zhen Chien Chia <zhchia@microsoft.com>
Message-ID: <HK2P15301MB0018504BF05429581438A6B5E5110@HK2P15301MB0018.APCP153.PROD.OUTLOOK.C>

Hi all, Kostas, Adrian,

Thank you very much for the responses! It was insightful to find out that PDF as a container is a use case not limited to the education sector.

I agree that displaying metadata makes more sense than reproducing a PDF from a VC. With decentralized storage, storing files while applying encryption protection was one of the challenges.

This raises an interesting question on how to integrate VCs with existing data formats (PDF being one): which to use as the container, the VC or the existing data format? One option would not interoperate with the other.

Kindest Regards,
Kristina

________________________________
From: Adrian Gropper <agropper@healthurl.com>
Sent: October 30, 2020 2:46
To: Kostas Karasavvas <kkarasavvas@gmail.com>
CC: Kristina Yasuda <Kristina.Yasuda@microsoft.com>; W3C Credentials CG (Public List) <public-credentials@w3.org>; Zhen Chien Chia <zhchia@microsoft.com>
Subject: Re: Putting pdf data into VCs

The healthcare use-case is around prescriptions: https://w3c.github.io/did-use-cases/#prescriptions

ePrescribing is a major market with tremendous privacy and public health issues and a lot of traffic as well. The current ePrescribing system in the US is inferior to the paper system it replaced.

* It reduces, if not effectively eliminates, the patient's ability to shop around.
* It has also led to a federation that excludes many innovative digital health records solutions.
* The network that manages prescriptions (SureScripts) is opaque and inaccessible to patients.
* SureScripts' restraint-of-trade practices are under investigation by the Federal Government.
* SureScripts is, in effect, the identity provider for 330 million people and is now marketing itself as a data broker beyond just prescriptions.
* ePrescribing of controlled substances (EPCS) is a major application in itself and needs strong, Federal-grade credentials and non-repudiable signatures.

PDF-based ePrescriptions could be designed to fix all of the above. The Free / libre HIE of One Trustee <https://hieofone.com/> project uses ePrescribing as a core demonstration. We need help adopting the SSI standards including VC, non-repudiable signatures, and timestamps.

Adrian

On Thu, Oct 29, 2020 at 5:57 AM Kostas Karasavvas <kkarasavvas@gmail.com> wrote:
Hi Kristina / all,

First off, to my knowledge there is no standardized way to do this.

Indeed, this is different from using the PDFs as the container. This is something we have been thinking about, and although I don't have any answers, here are my thoughts.

1) The institutions that provide the PDFs could also provide the machine-readable metadata (this is what we optionally do with the PDF container as well). There are decentralized storage solutions that won't introduce a centralized point of failure (assuming that the storage solution is self-sustainable enough to guarantee long-term usage), so the pointer idea could be feasible without compromising decentralization. Although less practical, several such solutions could be combined for an even more resilient result. We have put our investigation of this on hold, though; it did not seem that important to our clients.

2) Yes, this can be done, but the PDF won't be identical to the PDF that the institutions provide unless the two are constructed in an identical way. This is not practical, since each institution uses its own methods for that and a lot of them just scan the physical degrees! If the PDFs are not identical, I don't see the point of creating the PDF in the first place. Why not display the metadata in HTML?

That is one of the reasons we opted for the PDF as a container. It represented a digital version of their physical degrees, and a lot of these institutions were issuing the PDF degrees anyway. The physical degrees typically have several mechanisms to secure the document's validity (like holograms). For the digital version we anchor it into a blockchain. When validating the PDF you will see the PDF (identical to the physical one) and the blockchain verification details. This approach has the benefit that the exact PDF document that the institution created becomes tamperproof. The issuing institutions do what they always did with respect to storing/disseminating their certificates (with potential improvements).
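As an illustration only, the integrity check behind this kind of anchoring can be sketched as a digest comparison. This is a minimal sketch under stated assumptions, not the scheme the project actually uses: it assumes the anchored value is a SHA-256 hex digest of the exact PDF bytes, and it does not show how that digest is written to or read from a blockchain. The function names and the stand-in bytes are hypothetical.

```python
import hashlib


def pdf_digest(pdf_bytes: bytes) -> str:
    """Return the SHA-256 hex digest of the exact PDF bytes."""
    return hashlib.sha256(pdf_bytes).hexdigest()


def matches_anchor(pdf_bytes: bytes, anchored_digest: str) -> bool:
    """Compare a presented PDF against a digest assumed to have been
    retrieved from an anchoring transaction (retrieval is not shown)."""
    return pdf_digest(pdf_bytes) == anchored_digest.lower()


# Hypothetical stand-in for the bytes of an issued degree PDF.
issued_pdf = b"%PDF-1.7\n% stand-in bytes for an issued degree\n"

# At issuance time the institution would anchor this value (assumption).
anchored = pdf_digest(issued_pdf)

# At validation time any byte-level change makes the check fail.
print(matches_anchor(issued_pdf, anchored))
print(matches_anchor(issued_pdf + b" ", anchored))
```

Because the digest covers the raw bytes, a PDF regenerated from the same claims would only pass if it were byte-for-byte identical to the anchored one.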

We add certain information into the PDF to achieve this. The PDF itself (the visual representation) becomes a self-verifiable document, and while we currently use an ad hoc JSON format we do intend to move towards VCs. The idea is that a (universal) VC wallet would look for the VC in the metadata of the PDF and validate it accordingly.

Finally, what one could also do is include a base64 (or equivalent) encoding of the whole PDF inside the VC. Thus, the VC is the container and a specific visual representation is attached. I believe this defeats the purpose of having a PDF in the first place, but I would be open to counterarguments. Maybe it will work in your use case, but from my experience that would be contrary to the workflow that these institutions typically have.
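To make the "VC as container" option concrete, here is a minimal sketch of embedding the PDF bytes as base64 inside a credential-shaped JSON object. The `document`, `mediaType`, `encoding` and `data` property names, the DIDs, and the issuance date are hypothetical and not taken from any VC vocabulary; no proof/signature is attached, so this is not a complete verifiable credential.

```python
import base64
import json


def build_vc_with_pdf(pdf_bytes: bytes, issuer: str, subject_id: str) -> dict:
    """Wrap the whole PDF, base64-encoded, in a credential-shaped dict."""
    return {
        "@context": ["https://www.w3.org/2018/credentials/v1"],
        "type": ["VerifiableCredential"],
        "issuer": issuer,
        "issuanceDate": "2020-11-03T00:00:00Z",  # illustrative value
        "credentialSubject": {
            "id": subject_id,
            # Hypothetical property carrying the attached representation.
            "document": {
                "mediaType": "application/pdf",
                "encoding": "base64",
                "data": base64.b64encode(pdf_bytes).decode("ascii"),
            },
        },
    }


def extract_pdf(vc: dict) -> bytes:
    """Recover the original PDF bytes from the credential."""
    return base64.b64decode(vc["credentialSubject"]["document"]["data"])


# Hypothetical stand-in bytes and identifiers.
pdf = b"%PDF-1.7\n% stand-in bytes for a transcript\n"
vc = build_vc_with_pdf(pdf, "did:example:university", "did:example:student")

# The credential survives JSON serialization and yields identical bytes.
restored = extract_pdf(json.loads(json.dumps(vc)))
print(restored == pdf)
```

Note that base64 inflates the payload by roughly a third, which matters for scanned degrees.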

Hope the above are of some help,
Kostas

On Wed, Oct 28, 2020 at 9:03 PM Kristina Yasuda <Kristina.Yasuda@microsoft.com> wrote:
Hi all!

Many educational institutions issue credentials (transcripts, graduation certificates, etc.) in PDF format, and we have faced a question of how to create VCs using claims in PDFs. I am reaching out to the community to see if anyone has faced a similar issue and if there are standardized ways/best practices to:
1) put data from PDFs into VCs while keeping integrity without reliance on a centralized party? One option could be storing PDF files on centralized/decentralized servers and including a pointer to a file in a VC, but that would introduce a certain level of centralization.
2) reconstruct PDFs from claims in VCs?
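The pointer option in 1) can be sketched as a claim that carries both the file location and a digest of the file, so that the storage server does not have to be trusted for integrity (availability still depends on it). This is a minimal sketch, assuming SHA-256; the `location`, `digestAlgorithm` and `digest` field names and the URL are hypothetical, and fetching the file is not shown.

```python
import hashlib


def make_pointer_claim(pdf_bytes: bytes, location: str) -> dict:
    """Build a claim that points at a stored PDF and pins its digest."""
    return {
        "location": location,  # hypothetical field name
        "digestAlgorithm": "sha-256",
        "digest": hashlib.sha256(pdf_bytes).hexdigest(),
    }


def verify_fetched(claim: dict, fetched_bytes: bytes) -> bool:
    """Check bytes fetched from claim['location'] against the pinned digest."""
    return hashlib.sha256(fetched_bytes).hexdigest() == claim["digest"]


# Hypothetical stand-in bytes and storage URL.
original = b"%PDF-1.7\n% stand-in bytes for a graduation certificate\n"
claim = make_pointer_claim(original, "https://storage.example/cert-123.pdf")

print(verify_fetched(claim, original))
print(verify_fetched(claim, original.replace(b"graduation", b"honorary")))
```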

For example, the Modeling Educational Verifiable Credentials report (https://w3c-ccg.github.io/vc-ed-models/#biblio-obs-are-vcs), in section 1.5.3.1, shows an example of including PDF format in a VC, but how can the verifier reproduce a PDF record from the set of values in payload.data?

This is a little different from using PDFs as a container; it is rather about including information from PDFs in VCs.

Thank you very much!
Kristina

--
Konstantinos A. Karasavvas
Software Architect, Blockchain Engineer, Researcher, Educator
https://twitter.com/kkarasavvas

Received on Tuesday, 3 November 2020 01:40:01 UTC