Re: hashlinks vs trusty URIs from Leonard Rosenthol on 2020-06-08 (public-credentials@w3.org from June 2020)

From: Leonard Rosenthol <lrosenth@adobe.com>
Date: Mon, 8 Jun 2020 02:03:53 +0000
To: Kim Hamilton <kimdhamilton@gmail.com>, Manu Sporny <msporny@digitalbazaar.com>
CC: "W3C Credentials CG (Public List)" <public-credentials@w3.org>
Message-ID: <5313D0A4-3C82-4803-B782-6583488A39CD@adobe.com>
> Are there other groups focused on XML/RDF signatures and tooling
>
That’s existing work that was done back in the early 2010s. You should look at XML Signatures (https://www.w3.org/TR/xmldsig-core1/) as the starting point and then XAdES (https://en.wikipedia.org/wiki/XAdES), which is the EU/eIDAS standard that you’ll want to align with.

There was a lot of research back in those days around signing XML/RDF, but I am not aware of any specific standard.  I assume that since there is a canonicalization model, nothing additional was required.


>In theory, it seems like if the signature is computed on the RDF graph, it should preserve across XML/RDF and JSON-LD/RDF,
>but this is an example of something we've been hand-waving about and need to ensure.
>
You need to have a set of bits to hash, so what format is the graph in when you hash it? Is it XML/RDF? JSON-LD/RDF? Other?
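
To illustrate the point about needing a fixed set of bits, here is a minimal sketch, assuming two N-Triples documents that carry the same two statements in a different order. The example.org IRIs and the helper names are made up for illustration. Hashing the raw bytes gives different digests even though the set of statements is identical, so the byte format has to be agreed on before anything is hashed.

```python
import hashlib

# Two N-Triples documents with the same two statements in a different order.
# The example.org IRIs are hypothetical and only serve the illustration.
doc_a = (
    '<http://example.org/cred/1> <http://example.org/vocab#holder> "Alice" .\n'
    '<http://example.org/cred/1> <http://example.org/vocab#degree> "BSc" .\n'
)
doc_b = (
    '<http://example.org/cred/1> <http://example.org/vocab#degree> "BSc" .\n'
    '<http://example.org/cred/1> <http://example.org/vocab#holder> "Alice" .\n'
)


def raw_digest(document: str) -> str:
    """SHA-256 over the document bytes exactly as serialized."""
    return hashlib.sha256(document.encode("utf-8")).hexdigest()


def statements(document: str) -> frozenset:
    """The set of non-empty statement lines, ignoring their order."""
    return frozenset(line.strip() for line in document.splitlines() if line.strip())


same_graph = statements(doc_a) == statements(doc_b)        # same statements
same_bytes_hash = raw_digest(doc_a) == raw_digest(doc_b)   # different bytes

print("same statements:", same_graph)
print("same raw digest:", same_bytes_hash)
```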


> *   The EDCI effort is using XML VCs to comply with eIDAS legal signature requirements, but they don't have anything official from W3C to base that on and would like guidance
>
You mean guidance about an XML serialization of the VC data model, correct? All the XML DigSig stuff is already based on W3C standards (that are then extended by ETSI standards).

Leonard

From: Kim Hamilton <kimdhamilton@gmail.com>
Date: Sunday, June 7, 2020 at 7:05 PM
To: Manu Sporny <msporny@digitalbazaar.com>
Cc: "W3C Credentials CG (Public List)" <public-credentials@w3.org>
Subject: Re: hashlinks vs trusty URIs
Resent-From: <public-credentials@w3.org>
Resent-Date: Sunday, June 7, 2020 at 7:04 PM

OK, I missed that trusty URIs require an actual transformation in some cases, e.g.:

>  To support self-references, i.e. resources that contain their own trusty URI, the generation process involves not just to compute the hash from a given artifact but to actually transform the artifact into a new version that contains the newly generated trusty URI.

That's definitely not desirable. Ivan and Manu -- the approach you describe makes sense and is consistent with the current approach for LD proofs (the canonicalization step). I hadn't seen discussion of it yet, and I'm interested in joining wherever those conversations are happening. I'm also interested in learning more about what Ivan mentioned about XML signatures.

To back up, I don't have one specific question, rather a category of unknowns. Here's the context:

In VC-EDU we have a number of data standards that have historically been written in XML. Since RDF can be serialized as XML, we've not worried too much about the emphasis on JSON-LD (compared to XML) in VCs. But the time for hand-waving around this issue is past, so there are a variety of issues (ranging in depth):

  *   The VC data model lists certain syntaxes (JSON, JSON-LD), and while it's clear that's not meant to be exhaustive, it doesn't have recommendations for adding new syntaxes. Can we just do it? Or do we need to add some sort of extension? We'd like to have certain things hosted as official W3C artifacts (e.g. XML schemas), so maybe we just need to worry about the latter category of artifacts.
  *   Are there other groups focused on XML/RDF signatures and tooling (using similar approaches to our JSON-LD proofs)? Basically, we want to understand whether we should join existing efforts or build something new.
  *   In theory, it seems like if the signature is computed on the RDF graph, it should preserve across XML/RDF and JSON-LD/RDF, but this is an example of something we've been hand-waving about and need to ensure.
     *   The question about hashlinks vs trusty URIs was really a rathole on this fork of investigation, so ignore that for now.

Some examples of where these issues are arising:

  *   The EDCI effort is using XML VCs to comply with eIDAS legal signature requirements, but they don't have anything official from W3C to base that on and would like guidance.
  *   PESC/XML transcripts are very widely used in North America. They had been focused on mapping to JSON-LD, but would be interested in whether we're providing proper XML support.

Maybe a topic for a future CCG call? A lot to unpack here...

Thanks,
Kim

On Sat, Jun 6, 2020 at 7:19 AM Manu Sporny <msporny@digitalbazaar.com> wrote:
On 6/6/20 3:28 AM, Ivan Herman wrote:
> I would think having a separate vocabulary to make statements like
>
> <graph URI> <:hasHash> "hash value" .

My read on the paper is the same as Ivan's read.

The cleaner solution is to annotate the RDF graph, like the above, and
is effectively what the Linked Data Proof stuff does (as a part of graph
canonicalization).
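
A rough sketch of that annotation approach, under stated assumptions: the `hasHash` predicate IRI below is a hypothetical stand-in for Ivan's `:hasHash`, the function names are invented for this example, and the canonical bytes are taken as an input rather than produced by a real canonicalization step. The hash statement lives next to the graph, so the graph itself is never rewritten.

```python
import hashlib

# Hypothetical predicate IRI standing in for ":hasHash" in the statement above.
HAS_HASH = "http://example.org/vocab#hasHash"


def hash_annotation(graph_uri: str, canonical_bytes: bytes) -> str:
    """Build a separate statement of the form <graph URI> <:hasHash> "hash value" ."""
    digest = hashlib.sha256(canonical_bytes).hexdigest()
    return f'<{graph_uri}> <{HAS_HASH}> "{digest}" .'


def verify_annotation(annotation: str, graph_uri: str, canonical_bytes: bytes) -> bool:
    """Recompute the statement from a local copy of the graph bytes and compare."""
    return annotation == hash_annotation(graph_uri, canonical_bytes)


# Usage: the annotation is stored alongside the graph, not inside it.
annotation = hash_annotation("http://example.org/graph/1", b"abc")
print(annotation)
print(verify_annotation(annotation, "http://example.org/graph/1", b"abc"))
```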

Modifying the RDF graph or transforming it is what was being done for a
decade+ before Dave Longley invented the generalized solution.
Modification of an RDF graph to hash it has terrible complexity
consequences on software that needs to use the modified graph and
determine if that modified graph is the same one that is sitting on a
local system. In short, you create a very complex transformation and
comparison issue when you modify RDF graphs in order to hash them (or
refer to them using hashes).

To provide an alternative, the RDF Dataset Canonicalization Algorithm
canonicalizes the RDF graph in a way that a hash can be generated for it
without having to modify the original information. That hash could be
paired with hashlinks, but I'm struggling to understand the specific use
case (and don't have the spare cycles to put further thought into it
at this moment).
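
The idea of hashing a canonical form without touching the original can be sketched as follows. This is a deliberately simplified stand-in and not the RDF Dataset Canonicalization Algorithm: it assumes the graph has no blank nodes, so sorting the serialized N-Triples lines is enough to get a stable byte string. The `Triple` type, the function names, and the example.org IRIs are assumptions made for this example.

```python
import hashlib
from typing import Iterable, Tuple

# (subject IRI, predicate IRI, object already written as an N-Triples term)
Triple = Tuple[str, str, str]


def to_ntriples_line(triple: Triple) -> str:
    """Serialize one ground triple as an N-Triples statement."""
    subject, predicate, obj = triple
    return f"<{subject}> <{predicate}> {obj} ."


def canonical_hash(triples: Iterable[Triple]) -> str:
    """Hash a sorted, de-duplicated serialization of the triples.

    Simplification: only valid for graphs without blank nodes. The real
    algorithm also has to assign stable labels to blank nodes.
    """
    lines = sorted({to_ntriples_line(t) for t in triples})
    canonical = "".join(line + "\n" for line in lines)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


graph = [
    ("http://example.org/cred/1", "http://example.org/vocab#holder", '"Alice"'),
    ("http://example.org/cred/1", "http://example.org/vocab#degree", '"BSc"'),
]
reordered = list(reversed(graph))

# Statement order does not affect the digest, and the input list is left as it was.
print(canonical_hash(graph) == canonical_hash(reordered))
```

The resulting digest is just a string, so it could then be carried by whichever mechanism expresses a hash plus metadata about it.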

I only had about 15 minutes to read through the Trusty URIs paper (first
time I had heard of it, nice arXiv archeology work!). My takeaway is
that it does things that are unnecessary with the solutions we have
available to us today. The basic generalized RDF hashing building blocks
are there via the RDF Dataset Canonicalization Algorithm. That hash can
then be used by any technology that can express a hash and metadata
about that hash (Linked Data Proofs/Signatures, Hashlinks, Magnet URIs,
Named Information, etc.)

Understanding the specific use case you're going after might help... or
if you think Trusty URIs can do something that can't be done with the
current generalized tooling we have?

-- manu

--
Manu Sporny - https://www.linkedin.com/in/manusporny/
Founder/CEO - Digital Bazaar, Inc.
blog: Veres One Decentralized Identifier Blockchain Launches
https://tinyurl.com/veres-one-launches
Received on Monday, 8 June 2020 02:04:18 UTC