- From: Ivan Herman <ivan@w3.org>
- Date: Tue, 19 Sep 2023 19:57:41 +0200
- To: Dave Longley <dlongley@digitalbazaar.com>
- Cc: Dan Yamamoto <dan@iij.ad.jp>, Manu Sporny <msporny@digitalbazaar.com>, Phil Archer <phil.archer@gs1.org>, Sebastian Crane <seabass-labrax@gmx.com>, Gregg Kellogg <gregg@greggkellogg.net>, RDF Dataset Canonicalization and Hash Working Group <public-rch-wg@w3.org>
- Message-Id: <9F71E8A6-0D1C-4242-96E7-13A1F1CA38C8@w3.org>
> On 19 Sep 2023, at 17:47, Dave Longley <dlongley@digitalbazaar.com> wrote:
>
> My thoughts:
>
> We've already expressed an identifier for the RDF Dataset
> Canonicalization Algorithm: RDFC-1.0 -- and it uses a default hash
> algorithm, SHA-256 internally ... and any other hash algorithm that
> could be used with it will similarly have its own identifier
> ("SHA-384", "SHAKE256", so on). These things are decoupled from one
> another (RDFC-1.0 works with any hash algorithm) and specifying which
> of these has been used (if it deviates from the default) seems to be
> in the domain of whatever format / standard / etc. is used to express
> metadata about either the canonicalized dataset or a hash of it
> (which, notably, would further include another hash algorithm which
> may or may not be the same).
>
> I don't think it's a good idea to invent a new hash metadata
> expression mechanism in this group. These things exist elsewhere (such
> as multihash, or SRI, or RFC 6920) and some of them have their own
> registries where this metadata goes and where it is mapped to
> identifiers and / or "header values" that work within those specific
> formats. That is the right place, IMO, to put this kind of information
> and to enable interoperability on processing however it is expressed.

I agree. That is essentially what I proposed in my first reaction to Sebastian's mail.

Ivan

>
> The specification we've produced is what enables someone to use
> whatever metadata parameters they parse from (or input into) such
> expressions to reproduce / verify / etc. some expected value. I see
> our specification as being similar to the SHA-256 specification, that
> indicates how to produce such a digest, but it does not define a hash
> metadata expression format itself.
>
> On Tue, Sep 19, 2023 at 9:09 AM Dan Yamamoto <dan@iij.ad.jp> wrote:
>>
>> I also probably share the same opinion with Ivan. Since RDFC-1.0 isn't
>> always used alongside Data Integrity, I thought it would be better for
>> it to have some precise algorithm identifier on its own.
>>
>> Dan
>>
>> On 2023/09/19 0:49, Ivan Herman wrote:
>>>
>>>
>>>> On 18 Sep 2023, at 17:26, Manu Sporny <msporny@digitalbazaar.com> wrote:
>>>>
>>>> On Mon, Sep 18, 2023 at 11:15 AM Phil Archer <phil.archer@gs1.org> wrote:
>>>>> From: Dan Yamamoto <dan@iij.ad.jp>
>>>>> Therefore, I believe the internal hash function should be
>>>>> interchangeable. However, as others have suggested, I think there is
>>>>> a need to introduce a mechanism to specify what hash function is used
>>>>> explicitly.
>>>>
>>>> Just to jump in quickly on this thread; it feels like the harms are
>>>> being exaggerated given the way we know that RDFC-1.0 is used today.
>>>> If we look at how the VC Data Integrity specifications use the
>>>> algorithm, you /always/ know which internal hash algorithm was used
>>>> (or should be used) because it's signalled to you via the Data
>>>> Integrity algorithm identifier. You don't have to guess, you are told
>>>> exactly which internal hash algorithm to use.
>>>>
>>>> I wonder if folks are missing this detail? It was always expected that
>>>> the internal hash information would be signalled to the caller, and
>>>> that's exactly what Data Integrity does. Perhaps all we need to do in
>>>> the spec is ensure that one of the outputs is the internal hash
>>>> function used and to tell spec writers that use RDFC-1.0 that any
>>>> algorithm that uses it needs to clearly stipulate which internal
>>>> algorithm to use when calling the algorithm (and if not, the default
>>>> will be used)?
>>>>
>>>
>>> I do not think the issue is with spec writers. RDFC-1.0 is meant for any
>>> lambda users of Linked Data, not only for spec writers. While what you
>>> say is o.k., what we need is a way to convey the information of what
>>> hash function was used when we provide the hash of a specific graph,
>>> because that hash may travel from one lambda user to the other.
>>>
>>> Ivan
>>>
>>>
>>>
>>>> This feels more like a miscommunication than a design issue. Does the
>>>> above help clarify?
>>>>
>>>> -- manu
>>>>
>>>> --
>>>> Manu Sporny - https://www.linkedin.com/in/manusporny/
>>>> Founder/CEO - Digital Bazaar, Inc.
>>>> https://www.digitalbazaar.com/
>>>>
>>>
>>>
>>> ----
>>> Ivan Herman, W3C
>>> Home: http://www.w3.org/People/Ivan/
>>> mobile: +33 6 52 46 00 43
>>>
>>>
>>
>> --
>> Dan Yamamoto <dan@iij.ad.jp>
>> Internet Initiative Japan Inc.
>>
>>
>>
>
>
> --
>
> Dave Longley
> CTO
> Digital Bazaar, Inc.

----
Ivan Herman, W3C
Home: http://www.w3.org/People/Ivan/
mobile: +33 6 52 46 00 43
Received on Tuesday, 19 September 2023 17:58:08 UTC