Re: Chartering work has started for a Linked Data Signature Working Group @W3C from Manu Sporny on 2021-05-06 (semantic-web@w3.org from May 2021)

From: Manu Sporny <msporny@digitalbazaar.com>
Date: Wed, 5 May 2021 23:43:44 -0400
To: Dan Brickley <danbri@google.com>
Cc: Phil Archer <phil.archer@gs1.org>, Ivan Herman <ivan@w3.org>, Dan Brickley <danbri@danbri.org>, Aidan Hogan <aidhog@gmail.com>, Pierre-Antoine Champin <pierre-antoine@w3.org>, Ramanathan Guha <guha@google.com>, semantic-web <semantic-web@w3.org>
Message-ID: <f3abe222-2212-9580-8e8c-92229dc3df8a@digitalbazaar.com>

On 5/4/21 11:55 AM, Dan Brickley wrote:
> The first sentence of the charter grounds its importance in terms of 
> "The deployment of Linked Data is increasing at a rapid pace.",

I'm wondering if it would just be best to cut out some of the more 
marketing-related language in the document... we probably don't need to 
sell this community on the fact that people are using more RDF out there 
as the years roll on, thanks to schema.org, healthcare, and now the 
Verifiable Credentials and Decentralized Identifiers work.

If we tone that language down, Dan... and point to, I don't know, press 
releases of federal governments funding cohorts of companies to do 
Verifiable Credentials to replace current paper credentials, would that 
be easier to digest? Things like this:

https://docs.google.com/presentation/d/1MeeP7vDXb9CpSBfjTybYbo8qJfrrbrXCSJa0DklNe2k/edit

That's the US Federal government running interop plugfests using W3C 
standards (RDF, JSON-LD) and the stuff we need to standardize (RDF 
Dataset Canonicalization, Linked Data Signatures, etc.). Would that be a 
better demonstration of adoption and need, Dan?

> Consider that for the GS1-related / Product data usecases, Phil seems to 
> see things differently from Manu.

No, I think Phil's comments and mine were very aligned. You seem to have 
copied a part of the conversation that I had no intention of relating to 
what Phil said.

We need to understand how to protect schemas... yes. That is not our top 
priority, though... we need to get RDF Dataset Canonicalization done 
first... and then the Linked Data Integrity/Signature stuff... and THEN 
we can have a conversation about protecting schemas because we'll have 
the pieces in place to finally do that.

The same for the "where does what we sign end?" question. My hope is 
that we'll be focused and sane about it and go... well, it's whatever 
was fed in as input to the algorithms. This is usually a fairly 
self-contained document... yes, you need to pay attention to what you're 
signing, but that's more of an implementation guide thing. That 
discussion could wrap the WG around the axle, so we need to be fairly 
stern about priorities and order of standardization.

> I am sympathetic to Manu's point that it might take years to see how 
> signing plays out w.r.t. schemas and remote dependencies, and hopefully 
> there is at least some usefulness in having some more building blocks 
> for signed RDF in the meantime. Manu - do you have more pointers to the 
> "schemas cached client-side" approach that's emerging? Is it documented 
> anywhere?

The work is still a bit early days... I'd say we're entering the 2nd 
generation of approaches after getting some field testing in over the 
past 5+ years. In JSON-LD land, we tend to use these things called 
"Document Loaders" -- which are things that are used to load JSON-LD 
Contexts, and DID Documents, and other things that need to be fetched 
from the network from time to time. They cache aggressively, and in many 
cases... especially in use cases that require high security or 
cryptographic operations, we NEVER load from the network... but instead 
use vetted software packages that always load (for example) the JSON-LD 
Context from disk. Here's an example of a JSON-LD Context that is used 
by these "Document Loaders" when generating and verifying Linked Data 
Signatures:

https://github.com/digitalbazaar/ed25519-signature-2020-context

and an example of it being used with software to generate a Linked Data 
Signature:

https://github.com/digitalbazaar/vc-js#setting-up-a-signature-suite

The signature suite above never loads that context from the network... 
it always loads it from disk. In the future it /could/ load it from the 
network and check a signature... but really, that feels a bit Rube 
Goldbergian -- just load it from disk and you know what version of the 
context you're working with.
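
To make the "never load from the network" idea concrete, here is a 
minimal sketch of a static document loader. It is written in Python 
rather than the JavaScript used by the libraries linked above, and it is 
not the vc-js implementation. The context URL and body are placeholders, 
an in-memory dict stands in for files shipped on disk with a vetted 
package, and the names static_document_loader and UnknownDocumentError 
are hypothetical. The returned dict follows the contextUrl / documentUrl 
/ document shape that JSON-LD document loaders conventionally return.

```python
# Hypothetical example: the URL and the context body are placeholders,
# not the real Ed25519Signature2020 context.
BUNDLED_CONTEXTS = {
    "https://example.org/contexts/demo/v1": {
        "@context": {"name": "https://schema.org/name"}
    },
}


class UnknownDocumentError(LookupError):
    """Raised instead of falling back to a network fetch."""


def static_document_loader(url):
    """Return a bundled document for `url`; never touch the network."""
    try:
        document = BUNDLED_CONTEXTS[url]
    except KeyError:
        # Failing closed is the point: an unvetted URL is an error,
        # not a cue to go and fetch something.
        raise UnknownDocumentError(
            f"refusing to fetch {url!r} from the network"
        ) from None
    return {"contextUrl": None, "documentUrl": url, "document": document}


if __name__ == "__main__":
    loaded = static_document_loader("https://example.org/contexts/demo/v1")
    print(loaded["document"])
```

Because the loader fails closed, the version of each context is pinned 
by whatever version of the package is installed, which is the property 
described above.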

> The most 
> obvious topic here would be the application of Verifiable Claims to 
> Covid-related "passports", with vaccination records etc. I understand VC 
> is being used in that setting. Is VC for covid vaccination (etc.) 
> blocked in any way by the absence of the proposed work items in this 
> group? Can a usecase be articulated?

I think you mean "the most politically radioactive topic"... :P

Here is the result of a Vaccination Credential test suite showing NxN 
interop matrices ... all companies are using RDF Dataset Normalization, 
Linked Data Signatures, etc... this is work that some of the folks 
involved in the World Health Organization initiative are doing:

https://w3id.org/vaccination/interop-reports#Polio

Is the work blocked? No, it's worse, everyone is just barreling ahead 
assuming W3C is going to do the Linked Data Signatures work... because 
/obviously/ they would, right? :P

> I am glad we're having this conversation, because it is good to 
> stabilize some terminology (at least in the purpose of this charter/WG, 
> as Ivan says), rather than have the WG be launched on the basis of 
> confusions.

One of the elephants in the room is that the definition of "Linked Data" 
has changed over the years, and people keep pointing back to the old 
stuff... while companies are selling the new stuff. It might help to 
just define what Linked Data is in the explainer document so we're all 
working from the same definition and not some old link on the Web.

> I am having a hard time imagining how "...that are Linked Data but are 
> not RDF" and "the terms RDF and Linked Data are interchangeable" can be 
> simultaneously true; could we walk through an example in the context of 
> this charter?

I provided an example to Peter, I hope that helps.

> If the "Linked Data Signatures specification" is expected to create new 
> W3C technology that is likely applicable outside of RDF, charter 
> reviewers ought to know about it.

I don't think we're going to directly create something that is "outside 
of RDF" in the first charter... but we should be aware enough about it 
to document that this is possible and that extension point is by design. 
Perhaps through a NOTE of some sort?

> But there may also be use cases that are implementable without the RDF 
> content being canonicalized, or with the canonicalization being at a 
> different level of abstraction (e.g. RDFa-in-HTML content using 
> HTML-level canonicalization). There may be important cases where the OWL 
> level of abstraction is seen as important by some constituencies.

Yes, that's exactly the use case that I want to make sure that we don't 
accidentally extinguish. We want to make the jump to RDF easier, not harder.

> Are the folks that don't like RDF expecting to join this WG that is 
> according to Ivan, entirely devoted to RDF?

I expect Microsoft might join to protect their investment in JOSE... 
there is a Linked Data Suite that uses JOSE. Last time I checked, 
Microsoft was not keen on RDF, and we've tried to bring them to the 
table repeatedly by providing affordances in Linked Data Signatures that 
didn't burn that bridge.

> I am torn --- as an RDF technologist, absolutely I see value in having 
> common infrastructure around bnode labeling. And that can be useful 
> without any crypto whatsoever, e.g. as utility functions in software it 
> would be handy. Mixed with crypto it absolutely is interesting, but is 
> there perhaps a piece of work that might be harder because it engages 
> with more groups, which pushes the non-RDF aspects of what's proposed 
> here into a broader W3C space? How far can an RDF-agnostic "just sign 
> the bits" approach be made to work for the usecases W3C cares most about?

My personal opinion is that there are some fairly significant drawbacks 
to the "Just sign the bits" approach... it's the whole reason we have 
put so much effort into the Linked Data Integrity stack... when you have 
serialization-agnostic signatures, you can do some pretty amazing stuff 
(like the way-better-than-DEFLATE compression we get when you do 
semantic compression in CBOR-LD).

https://docs.google.com/presentation/d/1ksh-gUdjJJwDpdleasvs9aRXEmeRvqhkVWqeitx5ZAE/edit
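
As a rough illustration of the difference between "just sign the bits" 
and a serialization-agnostic approach, the sketch below hashes two 
N-Triples documents that contain the same statements in a different 
order. The toy_canonicalize function is a stand-in assumption: it only 
sorts and de-duplicates lines, it assumes one triple per line and no 
blank nodes, and it is NOT the RDF Dataset Canonicalization algorithm 
(which exists largely to deal with blank node labeling). The example 
data and function names are hypothetical.

```python
import hashlib


def hash_bytes(data: bytes) -> str:
    """Hash the raw bytes, as a 'just sign the bits' approach would."""
    return hashlib.sha256(data).hexdigest()


def toy_canonicalize(ntriples: str) -> str:
    """Toy canonical form: strip, de-duplicate and sort triple lines.

    Assumes one triple per line and no blank nodes; a real algorithm
    must also assign stable labels to blank nodes.
    """
    lines = {line.strip() for line in ntriples.splitlines() if line.strip()}
    return "".join(line + "\n" for line in sorted(lines))


# Same two statements, serialized in a different order.
doc_a = (
    '<https://example.org/alice> <https://schema.org/name> "Alice" .\n'
    '<https://example.org/alice> <https://schema.org/jobTitle> "Engineer" .\n'
)
doc_b = (
    '<https://example.org/alice> <https://schema.org/jobTitle> "Engineer" .\n'
    '<https://example.org/alice> <https://schema.org/name> "Alice" .\n'
)

raw_match = hash_bytes(doc_a.encode()) == hash_bytes(doc_b.encode())
canonical_match = (
    hash_bytes(toy_canonicalize(doc_a).encode())
    == hash_bytes(toy_canonicalize(doc_b).encode())
)

if __name__ == "__main__":
    print("raw bytes match:", raw_match)
    print("canonical forms match:", canonical_match)
```

The raw hashes differ because the bytes differ, while the canonical 
hashes agree because the statements are the same. A digest that is 
independent of serialization is what would let a signature survive 
re-encoding the same data in another syntax.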

> I remember you were keeping an eye on the debates around "Signed HTTP 
> Exchanges" and Web Packaging, for example. Last I checked in there it 
> wasn't clear there was consensus about browser-UI aspects, but maybe 
> there could be some other common agendas worth exploring? 

I hesitate to bring the browser vendors into an RDF-heavy group... I can 
just see the chairs flying now.

-- manu

-- 
Manu Sporny (skype: msporny, twitter: manusporny)
Founder/CEO - Digital Bazaar, Inc.
blog: Veres One Decentralized Identifier Blockchain Launches
https://tinyurl.com/veres-one-launches
Received on Thursday, 6 May 2021 03:44:06 UTC