- From: Dan Brickley <danbri@google.com>
- Date: Tue, 4 May 2021 16:55:37 +0100
- To: Manu Sporny <msporny@digitalbazaar.com>
- Cc: Phil Archer <phil.archer@gs1.org>, Ivan Herman <ivan@w3.org>, Dan Brickley <danbri@danbri.org>, Aidan Hogan <aidhog@gmail.com>, Pierre-Antoine Champin <pierre-antoine@w3.org>, Ramanathan Guha <guha@google.com>, semantic-web <semantic-web@w3.org>
- Message-ID: <CAK-qy=59A-5uZ88DiOPF+95qEqSc3kLTYCfnnT1XnhmGcU6SQA@mail.gmail.com>
On Tue, 4 May 2021 at 15:40, Manu Sporny <msporny@digitalbazaar.com> wrote:
>
> On 5/4/21 10:01 AM, Dan Brickley wrote:
> > For now I'd just add: let's not wait until the WG is chartered before
> > clarifying usecases - the lack of these may be why there's apparently
> > disagreement amongst the work's primary advocates on what is in vs out of
> > scope.
>
> Dan, have you seen the current set of use cases?
>
> https://w3c.github.io/lds-wg-charter/explainer.html#usage

Yes. My concern in the original post was that:

*The charter opens as follows:*

*“There are a variety of established use cases, such as Verifiable Credentials <https://www.w3.org/TR/vc-data-model>, the publication of biological and pharmaceutical data, consumption of mission critical RDF vocabularies, and others, that depend on the ability to verify the authenticity and integrity of the data being consumed (see the use cases <https://w3c.github.io/lds-wg-charter/explainer.html#usage> for more examples).”*

*Currently the charter only alludes wavily to a “variety of established use cases”, and cites its specific “use cases” for “more”.*

... i.e. those that you're pointing to are additional to presumed widely known usecases, ... they're "more", not the core.

The first sentence of the charter grounds its importance in terms of "The deployment of Linked Data is increasing at a rapid pace.", and we understand from Ivan that this means the same as "The deployment of RDF is increasing at a rapid pace". It links to http://webdatacommons.org/structureddata/#toc3 which is about "Microdata, RDFa, JSON-LD, and Microformat Data Sets", from public web crawl extractions by the webdatacommons team.

The charter talks about "Detecting changes in datasets" as a typical usecase. It would be good to tie that to any of the "increasing at a rapid pace" adoption reported in http://webdatacommons.org/structureddata/.

Consider that for the GS1-related / Product data usecases, Phil seems to see things differently from Manu.

Phil: "Where I think I seem to have more sympathy than some with Dan's original commentary, is the issue of a fixed/signed dataset containing links to external sources of data and definitions that are not under the signee's control. That is, if my signed RDF dataset includes data expressed using schema:Product, and the definition of schema:Product changes, what value does my signature have now? This is an issue that I think the WG will need to address - that is, we'll need to set a boundary on what should and should not be inferred by the presence of whatever crypto doo-hickey surrounds the data. IMO, it seems clear that we cannot sign the meaning. ... And there's the irony. We can't sign the semantics in a Semantic Web dataset unless we also retrieve all externally-referenced sources and sign an immutable local copy of those as well (I'm really hoping no one thinks that's a good idea ☹ )"

Manu: [responding to Dan saying] "> Are we convinced that there is application-level value in having assurances over instance data without also having them for the schemas and ontologies they are underpinned by?"

Manu: "Yes, I am. Much of the work in Verifiable Credentials utilize schemas that are cached client-side (usually permanently, and enforced by software). We don't need schemas to adopt the technology for it to be useful. It would be more useful if schema publishing used the technologies, but I don't think anyone is placing that as a MUST along this road (because there is no need to create a dependency there)."

I am sympathetic to Manu's point that it might take years to see how signing plays out w.r.t. schemas and remote dependencies, and hopefully there is at least some usefulness in having some more building blocks for signed RDF in the meantime.

Manu - do you have more pointers to the "schemas cached client-side" approach that's emerging? Is it documented anywhere?

As Phil says, "if my signed RDF dataset includes data expressed using schema:Product, and the definition of schema:Product changes, what value does my signature have now?". Given that the charter speaks also of "the publication of biological and pharmaceutical data", it would be good to have an explicit usecase from that world, and to work through this issue in that domain. If schema caching and/or signing isn't a concern, that would be good to know. If there are emerging practices, that would also be good to know.

The most obvious topic here would be the application of Verifiable Claims to Covid-related "passports", with vaccination records etc. I understand VC is being used in that setting. Is VC for covid vaccination (etc.) blocked in any way by the absence of the proposed work items in this group? Can a usecase be articulated?

>
> ------------------------
>
> Speaking as one of the Editors of the input specifications... As a related
> aside, and at the risk of completely derailing this thread, it is possible to
> use the Linked Data Signatures specification to sign data payloads that are
> Linked Data but are not RDF.

Ivan wrote: "I would propose to agree that, for the purpose of this charter and WG, the terms RDF and Linked Data are interchangeable; this is certainly the way the WG intends to pursue its work."

I am glad we're having this conversation, because it is good to stabilize some terminology (at least for the purpose of this charter/WG, as Ivan says), rather than have the WG be launched on the basis of confusions. I am having a hard time imagining how "...that are Linked Data but are not RDF" and "the terms RDF and Linked Data are interchangeable" can be simultaneously true; could we walk through an example in the context of this charter?

Ivan also wrote, "To further narrow down the discussion, let us also concentrate on what this charter proposes to do.
It proposes to provide a standard for the canonicalization of, and to calculate a hash for, an RDF Graph or an RDF Dataset. (There are some additional, say, "engineering" issues like how to express the algorithms and their result in RDF, but that is, comparatively, minor.) That is it."

If the "Linked Data Signatures specification" is expected to create new W3C technology that is likely applicable outside of RDF, charter reviewers ought to know about it. Keeping the gap between the RDF world and everyone else as small as possible makes a lot of sense.

The most obviously applicable "not an RDF file" artifact we could consider here is out-of-band JSON-LD context definition files. For example, editing Schema.org's context file can cause an unchanged installation of Apache Jena to give different RDF output from byte-for-byte identical input. But there may also be use cases that are implementable without the RDF content being canonicalized, or with the canonicalization being at a different level of abstraction (e.g. RDFa-in-HTML content using HTML-level canonicalization). There may be important cases where the OWL level of abstraction is seen as important by some constituencies.

> The Linked Data Signatures signing algorithm consists of 4 phases:
>
> 1. Canonicalization of input data
> 2. Cryptographic hashing
> 3. Digitally signing
> 4. Expressing the signature
>
> RDF really only comes into play in steps #1 and #4... and it's possible for it
> to not come into play at all.
>
> For example, you can use JCS[1] to canonicalize in step #1, and simple
> key-values to express the signature in #4. Workday and Microsoft do this today
> with one of their Linked Data Cryptosuites.
>
> Now, do I think this is a good idea -- no, I'm not too keen on it; but
> enabling others to put forward alternatives based upon a standard is useful.
>
> Should the WG prioritize this aspect of Linked Data Signatures -- no, we
> should get the RDF bits right.
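
For concreteness, here is a rough Python sketch of the four phases quoted above, with a JCS-style canonicalization in step 1 and plain key-values in step 4. It is an illustration only, not the Linked Data Signatures algorithm or any deployed cryptosuite: `canonicalize` only approximates RFC 8785, the HMAC is a stand-in for a real digital signature, and the function names and the `proof`, `ExampleHmacProof` and `signatureValue` keys are made up for the example.

```python
import hashlib
import hmac
import json


def canonicalize(data):
    """Phase 1: simplified stand-in for JCS (RFC 8785).

    Sorts object keys and drops insignificant whitespace. Real JCS also
    fixes number/string serialization and sorts keys by UTF-16 code units,
    so this is only an approximation.
    """
    text = json.dumps(data, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False)
    return text.encode("utf-8")


def sign_document(data, key):
    """Run the four phases over a JSON-like dict and return a signed copy."""
    canonical = canonicalize(data)                    # 1. canonicalization
    digest = hashlib.sha256(canonical).digest()       # 2. cryptographic hashing
    # 3. "signing": HMAC is an assumption standing in for a real
    #    asymmetric signature scheme, to keep the sketch stdlib-only.
    signature = hmac.new(key, digest, hashlib.sha256).hexdigest()
    # 4. expressing the signature as simple key-values (hypothetical names)
    proof = {"type": "ExampleHmacProof", "signatureValue": signature}
    return {**data, "proof": proof}


def verify_document(signed, key):
    """Recompute the signature over everything except the proof."""
    data = {k: v for k, v in signed.items() if k != "proof"}
    expected = sign_document(data, key)["proof"]["signatureValue"]
    return hmac.compare_digest(expected, signed["proof"]["signatureValue"])
```

Note that nothing here touches RDF: two inputs that differ only in key order get the same signature, but no blank-node labeling or graph-level canonicalization is involved, which is the part the charter is actually about.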
>
> This is why we chose the "Linked Data" moniker... because it's not entirely
> about RDF... we have folks that don't like RDF that do use JSON-LD (and seem
> to like it).

Are the folks that don't like RDF expecting to join this WG that is, according to Ivan, entirely devoted to RDF?

> Saying that the output of the WG is *only* about RDF would
> alienate a significant part of that community... and it would also be
> technically incorrect.
>
> Now, all that said -- we should have a razor sharp focus on getting the RDF
> bits right, because that's what most of the supporters of the Charter need.
> Simultaneously, we shouldn't do anything to prevent these non-RDF (but still
> "Linked Data") use cases... and that's the concern w/ stripping all the
> "Linked Data" language out of the charter.

+1

> It does feel like we're all on the same page here wrt. focus -- we don't want
> a perma-WG... we want something specific that's highly focused.

Yup - totally agree.

> Simultaneously, we don't want the future non-RDF stuff to suffer just because
> people were under the mistaken impression that Linked Data Signatures ONLY
> works for RDF inputs.

I am torn --- as an RDF technologist, absolutely I see value in having common infrastructure around bnode labeling. And that can be useful without any crypto whatsoever, e.g. as utility functions in software it would be handy. Mixed with crypto it absolutely is interesting, but is there perhaps a piece of work that might be harder because it engages with more groups, which pushes the non-RDF aspects of what's proposed here into a broader W3C space? How far can an RDF-agnostic "just sign the bits" approach be made to work for the usecases W3C cares most about?

I remember you were keeping an eye on the debates around "Signed HTTP Exchanges" and Web Packaging, for example. Last I checked in there it wasn't clear there was consensus about browser-UI aspects, but maybe there could be some other common agendas worth exploring?
https://github.com/w3c/strategy/issues/171#issuecomment-603280405 etc.

cheers,

Dan

> -- manu
>
> [1]https://tools.ietf.org/html/rfc8785
>
> --
> Manu Sporny - https://www.linkedin.com/in/manusporny/
> Founder/CEO - Digital Bazaar, Inc.
> blog: Veres One Decentralized Identifier Blockchain Launches
> https://tinyurl.com/veres-one-launches
>
Received on Tuesday, 4 May 2021 15:56:30 UTC