- From: Marcel Fröhlich <marcel.frohlich@gmail.com>
- Date: Tue, 11 May 2021 14:24:15 +0200
- To: Dan Brickley <danbri@danbri.org>
- Cc: Hugh Glaser <hugh@glasers.org>, Ivan Herman <ivan@w3.org>, lars.svensson@web.de, semantic-web <semantic-web@w3.org>
- Message-ID: <CAHKA4Ly3b4gHm9s1fuw8x+-9XDtQ5vV1H=8dHTvaX_0fwrofhQ@mail.gmail.com>
To cover more relevant use cases, it might be useful to invite someone from IDSA (International Data Spaces Association, https://internationaldataspaces.org/ ).

They build an infrastructure with an elaborate set of participant roles for automatically discovering, negotiating and sharing data. Being able to sign data will definitely be a relevant and useful mechanism in this context.

Here is a one-page overview of a data space à la IDSA:
https://internationaldataspaces.org/wp-content/uploads/IDSA-Infographic-Data-Sharing-in-a-Data-Space.pdf

Cheers,
Marcel

On Tue, 11 May 2021 at 13:42, Dan Brickley <danbri@danbri.org> wrote:

> On Tue, 11 May 2021 at 11:38, Hugh Glaser <hugh@glasers.org> wrote:
>
>> I think Lars was making a much simpler point, but am likely to be wrong. :-)
>
> Yes! I often make the same point and tend to default to “claim” instead of “statement”; but then who or what is making the claim? In a schema.org setting it often works to view the claims as being made in-or-by the containing page, with further anchoring to humans or orgs being left for others to investigate.
>
>> Surely none of this is about anything agreed to be "factual" (the OED says a "fact" is "a thing that is known or proved to be true.")
>> And saying the RDF being signed is factual takes us down a bad road of an implication that because something is signed, it has some inherent truth property.
>
> Yep - I use “factual data” sometimes as a shorthand for “the kind of data that expresses facts”. Propositional is another nearby word (https://www.britannica.com/topic/epistemology/The-other-minds-problem#ref848894) but let’s not go there.
>
>> It is seductive, and I see Phil says "At the time we signed it, the statements were true".
>> I would have thought it was more like "At the time we signed it, the statements were what we wanted to sign."
>
> Signers will want to know what the W3C specs will imply about their relationship to the signed material...
>
> Dan
>
>> So "simple statements expressed in RDF" seems more accurate to me.
>>
>> Best
>> Hugh
>>
>> > On 11 May 2021, at 10:14, Marcel Fröhlich <marcel.frohlich@gmail.com> wrote:
>> >
>> > On Tue, 11 May 2021 at 10:48, Dan Brickley <danbri@danbri.org> wrote:
>> > On Tue, 11 May 2021 at 09:29, <lars.svensson@web.de> wrote:
>> > (Trimming cc...)
>> >
>> > Dear all,
>> >
>> > very interesting discussion!
>> >
>> > Maybe I'm nitpicking too much, but IMHO the expression "simple factual data expressed in RDF" is incorrect. RDF does not express facts but statements (facts are true, statements may or may not be true, depending on your POV).
>> >
>> > I suggest replacing that with "simple statements expressed in RDF".
>> >
>> > I am sympathetic - but this now gets to the heart of the matter. As factual data they are state-able, but to claim or state them, we need a state-er. How is the party making the statement related to the party signing the rdf or dataset? Even the former is nuanced, but rdf datasets give an additional level of indirection.
>> >
>> > +1
>> >
>> > Yes, ideally there should be a separate part with statements that explicitly clarify which claims about the data the signature subscribes to, i.e. the relationship between signatory and data.
>> >
>> > Best, Marcel
>> >
>> > Best,
>> >
>> > Lars
>> >
>> > Sent: Monday, 10 May 2021 at 20:23
>> > From: "Ivan Herman" <ivan@w3.org>
>> > To: "Dan Brickley" <Danbri@danbri.org>
>> > Cc: "Aidan Hogan" <aidhog@gmail.com>, "Dan Brickley" <danbri@google.com>, "Manu Sporny" <msporny@digitalbazaar.com>, "Markus Sabadello" <markus@danubetech.com>, "Phil Archer" <phil.archer@gs1.org>, "Pierre-Antoine Champin" <pierre-antoine@w3.org>, "Ramanathan Guha" <guha@google.com>, "Wendy Seltzer" <wseltzer@w3.org>, "semantic-web" <semantic-web@w3.org>
>> > Subject: Re: Chartering work has started for a Linked Data Signature Working Group @W3C
>> > Hi Dan,
>> >
>> > --
>> > Ivan Herman
>> >
>> > (Written on my iPad. Excuses for brevity and misspellings...)
>> >
>> > On 10 May 2021, at 18:58, Dan Brickley <Danbri@danbri.org> wrote:
>> >
>> > Thanks for reworking the docs based on all of the giant discussions!
>> >
>> > On naming and RDFness, nobody is against pragmatism. The problem is that everyone sees their own preferences as the most pragmatic.
>> >
>> > As you describe it below, W3C here is skating dangerously close to saying that it is drafting this work in such a way as to mislead the management of its Member organizations to such an extent that staff would be assigned to the WG under false pretences, and that a more honestly described workplan would not garner support. Presumably this also applies to AC approval, since it is also the management of W3C member orgs being consulted.
>> >
>> > The pragmatic view in my estimation (and potentially Google’s once we have discussed internally) is that it is better to have these things out in the open before the WG is spawned rather than bickered over expensively afterwards.
>> >
>> > Can you be more specific, so that we can understand what you would propose (taking also into account the constraints that I described below)?
>> >
>> > Quick example to suggest this goes beyond mere naming:
>> >
>> > If the content being signed claims in rdf that
>> >
>> > entityuri1 has prop1 with val2;
>> > and prop2 with val3;
>> > and prop4 with val4...
>> >
>> > RDF goes to extraordinary lengths to make these different triples independent. If you assert them all, you are hard-pressed to say “hey it was all or nothing”. Whereas if you are operating at the JSON level and sign this you could point at e.g. prop4 being “thisRecordTrueUntil” and val4 being “2021”.
>> >
>> > We have barely touched on how the partial RDFness touches on meaning attached to signing; is there potential for mixed expectations here?
>> >
>> > The "out of scope" list in the charter now includes:
>> >
>> > "Authenticity and trust issues of Web (Data) content that go beyond the exchange and the integration of simple factual data expressed in RDF."
>> >
>> > (I guess you will recognize this text). In my view, this covers the situation that you describe. Is there anything specific that you could propose as an additional item in the list?
>> >
>> > In general, it would really be good at this point if we could discuss specific changes on the documents...
>> >
>> > Thanks
>> >
>> > Cheers,
>> >
>> > Ivan
>> >
>> > Dan
>> >
>> > On Mon, 10 May 2021 at 15:08, Ivan Herman <ivan@w3.org> wrote:
>> > (This is not a direct reply on this specific message, but I was not sure on which message in the thread I should hook this :-)
>> >
>> > Dear all,
>> >
>> > thanks for all the discussions.
>> > We (i.e., the proposed co-chairs of the WG, the editors of some of the main input documents, etc.) had a series of discussions and we now have an updated version of the charter and the explainer document:
>> >
>> > https://w3c.github.io/lds-wg-charter/
>> > https://w3c.github.io/lds-wg-charter/explainer.html
>> >
>> > We tried to answer the concerns expressed on this thread by removing some unclear statements, adding some extra explanations to the explainer document, putting certain issues explicitly in the 'out-of-scope' sections, etc.
>> >
>> > On the contentious issue of naming, i.e., Linked Data vs. RDF, we have to be pragmatic. Theoretical purity may require using only the term RDF; the practical reality is that we had feedback from people saying their management may not allow them to participate in the working group if it is perceived as being a pure RDF work, but it is o.k. if the work is on Linked Data. We have to live with that, and have the naming issue discussed on another day. Nevertheless, we tried to come up with a slightly more detailed background in the explainer document (rather than the charter itself; there is a requirement, by the AC members of the W3C, to keep the charter as succinct as possible).
>> >
>> > Thanks again for all the input,
>> >
>> > Ivan
>> >
>> > On 4 May 2021, at 17:55, Dan Brickley <danbri@google.com> wrote:
>> >
>> > On Tue, 4 May 2021 at 15:40, Manu Sporny <msporny@digitalbazaar.com> wrote:
>> > >
>> > > On 5/4/21 10:01 AM, Dan Brickley wrote:
>> > > > For now I'd just add: let's not wait until the WG is chartered before clarifying usecases - the lack of these may be why there's apparently disagreement amongst the work's primary advocates on what is in vs out of scope.
>> > >
>> > > Dan, have you seen the current set of use cases?
>> > >
>> > > https://w3c.github.io/lds-wg-charter/explainer.html#usage
>> >
>> > Yes.
>> > My concern in the original post was that:
>> >
>> > The charter opens as follows:
>> > “There are a variety of established use cases, such as Verifiable Credentials, the publication of biological and pharmaceutical data, consumption of mission critical RDF vocabularies, and others, that depend on the ability to verify the authenticity and integrity of the data being consumed (see the use cases for more examples).”
>> > Currently the charter only alludes hand-wavily to a “variety of established use cases”, and cites its specific “use cases” for “more”.
>> >
>> > ... i.e. those that you're pointing to are additional to presumed widely known usecases, ... they're "more", not the core.
>> >
>> > The first sentence of the charter grounds its importance in terms of "The deployment of Linked Data is increasing at a rapid pace.", and we understand from Ivan that this means the same as "The deployment of RDF is increasing at a rapid pace". It links to http://webdatacommons.org/structureddata/#toc3 which is about "Microdata, RDFa, JSON-LD, and Microformat Data Sets", from public web crawl extractions by the webdatacommons team.
>> >
>> > The charter talks about "Detecting changes in datasets" as a typical usecase. It would be good to tie that to any of the "increasing at a rapid pace" adoption reported in http://webdatacommons.org/structureddata/.
>> >
>> > Consider that for the GS1-related / Product data usecases, Phil seems to see things differently from Manu.
>> >
>> > Phil: "Where I think I seem to have more sympathy than some with Dan's original commentary, is the issue of a fixed/signed dataset containing links to external sources of data and definitions that are not under the signee's control. That is, if my signed RDF dataset includes data expressed using schema:Product, and the definition of schema:Product changes, what value does my signature have now?
>> > This is an issue that I think the WG will need to address - that is, we'll need to set a boundary on what should and should not be inferred by the presence of whatever crypto doo-hickey surrounds the data. IMO, it seems clear that we cannot sign the meaning. ... And there's the irony. We can't sign the semantics in a Semantic Web dataset unless we also retrieve all externally-referenced sources and sign an immutable local copy of those as well (I'm really hoping no one thinks that's a good idea ☹ )"
>> >
>> > Manu: [responding to Dan saying] "> Are we convinced that there is application-level value in having assurances over instance data without also having them for the schemas and ontologies they are underpinned by?"
>> >
>> > Manu: "Yes, I am. Much of the work in Verifiable Credentials utilizes schemas that are cached client-side (usually permanently, and enforced by software). We don't need schemas to adopt the technology for it to be useful. It would be more useful if schema publishing used the technologies, but I don't think anyone is placing that as a MUST along this road (because there is no need to create a dependency there)."
>> >
>> > I am sympathetic to Manu's point that it might take years to see how signing plays out w.r.t. schemas and remote dependencies, and hopefully there is at least some usefulness in having some more building blocks for signed RDF in the meantime. Manu - do you have more pointers to the "schemas cached client-side" approach that's emerging? Is it documented anywhere?
>> >
>> > As Phil says, "if my signed RDF dataset includes data expressed using schema:Product, and the definition of schema:Product changes, what value does my signature have now?".
>> >
>> > Given that the charter speaks also of "the publication of biological and pharmaceutical data", it would be good to have an explicit usecase from that world, and to work through this issue in that domain.
>> > If schema caching and/or signing isn't a concern, that would be good to know. If there are emerging practices, that would also be good to know. The most obvious topic here would be the application of Verifiable Claims to Covid-related "passports", with vaccination records etc. I understand VC is being used in that setting. Is VC for covid vaccination (etc.) blocked in any way by the absence of the proposed work items in this group? Can a usecase be articulated?
>> >
>> > >
>> > > ------------------------
>> > >
>> > > Speaking as one of the Editors of the input specifications... As a related aside, and at the risk of completely derailing this thread, it is possible to use the Linked Data Signatures specification to sign data payloads that are Linked Data but are not RDF.
>> >
>> > Ivan wrote: "I would propose to agree that, for the purpose of this charter and WG, the terms RDF and Linked Data are interchangeable; this is certainly the way the WG intends to pursue its work."
>> >
>> > I am glad we're having this conversation, because it is good to stabilize some terminology (at least for the purpose of this charter/WG, as Ivan says), rather than have the WG be launched on the basis of confusions.
>> >
>> > I am having a hard time imagining how "...that are Linked Data but are not RDF" and "the terms RDF and Linked Data are interchangeable" can be simultaneously true; could we walk through an example in the context of this charter?
>> >
>> > Ivan also wrote, "To further narrow down the discussion, let us also concentrate on what this charter proposes to do. It proposes to provide a standard for the canonicalization of, and to calculate a hash for, an RDF Graph or an RDF Dataset. (There are some additional, say, "engineering" issues like how to express the algorithms and their result in RDF, but that is, comparatively, minor.) That is it."
>> >
>> > If the "Linked Data Signatures specification" is expected to create new W3C technology that is likely applicable outside of RDF, charter reviewers ought to know about it.
>> >
>> > Keeping the gap between the RDF world and everyone else as small as possible makes a lot of sense.
>> >
>> > The most obviously applicable "not an RDF file" artifact we could consider here is out-of-band JSON-LD context definition files. For example, editing Schema.org's context file can cause an unchanged installation of Apache Jena to give different RDF output from byte-for-byte identical input.
>> >
>> > But there may also be use cases that are implementable without the RDF content being canonicalized, or with the canonicalization being at a different level of abstraction (e.g. RDFa-in-HTML content using HTML-level canonicalization). There may be important cases where the OWL level of abstraction is seen as important by some constituencies.
>> >
>> > > The Linked Data Signatures signing algorithm consists of 4 phases:
>> > >
>> > > 1. Canonicalization of input data
>> > > 2. Cryptographic hashing
>> > > 3. Digitally signing
>> > > 4. Expressing the signature
>> > >
>> > > RDF really only comes into play in steps #1 and #4... and it's possible for it to not come into play at all.
>> > >
>> > > For example, you can use JCS[1] to canonicalize in step #1, and simple key-values to express the signature in #4. Workday and Microsoft do this today with one of their Linked Data Cryptosuites.
>> > >
>> > > Now, do I think this is a good idea -- no, I'm not too keen on it; but enabling others to put forward alternatives based upon a standard is useful.
>> > >
>> > > Should the WG prioritize this aspect of Linked Data Signatures -- no, we should get the RDF bits right.
>> > >
>> > > This is why we chose the "Linked Data" moniker... because it's not entirely about RDF...
>> > > we have folks that don't like RDF that do use JSON-LD (and seem to like it).
>> >
>> > Are the folks that don't like RDF expecting to join this WG that is, according to Ivan, entirely devoted to RDF?
>> >
>> > > Saying that the output of the WG is *only* about RDF would alienate a significant part of that community... and it would also be technically incorrect.
>> > >
>> > > Now, all that said -- we should have a razor sharp focus on getting the RDF bits right, because that's what most of the supporters of the Charter need. Simultaneously, we shouldn't do anything to prevent these non-RDF (but still "Linked Data") use cases... and that's the concern w/ stripping all the "Linked Data" language out of the charter.
>> >
>> > +1
>> >
>> > > It does feel like we're all on the same page here wrt. focus -- we don't want a perma-WG... we want something specific that's highly focused.
>> >
>> > Yup - totally agree.
>> >
>> > > Simultaneously, we don't want the future non-RDF stuff to suffer just because people were under the mistaken impression that Linked Data Signatures ONLY works for RDF inputs.
>> >
>> > I am torn - as an RDF technologist, absolutely I see value in having common infrastructure around bnode labeling. And that can be useful without any crypto whatsoever, e.g. as utility functions in software it would be handy. Mixed with crypto it absolutely is interesting, but is there perhaps a piece of work that might be harder because it engages with more groups, which pushes the non-RDF aspects of what's proposed here into a broader W3C space? How far can an RDF-agnostic "just sign the bits" approach be made to work for the usecases W3C cares most about?
>> >
>> > I remember you were keeping an eye on the debates around "Signed HTTP Exchanges" and Web Packaging, for example.
>> > Last I checked in there it wasn't clear there was consensus about browser-UI aspects, but maybe there could be some other common agendas worth exploring? https://github.com/w3c/strategy/issues/171#issuecomment-603280405 etc.
>> >
>> > cheers,
>> >
>> > Dan
>> >
>> > > -- manu
>> > >
>> > > [1] https://tools.ietf.org/html/rfc8785
>> > >
>> > > --
>> > > Manu Sporny - https://www.linkedin.com/in/manusporny/
>> > > Founder/CEO - Digital Bazaar, Inc.
>> > > blog: Veres One Decentralized Identifier Blockchain Launches
>> > > https://tinyurl.com/veres-one-launches
>> >
>> > ----
>> > Ivan Herman, W3C
>> > Home: http://www.w3.org/People/Ivan/
>> > mobile: +33 6 52 46 00 43
>> > ORCID ID: https://orcid.org/0000-0003-0782-2704
>>
>> --
>> Hugh
>> 023 8061 5652
Received on Tuesday, 11 May 2021 12:24:44 UTC
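
For readers who want something concrete, the four phases Manu lists in the quoted thread (canonicalize, hash, sign, express the signature) can be sketched for the non-RDF, JSON-level case he mentions. This is only an illustration under stated assumptions, not the Linked Data Signatures algorithm or any cryptosuite's actual behaviour: sorted-key JSON serialization merely approximates JCS (RFC 8785 has stricter rules for numbers and key ordering), HMAC-SHA256 with a made-up demo key stands in for a real asymmetric digital signature so that the example runs on the Python standard library alone, and every function name and the `proof` layout are hypothetical.

```python
import hashlib
import hmac
import json

# Hypothetical demo key. HMAC is a symmetric stand-in used only so the sketch
# runs on the standard library; a real suite would use an asymmetric
# digital signature scheme instead.
DEMO_KEY = b"demo-key-not-for-real-use"


def canonicalize(doc: dict) -> bytes:
    """Phase 1: deterministic serialization (an approximation of JCS, not RFC 8785 itself)."""
    text = json.dumps(doc, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def digest(canonical: bytes) -> bytes:
    """Phase 2: cryptographic hash of the canonical bytes."""
    return hashlib.sha256(canonical).digest()


def sign(hash_bytes: bytes, key: bytes = DEMO_KEY) -> str:
    """Phase 3: signing (HMAC-SHA256 stand-in, see the note on DEMO_KEY)."""
    return hmac.new(key, hash_bytes, hashlib.sha256).hexdigest()


def attach_proof(doc: dict, key: bytes = DEMO_KEY) -> dict:
    """Phase 4: express the signature as simple key-values next to the data."""
    value = sign(digest(canonicalize(doc)), key)
    proof = {"canonicalization": "sorted-json", "hash": "sha256", "value": value}
    return {**doc, "proof": proof}


def verify(signed: dict, key: bytes = DEMO_KEY) -> bool:
    """Recompute phases 1-3 over everything except the proof and compare."""
    doc = {k: v for k, v in signed.items() if k != "proof"}
    expected = sign(digest(canonicalize(doc)), key)
    return hmac.compare_digest(expected, signed["proof"]["value"])


# The record is signed as one unit, echoing Dan's "all or nothing" JSON-level example.
record = {"id": "entityuri1", "prop1": "val2", "thisRecordTrueUntil": "2021"}
signed = attach_proof(record)
print(verify(signed))
print(verify({**signed, "thisRecordTrueUntil": "2022"}))
```

Because the whole record is serialized and hashed together, changing any single property invalidates the proof, while reordering the keys does not. That is the JSON-level "all or nothing" behaviour Dan contrasts with independently assertable RDF triples; signing an RDF graph or dataset would instead need a graph-level canonicalization in phase 1, which is what the charter proposes to standardize.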