- From: Dan Brickley <danbri@danbri.org>
- Date: Mon, 10 May 2021 19:38:13 +0100
- To: Ivan Herman <ivan@w3.org>
- Cc: Aidan Hogan <aidhog@gmail.com>, Dan Brickley <danbri@google.com>, Manu Sporny <msporny@digitalbazaar.com>, Markus Sabadello <markus@danubetech.com>, Phil Archer <phil.archer@gs1.org>, Pierre-Antoine Champin <pierre-antoine@w3.org>, Ramanathan Guha <guha@google.com>, Wendy Seltzer <wseltzer@w3.org>, semantic-web <semantic-web@w3.org>
- Message-ID: <CAFfrAFpXgHxoCBbyz8FFvAhNuozxsmBrkogZoVsGCa4W-+Z50w@mail.gmail.com>
On Mon, 10 May 2021 at 19:23, Ivan Herman <ivan@w3.org> wrote:

> On 10 May 2021, at 18:58, Dan Brickley <Danbri@danbri.org> wrote:
>
> Thanks for reworking the docs based on all of the giant discussions!
>
> On naming and RDFness, nobody is against pragmatism. The problem is that
> everyone sees their own preferences as the most pragmatic.
>
> As you describe it below, W3C here is skating dangerously close to saying
> that it is drafting this work in such a way as to mislead the management of
> its Member organizations to such an extent that staff would be assigned to
> the WG under false pretences, and that a more honestly described workplan
> would not garner support. Presumably this also applies to AC approval,
> since it is also the management of W3C member orgs being consulted.
>
> The pragmatic view in my estimation (and potentially Google’s once we have
> discussed internally) is that it is better to have these things out in the
> open before the WG is spawned rather than bickered over expensively
> afterwards.
>
> Can you be more specific to understand what you would propose (taking also
> into account the constraints that I described below)?

When you wrote "the practical reality is that we had feedback from people
saying their management may not allow them to participate on the working
group if it is perceived as being a pure RDF work", while also suggesting
the scope is indeed very RDF oriented ("exchange and the integration of
simple factual data expressed in RDF."), it feels like a contradiction best
resolved in the charter-drafting phase, rather than during the WG.

Specifically, if the WG is in fact very much focussed on doing things with
RDF data, anyone (a) staffing it, (b) approving the WG charter, ... ought to
know that. My proposal is simple: not to pretend it's not RDF-centric when
it is, because the pain will only be postponed.
Dan

> Quick example to suggest this goes beyond mere naming:
>
> If the content being signed claims in RDF that
>
> entityuri1 has prop1 with val2;
> and prop2 with val3;
> and prop4 with val4...
>
> RDF goes to extraordinary lengths to make these different triples
> independent. If you assert them all, you are hard-pressed to say “hey it
> was all or nothing”. Whereas if you are operating at the JSON level and
> sign this you could point at e.g. prop4 being “thisRecordTrueUntil” and
> val4 being “2021”.
>
> We have barely touched on how the partial RDFness touches on meaning
> attached to signing; is there potential for mixed expectations here?
>
>
> The "out of scope" list in the charter now includes:
>
> "Authenticity and trust issues of Web (Data) content that go beyond the
> exchange and the integration of simple factual data expressed in RDF."
>
> (I guess you will recognize this text). In my view, this covers the
> situation that you describe. Is there anything specific that you could
> propose as an additional item in the list?
>
> In general, it would really be good at this point if we could discuss
> specific changes on the documents...
>
> Thanks
>
> Cheers,
>
> Ivan
>
> Dan
>
> On Mon, 10 May 2021 at 15:08, Ivan Herman <ivan@w3.org> wrote:
>
>> (This is not a direct reply on this specific message, but I was not sure
>> on which message in the thread I should hook this:-)
>>
>> Dear all,
>>
>> thanks for all the discussions.
>> We (i.e., the proposed co-chairs of the WG, the editors of some of the
>> main input documents, etc.) had a series of discussions and we now have
>> an updated version of the charter and the explainer document:
>>
>> https://w3c.github.io/lds-wg-charter/
>> https://w3c.github.io/lds-wg-charter/explainer.html
>>
>> We tried to answer the concerns expressed on this thread by removing
>> some unclear statements, adding some extra explanations to the explainer
>> document, putting certain issues explicitly in the 'out-of-scope'
>> sections, etc.
>>
>> On the contentious issue of naming, i.e., Linked Data vs. RDF, we have to
>> be pragmatic. Theoretical purity may require using only the term RDF;
>> the practical reality is that we had feedback from people saying their
>> management may not allow them to participate on the working group if it
>> is perceived as being a pure RDF work, but it is o.k. if the work is on
>> Linked Data. We have to live with that, and have the naming issue
>> discussed on another day. Nevertheless, we tried to come up with a
>> slightly more detailed background in the explainer document (rather than
>> the charter itself; there is a requirement, by the AC members of the W3C,
>> to keep the charter as succinct as possible).
>>
>> Thanks again for all the input,
>>
>> Ivan
>>
>>
>> On 4 May 2021, at 17:55, Dan Brickley <danbri@google.com> wrote:
>>
>> On Tue, 4 May 2021 at 15:40, Manu Sporny <msporny@digitalbazaar.com>
>> wrote:
>> >
>> > On 5/4/21 10:01 AM, Dan Brickley wrote:
>> > > For now I'd just add: let's not wait until the WG is chartered before
>> > > clarifying usecases - the lack of these may be why there's apparently
>> > > disagreement amongst the work's primary advocates on what is in vs
>> > > out of scope.
>> >
>> > Dan, have you seen the current set of use cases?
>> >
>> > https://w3c.github.io/lds-wg-charter/explainer.html#usage
>>
>> Yes.
>> My concern in the original post was that:
>>
>> *The charter opens as follows:*
>> *“There are a variety of established use cases, such as Verifiable
>> Credentials <https://www.w3.org/TR/vc-data-model>, the publication of
>> biological and pharmaceutical data, consumption of mission critical RDF
>> vocabularies, and others, that depend on the ability to verify the
>> authenticity and integrity of the data being consumed (see the use cases
>> <https://w3c.github.io/lds-wg-charter/explainer.html#usage> for more
>> examples).”*
>> *Currently the charter only alludes wavily to a “variety of established
>> use cases”, and cites its specific “use cases” for “more”.*
>>
>>
>> ... i.e. those that you're pointing to are additional to presumed widely
>> known usecases, ... they're "more", not the core.
>>
>> The first sentence of the charter grounds its importance in terms of "The
>> deployment of Linked Data is increasing at a rapid pace.", and we
>> understand from Ivan that this means the same as "The deployment of RDF is
>> increasing at a rapid pace". It links to
>> http://webdatacommons.org/structureddata/#toc3 which is about
>> "Microdata, RDFa, JSON-LD, and Microformat Data Sets", from public web
>> crawl extractions by the webdatacommons team.
>>
>> The charter talks about "Detecting changes in datasets" as a typical
>> usecase. It would be good to tie that to any of the "increasing at a rapid
>> pace" adoption reported in http://webdatacommons.org/structureddata/.
>>
>> Consider that for the GS1-related / Product data usecases, Phil seems to
>> see things differently from Manu.
>>
>> Phil: "Where I think I seem to have more sympathy than some with Dan's
>> original commentary, is the issue of a fixed/signed dataset containing
>> links to external sources of data and definitions that are not under the
>> signee's control.
>> That is, if my signed RDF dataset includes data expressed using
>> schema:Product, and the definition of schema:Product changes, what value
>> does my signature have now? This is an issue that I think the WG will
>> need to address - that is, we'll need to set a boundary on what should and
>> should not be inferred by the presence of whatever crypto doo-hickey
>> surrounds the data. IMO, it seems clear that we cannot sign the meaning.
>> ... And there's the irony. We can't sign the semantics in a Semantic Web
>> dataset unless we also retrieve all externally-referenced sources and sign
>> an immutable local copy of those as well (I'm really hoping no one thinks
>> that's a good idea ☹ )"
>>
>> Manu: [responding to Dan saying]"> Are we convinced that there is
>> application-level value in having assurances over instance data without
>> also having them for the schemas and ontologies they are underpinned by?"
>>
>> Manu: "Yes, I am. Much of the work in Verifiable Credentials utilizes
>> schemas that are cached client-side (usually permanently, and enforced
>> by software). We don't need schemas to adopt the technology for it to be
>> useful. It would be more useful if schema publishing used the
>> technologies, but I don't think anyone is placing that as a MUST along
>> this road (because there is no need to create a dependency there)."
>>
>> I am sympathetic to Manu's point that it might take years to see how
>> signing plays out w.r.t. schemas and remote dependencies, and hopefully
>> there is at least some usefulness in having some more building blocks for
>> signed RDF in the meantime. Manu - do you have more pointers to the
>> "schemas cached client-side" approach that's emerging? Is it documented
>> anywhere?
>>
>> As Phil says, "if my signed RDF dataset includes data expressed using
>> schema:Product, and the definition of schema:Product changes, what value
>> does my signature have now?".
>>
>> Given that the charter speaks also of "the publication of biological and
>> pharmaceutical data", it would be good to have an explicit usecase from
>> that world, and to work through this issue in that domain. If schema
>> caching and/or signing isn't a concern, that would be good to know. If
>> there are emerging practices, that would also be good to know. The most
>> obvious topic here would be the application of Verifiable Claims to
>> Covid-related "passports", with vaccination records etc. I understand VC is
>> being used in that setting. Is VC for covid vaccination (etc.) blocked in
>> any way by the absence of the proposed work items in this group? Can a
>> usecase be articulated?
>>
>>
>> >
>> > ------------------------
>> >
>> > Speaking as one of the Editors of the input specifications... As a
>> > related aside, and at the risk of completely derailing this thread, it
>> > is possible to use the Linked Data Signatures specification to sign
>> > data payloads that are Linked Data but are not RDF.
>>
>>
>> Ivan wrote: "I would propose to agree that, for the purpose of this
>> charter and WG, the terms RDF and Linked Data are interchangeable; this is
>> certainly the way the WG intends to pursue its work."
>>
>> I am glad we're having this conversation, because it is good to stabilize
>> some terminology (at least in the purpose of this charter/WG, as Ivan
>> says), rather than have the WG be launched on the basis of confusions.
>>
>> I am having a hard time imagining how "...that are Linked Data but are
>> not RDF" and "the terms RDF and Linked Data are interchangeable" can be
>> simultaneously true; could we walk through an example in the context of
>> this charter?
>>
>> Ivan also wrote, "To further narrow down the discussion, let us also
>> concentrate on what this charter proposes to do. It proposes to provide a
>> standard for the canonicalization of, and to calculate a hash for, an RDF
>> Graph or an RDF Dataset.
>> (There are some additional, say, "engineering" issues like how to express
>> the algorithms and their result in RDF, but that is, comparatively,
>> minor.) That is it."
>>
>> If the "Linked Data Signatures specification" is expected to create new
>> W3C technology that is likely applicable outside of RDF, charter reviewers
>> ought to know about it.
>>
>> Keeping the gap between the RDF world and everyone else as small as
>> possible makes a lot of sense.
>>
>> The most obviously applicable "not an RDF file" artifact we could
>> consider here is out-of-band JSON-LD context definition files. For example,
>> editing Schema.org's context file can cause an unchanged installation of
>> Apache Jena to give different RDF output from byte-for-byte identical
>> input.
>>
>> But there may also be use cases that are implementable without the RDF
>> content being canonicalized, or with the canonicalization being at a
>> different level of abstraction (e.g. RDFa-in-HTML content using HTML-level
>> canonicalization). There may be important cases where the OWL level of
>> abstraction is seen as important by some constituencies.
>>
>>
>> > The Linked Data Signatures signing algorithm consists of 4 phases:
>> >
>> > 1. Canonicalization of input data
>> > 2. Cryptographic hashing
>> > 3. Digitally signing
>> > 4. Expressing the signature
>> >
>> > RDF really only comes into play in steps #1 and #4... and it's possible
>> > for it to not come into play at all.
>> >
>> > For example, you can use JCS[1] to canonicalize in step #1, and simple
>> > key-values to express the signature in #4. Workday and Microsoft do
>> > this today with one of their Linked Data Cryptosuites.
>> >
>> > Now, do I think this is a good idea -- no, I'm not too keen on it; but
>> > enabling others to put forward alternatives based upon a standard is
>> > useful.
>> >
>> > Should the WG prioritize this aspect of Linked Data Signatures -- no, we
>> > should get the RDF bits right.
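The four phases listed above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Linked Data Signatures algorithm itself: `canonicalize` only approximates JCS (RFC 8785) by sorting keys and dropping whitespace, an HMAC stands in for a real digital signature, and the `proof` key and the `DemoHmacSha256` type name are hypothetical.

```python
import hashlib
import hmac
import json


def canonicalize(payload: dict) -> bytes:
    """Phase 1: simplified stand-in for JCS (RFC 8785).

    Sorting keys and removing insignificant whitespace is enough for
    simple payloads; full JCS also pins down number and string
    serialization, which this sketch does not attempt.
    """
    text = json.dumps(payload, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False)
    return text.encode("utf-8")


def digest(canonical: bytes) -> bytes:
    """Phase 2: cryptographic hash of the canonical bytes."""
    return hashlib.sha256(canonical).digest()


def sign(hash_value: bytes, secret: bytes) -> str:
    """Phase 3: HMAC used as a placeholder for a digital signature."""
    return hmac.new(secret, hash_value, hashlib.sha256).hexdigest()


def express(payload: dict, signature: str) -> dict:
    """Phase 4: attach the signature as simple key-values (hypothetical shape)."""
    return {**payload, "proof": {"type": "DemoHmacSha256", "value": signature}}


def sign_document(payload: dict, secret: bytes) -> dict:
    """Run all four phases in order."""
    return express(payload, sign(digest(canonicalize(payload)), secret))


def verify_document(signed: dict, secret: bytes) -> bool:
    """Recompute the signature over everything except the proof."""
    payload = {k: v for k, v in signed.items() if k != "proof"}
    expected = sign(digest(canonicalize(payload)), secret)
    return hmac.compare_digest(expected, signed["proof"]["value"])
```

Because the signature is computed over the canonical form, two payloads that differ only in key order produce the same proof value, while changing any value invalidates it. No RDF is involved at any step, which is the situation described for the JCS-based cryptosuites.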
>> >
>> > This is why we chose the "Linked Data" moniker... because it's not
>> > entirely about RDF... we have folks that don't like RDF that do use
>> > JSON-LD (and seem to like it).
>>
>> Are the folks that don't like RDF expecting to join this WG that is,
>> according to Ivan, entirely devoted to RDF?
>>
>>
>> > Saying that the output of the WG is *only* about RDF would
>> > alienate a significant part of that community... and it would also be
>> > technically incorrect.
>> >
>> > Now, all that said -- we should have a razor sharp focus on getting the
>> > RDF bits right, because that's what most of the supporters of the
>> > Charter need. Simultaneously, we shouldn't do anything to prevent these
>> > non-RDF (but still "Linked Data") use cases... and that's the concern
>> > w/ stripping all the "Linked Data" language out of the charter.
>>
>>
>> +1
>>
>> > It does feel like we're all on the same page here wrt. focus -- we
>> > don't want a perma-WG... we want something specific that's highly
>> > focused.
>>
>> Yup - totally agree.
>>
>> > Simultaneously, we don't want the future non-RDF stuff to suffer just
>> > because people were under the mistaken impression that Linked Data
>> > Signatures ONLY works for RDF inputs.
>>
>> I am torn --- as an RDF technologist, absolutely I see value in having
>> common infrastructure around bnode labeling. And that can be useful without
>> any crypto whatsoever, e.g. as utility functions in software it would be
>> handy. Mixed with crypto it absolutely is interesting, but is there perhaps
>> a piece of work that might be harder because it engages with more groups,
>> which pushes the non-RDF aspects of what's proposed here into a broader W3C
>> space? How far can an RDF-agnostic "just sign the bits" approach be made to
>> work for the usecases W3C cares most about?
>>
>> I remember you were keeping an eye on the debates around "Signed HTTP
>> Exchanges" and Web Packaging, for example.
>> Last I checked in there it wasn't clear there was consensus about
>> browser-UI aspects, but maybe there could be some other common agendas
>> worth exploring?
>> https://github.com/w3c/strategy/issues/171#issuecomment-603280405 etc.
>>
>> cheers,
>>
>> Dan
>>
>> > -- manu
>> >
>> > [1]https://tools.ietf.org/html/rfc8785
>> >
>> > --
>> > Manu Sporny - https://www.linkedin.com/in/manusporny/
>> > Founder/CEO - Digital Bazaar, Inc.
>> > blog: Veres One Decentralized Identifier Blockchain Launches
>> > https://tinyurl.com/veres-one-launches
>> >
>>
>>
>> ----
>> Ivan Herman, W3C
>> Home: http://www.w3.org/People/Ivan/
>> mobile: +33 6 52 46 00 43
>> ORCID ID: https://orcid.org/0000-0003-0782-2704
>>
Received on Monday, 10 May 2021 18:38:40 UTC