Re: Chartering work has started for a Linked Data Signature Working Group @W3C

From: Dan Brickley <danbri@danbri.org>
Date: Tue, 11 May 2021 12:41:56 +0100
Message-ID: <CAFfrAFpAqtXU3OwSv=S2kTUh9KcoMxPW3RH3ob0+MoNs6QPgkQ@mail.gmail.com>
To: Hugh Glaser <hugh@glasers.org>
Cc: Ivan Herman <ivan@w3.org>, Marcel Fröhlich <marcel.frohlich@gmail.com>, lars.svensson@web.de, semantic-web <semantic-web@w3.org>
On Tue, 11 May 2021 at 11:38, Hugh Glaser <hugh@glasers.org> wrote:

> I think Lars was making a much simpler point, but am likely to be wrong.
> :-)
>

Yes! I often make the same point and tend to default to “claim” instead of
“statement”; but then who or what is making the claim? In a schema.org
setting it often works to view the claims as being made in-or-by the
containing page, with further anchoring to humans or orgs being left for
others to investigate.


> Surely none of this is about anything agreed to be "factual" (the OED says
> a "fact" is "a thing that is known or proved to be true.")
> And saying the RDF being signed is factual takes us down a bad road of an
> implication that because something is signed, it has some inherent truth
> property.


Yep - I use “factual data” sometimes as a shorthand for “the kind of data
that expresses facts”. Propositional is another nearby word (
https://www.britannica.com/topic/epistemology/The-other-minds-problem#ref848894)
but let’s not go there.



> It is seductive, and I see Phil says "At the time we signed it, the
> statements were true".
> I would have thought it was more like "At the time we signed it, the
> statements were what we wanted to sign."


Signers will want to know what the W3C specs will imply about their
relationship to the signed material...

Dan


> So "simple statements expressed in RDF" seems more accurate to me.
>
> Best
> Hugh
>
> > On 11 May 2021, at 10:14, Marcel Fröhlich <marcel.frohlich@gmail.com>
> wrote:
> >
> >
> >
> > On Tue, 11 May 2021 at 10:48, Dan Brickley <danbri@danbri.org> wrote:
> > On Tue, 11 May 2021 at 09:29, <lars.svensson@web.de> wrote:
> > (Trimming cc...)
> >
> > Dear all,
> >
> > very interesting discussion!
> >
> > Maybe I'm nitpicking too much, but IMHO the expression "simple factual
> data expressed in RDF" is incorrect. RDF does not express facts but
> statements (facts are true, statements may or may not be true, depending on
> your POV).
> >
> > I suggest replacing that with "simple statements expressed in RDF".
> >
> > I am sympathetic, but this now gets to the heart of the matter. As
> factual data they are state-able, but to claim or state them, we need a
> state-er. How is the party making the statement related to the party
> signing the rdf or dataset? Even the former is nuanced, but rdf datasets
> give an additional level of indirection.
> >
> >
> > +1
> >
> > Yes, ideally there should be a separate part with statements that
> explicitly clarify which claims about the data the signature subscribes
> to, i.e. the relationship between signatory and data.
> >
> > Best, Marcel
> >
> >
> > Best,
> >
> > Lars
> >
> >
> >
> > Sent: Monday, 10 May 2021 at 20:23
> > From: "Ivan Herman" <ivan@w3.org>
> > To: "Dan Brickley" <Danbri@danbri.org>
> > Cc: "Aidan Hogan" <aidhog@gmail.com>, "Dan Brickley" <danbri@google.com>,
> "Manu Sporny" <msporny@digitalbazaar.com>, "Markus Sabadello" <
> markus@danubetech.com>, "Phil Archer" <phil.archer@gs1.org>,
> "Pierre-Antoine Champin" <pierre-antoine@w3.org>, "Ramanathan Guha" <
> guha@google.com>, "Wendy Seltzer" <wseltzer@w3.org>, "semantic-web" <
> semantic-web@w3.org>
> > Subject: Re: Chartering work has started for a Linked Data Signature
> Working Group @W3C
> > Hi Dan,
> >
> > ——
> > Ivan Herman
> >
> > (Written on my iPad. Excuses for brevity and misspellings...)
> >
> > On 10 May 2021, at 18:58, Dan Brickley <Danbri@danbri.org> wrote:
> >
> >
> > Thanks for reworking the docs based on all of the giant discussions!
> >
> > On naming and RDFness, nobody is against pragmatism. The problem is that
> everyone sees their own preferences as the most pragmatic.
> >
> > As you describe it below, W3C here is skating dangerously close to
> saying that it is drafting this work in such a way as to mislead the
> management of its Member organizations to such an extent that staff would
> be assigned to the WG under false pretences, and that a more honestly
> described workplan would not garner support. Presumably this also applies
> to AC approval, since it is also the management of W3C member orgs being
> consulted.
> >
> > The pragmatic view in my estimation (and potentially Google’s once we
> have discussed internally) is that it is better to have these things out in
> the open before the WG is spawned rather than bickered over expensively
> afterwards.
> >
> >
> > Can you be more specific, so that I can understand what you would propose (taking
> also into account the constraints that I described below)?
> >
> > Quick example to suggest this goes beyond mere naming:
> >
> > If the content being signed claims in rdf that
> >
> >  entityuri1 has prop1 with val2;
> >  and prop2 with val3;
> > and prop4 with val4...
> >
> > RDF goes to extraordinary lengths to make these different triples
> independent. If you assert them all, you are hard-pressed to say “hey it was
> all or nothing”. Whereas if you are operating at the JSON level and sign this
> you could point at e.g. prop4 being “thisRecordTrueUntil” and val4 being
> “2021”.
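
A rough sketch of the contrast drawn above, in Python. All names and values are placeholders taken from the example, and the flat "record to triples" mapping is a simplifying assumption, not how any real JSON-LD processor works. It relies on one fact from RDF semantics: a graph entails each of its subgraphs, so the claims without the validity triple still follow from what was asserted.

```python
import hashlib
import json

# Hypothetical record mirroring the example above; the property names are
# placeholders, not terms from any real vocabulary.
record = {
    "@id": "entityuri1",
    "prop1": "val2",
    "prop2": "val3",
    "thisRecordTrueUntil": "2021",
}


def to_triples(doc):
    """Flatten the record into a set of independent (s, p, o) triples."""
    subject = doc["@id"]
    return frozenset((subject, p, o) for p, o in doc.items() if p != "@id")


def json_digest(doc):
    """Hash the record as one opaque document (sorted keys, compact form)."""
    payload = json.dumps(doc, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


graph = to_triples(record)

# Dropping the validity triple leaves a subgraph, and in RDF a graph
# entails each of its subgraphs: the remaining claims still follow.
without_expiry = frozenset(t for t in graph if t[1] != "thisRecordTrueUntil")
subgraph_follows = without_expiry <= graph

# At the JSON level the same edit yields a different document and digest,
# so "all or nothing" is at least mechanically checkable there.
trimmed = {k: v for k, v in record.items() if k != "thisRecordTrueUntil"}
same_json_digest = json_digest(record) == json_digest(trimmed)

print(subgraph_follows, same_json_digest)
```

A signature over a canonicalized graph would of course also break if a triple were removed; the sketch only illustrates why the meaning of the signed triples does not carry the same "whole record" reading.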
> >
> > We have barely touched on how the partial RDFness affects the meaning
> attached to signing. Is there potential for mixed expectations here?
> >
> > The "out of scope" list in the charter now includes:
> >
> > "Authenticity and trust issues of Web (Data) content that go beyond the
> exchange and the integration of simple factual data expressed in RDF."
> >
> > (I guess you will recognize this text). In my view, this covers the
> situation that you describe. Is there anything specific that you could
> propose as an additional item in the list?
> >
> > In general, it would really be good at this point if we could discuss
> specific changes on the documents...
> >
> > Thanks
> >
> > Cheers,
> >
> > Ivan
> >
> >
> >
> > Dan
> >
> > On Mon, 10 May 2021 at 15:08, Ivan Herman <ivan@w3.org> wrote:
> > (This is not a direct reply on this specific message, but I was not sure
> on which message in the thread I should hook this:-)
> >
> > Dear all,
> >
> > thanks for all the discussions. We (i.e., the proposed co-chairs of
> the WG, the editors of some of the main input documents, etc) had a series
> of discussions and we have now an updated version of the charter and the
> explainer document:
> >
> > https://w3c.github.io/lds-wg-charter/
> > https://w3c.github.io/lds-wg-charter/explainer.html
> >
> > We tried to address the concerns expressed on this thread by removing
> some unclear statements, adding some extra explanations to the explainer
> document, putting certain issues explicitly in the 'out-of-scope' sections,
> etc.
> >
> > On the contentious issue of naming, i.e., Linked Data vs. RDF, we have to
> be pragmatic. Theoretical purity may require using only the term
> RDF; the practical reality is that we had feedback from people saying
> their management may not allow them to participate in the working group if
> it is perceived as being pure RDF work, but it is o.k. if the work is on
> Linked Data. We have to live with that, and have the naming issue discussed
> on another day. Nevertheless, we tried to come up with a slightly more
> detailed background in the explainer document (rather than the charter
> itself; there is a requirement, by the AC members of the W3C, to keep the
> charter as succinct as possible).
> >
> > Thanks again for all the input,
> >
> > Ivan
> >
> >
> >
> >
> > On 4 May 2021, at 17:55, Dan Brickley <danbri@google.com> wrote:
> >
> > On Tue, 4 May 2021 at 15:40, Manu Sporny <msporny@digitalbazaar.com>
> wrote:
> > >
> > > On 5/4/21 10:01 AM, Dan Brickley wrote:
> > > > For now I'd just add: let's not wait until the WG is chartered before
> > > > clarifying usecases - the lack of these may be why there's apparently
> > > > disagreement amongst the work's primary advocates on what is in vs
> out of
> > > > scope.
> > >
> > > Dan, have you seen the current set of use cases?
> > >
> > > https://w3c.github.io/lds-wg-charter/explainer.html#usage
> >
> > Yes. My concern in the original post was that:
> >
> > The charter opens as follows:
> > “ There are a variety of established use cases, such as Verifiable
> Credentials, the publication of biological and pharmaceutical data,
> consumption of mission critical RDF vocabularies, and others, that depend
> on the ability to verify the authenticity and integrity of the data being
> consumed (see the use cases for more examples).”
> > Currently the charter only alludes wavily to a “variety of established
> use cases”, and cites its specific “use cases” for “more”.
> >
> >
> > ... i.e. those that you're pointing to are additional to presumed widely
> known usecases, ... they're "more", not the core.
> >
> > The first sentence of the charter grounds its importance in terms of
> "The deployment of Linked Data is increasing at a rapid pace.", and we
> understand from Ivan that this means the same as "The deployment of RDF is
> increasing at a rapid pace". It links to
> http://webdatacommons.org/structureddata/#toc3 which is about "Microdata,
> RDFa, JSON-LD, and Microformat Data Sets", from public web crawl
> extractions by the webdatacommons team.
> >
> > The charter talks about "Detecting changes in datasets" as a typical
> usecase. It would be good to tie that to any of the "increasing at a rapid
> pace" adoption reported in http://webdatacommons.org/structureddata/.
> >
> > Consider that for the GS1-related / Product data usecases, Phil seems to
> see things differently from Manu.
> >
> > Phil: "Where I think I seem to have more sympathy than some with Dan's
> original commentary, is the issue of a fixed/signed dataset containing
> links to external sources of data and definitions that are not under the
> signee's control. That is, if my signed RDF dataset includes data expressed
> using schema:Product, and the definition of schema:Product changes, what
> value does my signature have now? This is an issue that I think the WG will
> need to address - that is, we'll need to set a boundary on what should and
> should not be inferred by the presence of whatever crypto doo-hickey
> surrounds the data. IMO, it seems clear that we cannot sign the meaning.
> ... And there's the irony. We can't sign the semantics in a Semantic Web
> dataset unless we also retrieve all externally-referenced sources and sign
> an immutable local copy of those as well (I'm really hoping no one thinks
> that's a good idea ☹ )"
> >
> > Manu: [responding to Dan saying]"> Are we convinced that there is
> application-level value in having assurances over instance data without
> also having them for the schemas and ontologies they are underpinned by?"
> >
> > Manu: "Yes, I am. Much of the work in Verifiable Credentials utilize
> schemas that are cached client-side (usually permanently, and enforced by
> software). We don't need schemas to adopt the technology for it to be
> useful. It would be more useful if schema publishing used the technologies,
> but I don't think anyone is placing that as a MUST along this road (because
> there is no need to create a dependency there)."
> >
> > I am sympathetic to Manu's point that it might take years to see how
> signing plays out w.r.t. schemas and remote dependencies, and hopefully
> there is at least some usefulness in having some more building blocks for
> signed RDF in the meantime. Manu - do you have more pointers to the
> "schemas cached client-side" approach that's emerging? Is it documented
> anywhere?
> >
> > As Phil says, " if my signed RDF dataset includes data expressed using
> schema:Product, and the definition of schema:Product changes, what value
> does my signature have now?".
> >
> > Given that the charter also speaks of "the publication of biological and
> pharmaceutical data", it would be good to have an explicit usecase from
> that world, and to work through this issue in that domain. If schema
> caching and/or signing isn't a concern, that would be good to know. If
> there are emerging practices, that would also be good to know.  The most
> obvious topic here would be the application of Verifiable Credentials to
> Covid-related "passports", with vaccination records etc. I understand VC is
> being used in that setting. Is VC for covid vaccination (etc.) blocked in
> any way by the absence of the proposed work items in this group? Can a
> usecase be articulated?
> >
> >
> >
> > >
> > > ------------------------
> > >
> > > Speaking as one of the Editors of the input specifications... As a
> related
> > > aside, and at the risk of completely derailing this thread, it is
> possible to
> > > use the Linked Data Signatures specification to sign data payloads
> that are
> > > Linked Data but are not RDF.
> >
> >
> > Ivan wrote: "I would propose to agree that, for the purpose of this
> charter and WG, the terms RDF and Linked Data are interchangeable; this is
> certainly the way the WG intends to pursue its work."
> >
> > I am glad we're having this conversation, because it is good to
> stabilize some terminology (at least for the purpose of this charter/WG, as
> Ivan says), rather than have the WG be launched on the basis of confusions.
> >
> > I am having a hard time imagining how "...that are Linked Data but are
> not RDF" and "the terms RDF and Linked Data are interchangeable" can be
> simultaneously true; could we walk through an example in the context of
> this charter?
> >
> > Ivan also wrote, "To further narrow down the discussion, let us also
> concentrate on what this charter proposes to do. It proposes to provide a
> standard for the canonicalization of, and to calculate a hash for, an RDF
> Graph or an RDF Dataset. (There are some additional, say, "engineering"
> issues like how to express the algorithms and their result in RDF, but that
> is, comparatively, minor.) That is it."
> >
> > If the "Linked Data Signatures specification" is expected to create new
> W3C technology that is likely applicable outside of RDF, charter reviewers
> ought to know about it.
> >
> > Keeping the gap between the RDF world and everyone else as small as
> possible makes a lot of sense.
> >
> > The most obviously applicable "not an RDF file" artifact we could
> consider here is out-of-band JSON-LD context definition files. For example,
> editing Schema.org's can cause an unchanged installation of Apache Jena to
> give different RDF output from byte-for-byte identical input.
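
To make the context-file point concrete, here is a toy Python sketch. It is not a JSON-LD processor: the `to_triples` helper, the two context revisions and the example.org IRIs are all invented for illustration, and the only JSON-LD behaviour it imitates is that a term is expanded to whatever IRI the context assigns it (and dropped if the context does not define it and no default vocabulary is set).

```python
import json

# Byte-for-byte identical input in both runs.
DOCUMENT = '{"@id": "http://example.org/item1", "name": "Widget"}'

# Two hypothetical revisions of an out-of-band context file.
CONTEXT_V1 = {"name": "http://example.org/vocab/name"}
CONTEXT_V2 = {"name": "http://example.org/vocab/title"}


def to_triples(document_text, context):
    """Toy term expansion: map each key to the IRI the context assigns it.

    Keys the context does not define are dropped. Everything else a real
    processor handles (nesting, @type, @vocab, compact IRIs) is ignored.
    """
    doc = json.loads(document_text)
    subject = doc["@id"]
    return {
        (subject, context[key], value)
        for key, value in doc.items()
        if key != "@id" and key in context
    }


triples_v1 = to_triples(DOCUMENT, CONTEXT_V1)
triples_v2 = to_triples(DOCUMENT, CONTEXT_V2)

# Same input bytes, different graphs: only the context changed.
print(triples_v1 == triples_v2)
```

Under these assumptions, a hash over the document bytes stays stable across the two revisions while a hash over the resulting triples would not, which is why the context file itself is a candidate for signing.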
> >
> > But there may also be use cases that are implementable without the RDF
> content being canonicalized, or with the canonicalization being at a
> different level of abstraction (e.g. RDFa-in-HTML content using HTML-level
> canonicalization). There may be important cases where the OWL level of
> abstraction is seen as important by some constituencies.
> >
> >
> > > The Linked Data Signatures signing algorithm consists of 4 phases:
> > >
> > > 1. Canonicalization of input data
> > > 2. Cryptographic hashing
> > > 3. Digitally signing
> > > 4. Expressing the signature
> > >
> > > RDF really only comes into play in steps #1 and #4... and it's
> possible for it
> > > to not come into play at all.
> > >
> > > For example, you can use JCS[1] to canonicalize in step #1, and simple
> > > key-values to express the signature in #4. Workday and Microsoft do
> this today
> > > with one of their Linked Data Cryptosuites.
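
The four phases can be sketched in Python roughly as follows, for the non-RDF path described above. This is an illustration under stated assumptions, not the Linked Data Signatures algorithm or any real cryptosuite: the canonicalization only approximates JCS (sorted keys, compact separators) and does not implement RFC 8785, an HMAC stands in for a real digital signature because the standard library has no asymmetric signing, and the `proof`, `type` and `signatureValue` keys are invented names.

```python
import hashlib
import hmac
import json

# Placeholder key; a real suite would use an asymmetric key pair.
DEMO_KEY = b"demo-key-not-for-real-use"


def canonicalize(data):
    """Phase 1: deterministic bytes (approximates JCS, not RFC 8785 itself)."""
    text = json.dumps(data, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False)
    return text.encode("utf-8")


def digest(canonical_bytes):
    """Phase 2: cryptographic hash of the canonical form."""
    return hashlib.sha256(canonical_bytes).digest()


def sign(hash_bytes, key):
    """Phase 3: HMAC used as a stand-in for a digital signature."""
    return hmac.new(key, hash_bytes, hashlib.sha256).hexdigest()


def express(data, signature_value):
    """Phase 4: attach the signature as simple key-values (names invented)."""
    proof = {"type": "DemoHmacSketch", "signatureValue": signature_value}
    return {**data, "proof": proof}


def sign_document(data, key=DEMO_KEY):
    """Run the four phases in order on a JSON-like dict."""
    return express(data, sign(digest(canonicalize(data)), key))


def verify_document(signed, key=DEMO_KEY):
    """Recompute phases 1-3 over everything except the proof and compare."""
    payload = {k: v for k, v in signed.items() if k != "proof"}
    expected = sign(digest(canonicalize(payload)), key)
    return hmac.compare_digest(expected, signed["proof"]["signatureValue"])


signed = sign_document({"b": 1, "a": "x"})
print(verify_document(signed))
```

Swapping `canonicalize` for an RDF dataset canonicalization and `express` for an RDF encoding of the proof would give the RDF path; phases 2 and 3 would be unchanged, which matches the observation that RDF only enters in steps #1 and #4.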
> > >
> > > Now, do I think this is a good idea -- no, I'm not too keen on it; but
> > > enabling others to put forward alternatives based upon a standard is
> useful.
> > >
> > > Should the WG prioritize this aspect of Linked Data Signatures -- no,
> we
> > > should get the RDF bits right.
> > >
> > > This is why we chose the "Linked Data" moniker... because it's not
> entirely
> > > about RDF... we have folks that don't like RDF that do use JSON-LD
> (and seem
> > > to like it).
> >
> > Are the folks that don't like RDF expecting to join this WG that is,
> according to Ivan, entirely devoted to RDF?
> >
> >
> > > Saying that the output of the WG is *only* about RDF would
> > > alienate a significant part of that community... and it would also be
> > > technically incorrect.
> > >
> > > Now, all that said -- we should have a razor sharp focus on getting
> the RDF
> > > bits right, because that's what most of the supporters of the Charter
> need.
> > > Simultaneously, we shouldn't do anything to prevent these non-RDF (but
> still
> > > "Linked Data") use cases... and that's the concern w/ stripping all the
> > > "Linked Data" language out of the charter.
> >
> >
> > +1
> >
> > > It does feel like we're all on the same page here wrt. focus -- we
> don't want
> > > a perma-WG... we want something specific that's highly focused.
> >
> > Yup - totally agree.
> >
> > > Simultaneously, we don't want the future non-RDF stuff to suffer just
> because
> > > people were under the mistaken impression that Linked Data Signatures
> ONLY
> > > works for RDF inputs.
> >
> > I am torn --- as an RDF technologist, absolutely I see value in having
> common infrastructure around bnode labeling. And that can be useful without
> any crypto whatsoever, e.g. as utility functions in software it would be
> handy. Mixed with crypto it absolutely is interesting, but is there perhaps
> a piece of work that might be harder because it engages with more groups,
> which pushes the non-RDF aspects of what's proposed here into a broader W3C
> space? How far can an RDF-agnostic "just sign the bits" approach be made to
> work for the usecases W3C cares most about?
> >
> > I remember you were keeping an eye on the debates around "Signed HTTP
> Exchanges" and Web Packaging, for example. Last I checked in there it
> wasn't clear there was consensus about browser-UI aspects, but maybe there
> could be some other common agendas worth exploring?
> https://github.com/w3c/strategy/issues/171#issuecomment-603280405 etc.
> >
> > cheers,
> >
> > Dan
> >
> > > -- manu
> > >
> > > [1]https://tools.ietf.org/html/rfc8785
> > >
> > > --
> > > Manu Sporny - https://www.linkedin.com/in/manusporny/
> > > Founder/CEO - Digital Bazaar, Inc.
> > > blog: Veres One Decentralized Identifier Blockchain Launches
> > > https://tinyurl.com/veres-one-launches
> > >
> >
> >
> >
> > ----
> > Ivan Herman, W3C
> > Home: http://www.w3.org/People/Ivan/
> > mobile: +33 6 52 46 00 43
> > ORCID ID: https://orcid.org/0000-0003-0782-2704
>
> --
> Hugh
> 023 8061 5652
>
>
>
Received on Tuesday, 11 May 2021 11:43:24 UTC
