Re: Chartering work has started for a Linked Data Signature Working Group @W3C

From: Guha <guha@google.com>
Date: Tue, 11 May 2021 11:19:35 -0700
Message-ID: <CAPAGhv_J+jpJiS5wqWQ33uNwpBkFrYYyqxxQJcx+YTnHHS60qQ@mail.gmail.com>
To: Melvin Carvalho <melvincarvalho@gmail.com>
Cc: Dan Brickley <danbri@google.com>, Phil Archer <phil.archer@gs1.org>, Dan Brickley <danbri@danbri.org>, Ivan Herman <ivan@w3.org>, Aidan Hogan <aidhog@gmail.com>, Manu Sporny <msporny@digitalbazaar.com>, Markus Sabadello <markus@danubetech.com>, Pierre-Antoine Champin <pierre-antoine@w3.org>, Wendy Seltzer <wseltzer@w3.org>, semantic-web <semantic-web@w3.org>
Ah, this discussion brings back so many memories of 1997 ...

RDF is not a programming language. Please don't use it to do arithmetic.


On Tue, May 11, 2021 at 10:03 AM Melvin Carvalho <melvincarvalho@gmail.com> wrote:

> On Tue, 11 May 2021 at 12:37, Dan Brickley <danbri@google.com> wrote:
>> On Tue, 11 May 2021 at 10:45, Phil Archer <phil.archer@gs1.org> wrote:
>>> Hi Dan,
>>> Let me see if I can help here (at the enormous risk of making matters
>>> worse).
>>> On 'partial RDFness', that is, signing something that explicitly depends
>>> on external resources beyond the control of the signer. I agree.
>> Great that we agree, but I actually meant something else by 'partial
>> RDFness' here. In the CSVW WG we called it also "half-hearted RDF-ness"
>> once or twice.
>> The point: this isn't just about letting RDF people have a triples/graphs
>> reading of some useful instance data. It is about W3C's REC-track claims
>> that being RDF data wholeheartedly rather than coincidentally includes
>> constraints on how the truth of the whole (graph/description) relates
>> to the truth of its constituent claims.
>> In situations where the data is fully RDF (and for simplicity let's
>> assume well known stable URIs everywhere in the graph), if the graph
>> consists of n triples t-1, t-2, t-3, ..., t-n, and I claim the
>> description is an accurate description, then I am standing equally behind
>> each triple independently. This brings with it the assurance that additional
>> triples, e.g. t-23236, can't qualify or undermine the others. If I assert G
>> describes the world accurately, I am saying "t-1 and t-2 and t-3 and
>> t-4 ... t-n" describe the world accurately.
>> If t-23236 says (of whatever entity / URI) "trueUntil": "Thursday", ...
>> "foo": "bar", or "pa12bg12f1g12c2": "FALSE", ... such triples have no
>> ability to pollute the rest of the graph or make it unclear whether an
>> asserter of the graph has really asserted t-1, t-2, t-3, ...
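A minimal sketch of the "AND-ed independent triples" reading described above, assuming a toy model in which a graph is just a Python set of (subject, predicate, object) tuples. The `ex:` names and the `asserts` helper are hypothetical and exist only for illustration; this is not how any particular RDF library represents graphs.

```python
# Toy model (an assumption for illustration): a graph is a set of
# (subject, predicate, object) tuples, and asserting the graph means
# asserting every triple in it independently.

def asserts(graph, triple):
    """True if asserting `graph` commits the asserter to `triple`."""
    return triple in graph

g = {
    ("ex:product", "ex:name", "Widget"),
    ("ex:product", "ex:containsNuts", "false"),
}

# Adding a qualifier-looking triple does not retract or weaken the others:
# the extended graph still asserts every triple of the original graph.
g_extended = g | {("ex:product", "ex:trueUntil", "Thursday")}

still_asserted = all(asserts(g_extended, t) for t in g)
print(still_asserted)  # True
```

In this reading the "trueUntil" triple is simply one more independent claim; nothing in the model lets it scope or negate its neighbours.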
>> RDF comes with other baggage/features too. It isn't a part of RDF for
>> schema authors to be able to say "if you don't write consent=false, then
>> consent=true is implied". Parties who create arbitrary JSON specs, or
>> manage SQL database schemas, may sometimes do this kind of thing.
>> Many non-RDF formats don't work this way. CSV doesn't work this way. JSON
>> doesn't work this way. XML doesn't work this way. RDF has a distinctively
>> atomized sense of meaning, breaking complex claims down to triples.
>> Sometimes this works well, sometimes (cf. the OWL design) even semantic web
>> folks find it annoying and obtuse.
>> There are W3C languages like the (largely failed) GRDDL work, which map
>> from non-RDF XML formats into RDF graphs. The creators of the pre-RDF
>> instance data couldn't be accused of even knowing what RDF is, never mind
>> being committed to its semantics. There are others like Turtle that are more
>> obviously fully on board the RDF train.
>> The drafting around this WG seems to lean towards JSON-LD, where there is
>> some perceived ambivalence towards aspects of RDF (hi Manu!:)
>> Citing again http://manu.sporny.org/2014/json-ld-origins-2/
>> "Kick RDF in the Nuts RDF is a shitty data model."
>> "I personally wanted JSON-LD to be compatible with RDF, but that’s about
>> it. You could convert JSON-LD to and from RDF and get something useful, but
>> JSON-LD had a more sane data model where lists were a first-class
>> construct, you had generalized graphs, and you could use JSON-LD using a
>> simple library and standard JSON tooling. "
> RDF is simply a data model that has trade-offs. It forces everything to
> be a Set, which makes merges elegant and cheap, but some other operations
> fiendishly difficult.
> Most advanced users don't know how arrays are structured in RDF, and a
> modern expectation of programmers is that you must have arrays.
> Worse, try doing something simple in RDF like 2 + 1 = 3.
> You're in for a whole world of pain.
> <#alice> <#has> 2 .
> Let's increment that.  Oh wait, no now you get
> <#alice> <#has> 2,3 .
> Well then you delete the 2 first and then insert the 3.  Then a few months
> later you realize you have a race condition and need an atomic update,
> which is pretty much out of scope for most implementations.
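A small sketch of the increment problem described above, assuming a toy triple store that is nothing more than a Python set of tuples, starting from the value 2 in the first triple. The `has_values` and `increment` helpers are hypothetical, and the delete-then-insert update shown is deliberately not atomic.

```python
# Toy triple store (illustrative assumption, not a real RDF library):
# a graph is a Python set of (subject, predicate, object) tuples.

def has_values(g, subject="#alice", predicate="#has"):
    """All objects currently stored for (subject, predicate), sorted."""
    return sorted(o for (s, p, o) in g if (s, p) == (subject, predicate))

graph = {("#alice", "#has", 2)}

# Naive "increment": inserting the new value does not replace the old one,
# because a set of triples has no notion of overwriting.
naive = set(graph)
naive.add(("#alice", "#has", 3))
print(has_values(naive))  # [2, 3]

def increment(g, subject, predicate):
    """Delete-then-insert update. Assumes exactly one existing value and
    is NOT atomic: another writer could interleave between the two steps."""
    (old,) = [t for t in g if t[:2] == (subject, predicate)]
    g.remove(old)
    g.add((subject, predicate, old[2] + 1))

increment(graph, "#alice", "#has")
print(has_values(graph))  # [3]
```

Making the two steps in `increment` safe under concurrent writers is the part that needs a transaction or lock, which is outside this sketch.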
> Point here is that RDF has some really nice use cases, but it's not the
> lingua franca that was pushed by so many, and simply is a system that offers
> trade-offs, with some opinionated parts like datatypes and lang tags, and
> other bits.
> It also carries all that XML technical debt from 2002 and before.
> What IMHO is better is JSON with hyperlinks.  Everyone knows it, everyone
> can use it.  Does all things well.  You could even use the turtle syntax
> <./foo> in a literal and transpile it to JSON-LD, make keys/predicates as
> URIs optional, allow arrays.  And have the tooling extract triples/quads
> from the JSON blob, and ignore the bits it didn't understand.  You could
> even give a default prefix like urn:json to keys that didn't have a URI.
> The signing thing becomes canonicalizing JSON and adding a signature to the
> payload.  Lots of people would use that; it works with schema.org and with
> all the JSON APIs out there.  Add a @view tag and you can have nice widgets
> like with adaptivecards.  Extend @context to have some transform stuff that
> changes one shape to another, a middle ground between RDFS and RIF.
> Pitch RDF as a useful tool at web scale, for merging data sets, and in the
> enterprise, for search engine stuff.  Show how things like indexing, and
> follow your nose, quad, open world assumption can make powerful apps.  Then
> show how it works with JSON, and appreciate JSON on its own merits.
> With that approach you can add signatures to both systems, and I think
> there'd probably be an audience ...
> Just my 2 cents ...
>> This is a legitimate point of view. JSON-LD is defined by its W3C
>> specifications and to some extent by the pragmatics of how it is actually
>> used, rather than the aggregate of the opinions of its creators and spec
>> editors. But it shines a light on whether this WG is on board with what W3C
>> claims RDF data structures mean, when considered to be sets of statements
>> about the world.
>> Another example that crosses back into Phil's point here, is around
>> unique identification in the face of bnodes using reference-by-description
>> constructions. OWL provides a notion of inverse functional property, so
>> that you can use property values as indirect identifiers. If the instance
>> data is understood to be OWL written in RDF written in JSON-LD, readers of
>> the claims could read some data as "*the* x whose foo is bar", e.g.
>> '*the* Language whose bcp47 code is "he"'. The RDF layer doesn't provide
>> the reading of this as saying '*the*'; viewed purely as RDF triples it'd
>> be more like saying '*a* Language whose bcp47 code is "he"'. And to
>> Phil's point below, even the claim that the bcp47 property is an
>> owl:InverseFunctionalProperty would depend on actually having access to the
>> (possibly evolving) remote schema. Not to mention knowing or caring what
>> the associated OWL bits of it mean.
>> My point here is that W3C's problem is not just getting RDF-skeptics
>> to approve or tolerate the group, but that even if you're on board with
>> RDF, it's a slippery slope (or a majestic ascent) to things like OWL, which
>> bring more expressivity to things which ultimately might be written in
>> triples that are encoded in JSON and signed via bnode labelling.
>> Saying "it's RDF" and "it's signed" --- what does that mean in practice
>> about the stuff being signed? Presumably something more than "it could be
>> loaded into an RDF database". But what exactly? Is the atomicity of RDF
>> graphs being a bunch of AND-ed independent statements part of the picture,
>> but OWL not? (unless a bunch of OWL folks join the WG?). Or is the RDF-ness
>> of the group purely "you could convert to and from RDF and get something
>> useful"?
>>> And I believe everyone else does too. That is, if I have a bunch of quads
>>> that use property ex:foo, but I don't control ex: then, clearly, there is a
>>> boundary on the integrity of what I have signed. How we tackle that will be
>>> for the WG to decide but my expectation is that signatures/proofs will be
>>> timestamped and the relying party will have to judge whether or not they
>>> trust the controllers of ex: sufficiently that my signature counts for
>>> anything. If ex: is under SDO-level change management, such as
>>> schema.org, a vocab on w3.org or, if you'll allow me, the GS1 Web Voc,
>>> the relying party may well trust it - but I agree, there needs to be some
>>> sort of explicit flag that says "this signature/proof was made at time/date
>>> X, any change outside this data since then may or may not render this
>>> meaningless but we trust those external parties sufficiently to sign this".
>>> But we're talking hypothetically here. Let's try and think of a real
>>> world scenario. Suppose a manufacturer signs some data today that includes
>>> a triple like
>>> <ex:Product> <gs1:allergenStatement> "Does not contain nuts".
>> https://www.gs1.org/voc/allergenStatement rdfs:comment "Textual
>> description of the presence or absence of allergens as governed by local
>> rules and regulations, specified as one string."
>>> That depends on a term in the GS1 Web Voc that, yes, I can change and the
>>> manufacturer can't. But I won't because we have a change management process
>>> that says we won't make any change without consultation with our community,
>>> and a general policy of not breaking stuff if we can help it
>> https://web.archive.org/web/20160718151441/https://www.gs1.org/voc/allergenStatement
>> confirms it hasn't been changed yet. Not clear what "local rules and
>> regulations" means, but it has always been so afaik.
>>> The manufacturer, i.e. the signer, and the relying party can use their
>>> judgement on this. Therefore I suggest that in this business scenario, the
>>> signature/proof has meaning.
>>> So now we need a counter example. OK, so the triple becomes
>>> <ex:Product> <badActor:allergenStatement> "Does not contain nuts"
>>> Here, a manufacturer (ex: ) has used the badActor:allergenStatement
>>> predicate. Two minutes after the manufacturer signs that statement,
>>> badActor changes its definition to say "all values of this property are the
>>> inverse of the truth." What value does that signature/proof have then? No
>>> more and no less than it did. At the time we signed it, the statements were
>>> true.
>> Earlier in this thread I asked what the impact might be on schema
>> maintainers - and you can imagine my check via archive.org becoming more
>> institutionalized through our community practices - e.g. facilitated at
>> projects like LOV, or W3C or Internet Archive. But I won't go on about this
>> as my original point was the one above about rdf-ness, not about the
>> external dependencies aspect of signature semantics.
>> The issue isn't the naming of the WG, it's the expectations that can come
>> with saying "this is RDF", and doing that before people start jumping on
>> (post-pandemic) planes to fly around to WG meetings to discover their
>> colleagues in the WG have radically different expectations...
>> Dan
>>> Here the signer has placed their trust in badActor:. That's probably a
>>> really dumb thing for them to have done and the relying party will need to
>>> assess whether *they* trust badActor and, by extension, the manufacturer.
>>> This *is* an issue the WG will need to tackle. I can imagine that a
>>> signature/proof *may* have a placeholder where external dependencies are
>>> listed, for example - but we're way ahead of ourselves here and I cannot
>>> predict what others will deem a good idea.
>>> As for signing individual statements, well, that's something we might
>>> want to talk to the RDF* folks about. I recall many lively conversations
>>> over the years about the immutability of RDF statements and whether all
>>> assertions exist with equal certainty until the heat death of the universe.
>>> Clearly they don't - hence RDF*.
>>> I've talked a lot here about RDF, triples and quads. The proposed
>>> charter talks about RDF and Linked Data. The first mention of the term
>>> Linked Data (after the WG's title) links to
>>> https://www.w3.org/standards/semanticweb/data that opens with:
>>> "The Semantic Web is a Web of Data — of dates and titles and part
>>> numbers and chemical properties and any other data one might conceive of.
>>> The collection of Semantic Web technologies (RDF, OWL, SKOS, SPARQL, etc.)
>>> provides an environment where application can query that data, draw
>>> inferences using vocabularies, etc."
>>> That's not shy about using the term RDF. The proposed WG's mission
>>> statement also cites RDF Dataset Canonicalization, concrete RDF syntaxes
>>> and more. For me, it's pretty clear that if you/your employer has
>>> antibodies against RDF - and we all know that many such people and
>>> organisations exist - then this WG is not for you.
>>> But as I said last time in this thread, IMO the term Linked Data has
>>> evolved, at least in the way it's used in business-oriented discussions. In
>>> my work we talk about "Linksets", "links to other sources of data", and
>>> abuse the word "semantic" at every turn. I have a fortnightly meeting in my
>>> calendar called "Moving towards a GS1 Semantic". Colleagues create "data
>>> models" in Excel. I'm an atheist but I recite the serenity prayer daily [1].
>>> The point being that Linked Data Signatures is well named. It clearly
>>> *is* in the RDF/Semantic Web camp, but it has elements in it that will
>>> allow us to talk about the work in a broader, less-technical, more
>>> business-focused environment. When I and one or two others talk about
>>> Linked Data at GS1 it's understood to mean decentralized data, the Web of
>>> data, silo-busting etc. We can use it with confidence. So Linked Data
>>> Integrity and Linked Data Security Vocabulary are terms that have meaning.
>>> Those audiences know and accept that there are important technical details
>>> that need to be addressed - that's the RDF bit.
>>> Talk of RDF Dataset Canonicalization etc. will, inevitably, limit
>>> membership of the WG. But that's no different from any other WG. For
>>> example, my organisation has no interest in participating in WGs that
>>> define the Web as experienced in the browser. So knock
>>> yourself out CSS WG, Pointer Events, SVG, Web Applications and all the
>>> others. We're glad you're there but we'll leave that stuff to you.
>>> LD and RDF are both terms in common usage. Rightly or wrongly, we use
>>> them interchangeably. Maybe we should put (sic) after every mention of
>>> Linked Data? RDF c14n has been talked about since the early days of RDF
>>> when you were there and I wasn't. It's never been formalised. But as the
>>> acceptance and use of Linked Data as a concept has grown in areas like the
>>> one where I now work, and with the advent of Verifiable Credentials, we
>>> need this. Can we not worry so much about the naming of the thing? Please?
>>> Phil
>>> [1] https://en.wikipedia.org/wiki/Serenity_Prayer
>>> Phil Archer
>>> Director, Web Solutions, GS1
>>> https://www.gs1.org
>>> Meet GS1 Digital Link Developers at
>>> https://groups.google.com/forum/#!forum/gs1-digital-link-developers
>>> https://philarcher.org
>>> +44 (0)7887 767755
>>> @philarcher1
>>> Skype: philarcher
>>> On 10 May 2021 19:38, Dan Brickley wrote:
>>> On Mon, 10 May 2021 at 19:23, Ivan Herman <ivan@w3.org> wrote:
>>>                 On 10 May 2021, at 18:58, Dan Brickley <Danbri@danbri.org> wrote:
>>>                 Thanks for reworking the docs based on all of the giant
>>> discussions!
>>>                 On naming and RDFness, nobody is against pragmatism. The
>>> problem is that everyone sees their own preferences as the most pragmatic.
>>>                 As you describe it below, W3C here is skating
>>> dangerously close to saying that it is drafting this work in such a way as
>>> to mislead the management of its Member organizations to such an extent
>>> that staff would be assigned to the WG under false pretences, and that a
>>> more honestly described workplan would not garner support. Presumably this
>>> also applies to AC approval, since it is also the management of W3C member
>>> orgs being consulted.
>>>                 The pragmatic view in my estimation (and potentially
>>> Google’s once we have discussed internally) is that it is better to have
>>> these things out in the open before the WG is spawned rather than bickered
>>> over expensively afterwards.
>>>         Can you be more specific to understand what you would propose
>>> (taking also into account the constraints that I described below)?
>>> When you wrote "the practical reality is that we had feedback from
>>> people saying their management may not allow them to participate on the
>>> working group if it is perceived as being a pure RDF work", while also
>>> suggesting the scope is indeed very RDF oriented ("exchange and the
>>> integration of simple factual data expressed in RDF."), it feels like a
>>> contradiction best resolved in charter-drafting phase, rather than during
>>> the WG. Specifically if the WG is in fact very much focussed on doing
>>> things with RDF data, anyone (a) staffing it (b) approving the WG charter,
>>> ... ought to know that.
>>> My proposal is simple: not to pretend it's not RDF-centric when it is,
>>> because the pain will only be postponed.
>>> Dan
>>>                 Quick example to suggest this goes beyond mere naming:
>>>                 If the content being signed claims in rdf that
>>>                  entityuri1 has prop1 with val2;
>>>                  and prop2 with val3;
>>>                 and prop4 with val4...
>>>                 RDF goes to extraordinary lengths to make these
>>> different triples independent. If you assert them all, you are hard-pressed
>>> to say “hey it was all or nothing”. Whereas if you are operating at the JSON
>>> level and sign this you could point at e.g. prop4 being “thisRecordTrueUntil”
>>> and val4 being “2021”.
>>>                 We have barely touched on how the partial RDFness
>>> touches on meaning attached to signing, is there potential for mixed
>>> expectations here?
>>>         The "out of scope" list in the charter now includes:
>>>         "Authenticity and trust issues of Web (Data) content that go
>>> beyond the exchange and the integration of simple factual data expressed in
>>> RDF."
>>>         (I guess you will recognize this text). In my view, this covers
>>> the situation that you describe. Is there anything specific that you could
>>> propose as an additional item in the list?
>>>         In general, it would really be good at this point if we could
>>> discuss specific changes on the documents...
>>>         Thanks
>>>         Cheers,
>>>         Ivan
>>>                 Dan
>>>                 On Mon, 10 May 2021 at 15:08, Ivan Herman <ivan@w3.org> wrote:
>>>                         (This is not a direct reply on this specific
>>> message, but I was not sure on which message in the thread I should hook
>>> this:-)
>>>                         Dear all,
>>>                         thanks for all the discussions. We (i.e., the
>>> proposed co-chairs of the WG, the editors of some of the main input
>>> documents, etc.) had a series of discussions and we now have an updated
>>> version of the charter and the explainer document:
>>>                         https://w3c.github.io/lds-wg-charter/
>>> https://w3c.github.io/lds-wg-charter/explainer.html
>>>                         We tried to answer the concerns expressed on
>>> this thread by removing some unclear statements, adding some extra
>>> explanations to the explainer document, putting certain issues explicitly
>>> in the 'out-of-scope' sections, etc.
>>>                         On the contentious issue of naming, i.e., Linked
>>> Data vs. RDF, we have to be pragmatic on this. Theoretical purity may
>>> require using only the term RDF; the practical reality is that we had
>>> feedback from people saying their management may not allow them to
>>> participate on the working group if it is perceived as being a pure RDF
>>> work but it is o.k. if the work is on Linked Data. We have to live with
>>> that, and have the naming issue discussed on another day. Nevertheless, we
>>> tried to come up with a slightly more detailed background in the explainer
>>> document (rather than the charter itself; there is a requirement, by the AC
>>> members of the W3C, to keep the charter as succinct as possible).
>>>                         Thanks again for all the input,
>>>                         Ivan
>>>                                 On 4 May 2021, at 17:55, Dan Brickley <danbri@google.com> wrote:
>>>                                 On Tue, 4 May 2021 at 15:40, Manu Sporny <msporny@digitalbazaar.com> wrote:
>>>                                 >
>>>                                 > On 5/4/21 10:01 AM, Dan Brickley wrote:
>>>                                 > > For now I'd just add: let's not wait
>>> until the WG is chartered before
>>>                                 > > clarifying usecases - the lack of
>>> these may be why there's apparently
>>>                                 > > disagreement amongst the work's
>>> primary advocates on what is in vs out of
>>>                                 > > scope.
>>>                                 >
>>>                                 > Dan, have you seen the current set of
>>> use cases?
>>>                                 >
>>>                                 >
>>> https://w3c.github.io/lds-wg-charter/explainer.html#usage
>>>                                 Yes. My concern in the original post was
>>> that:
>>>                                 The charter opens as follows:
>>>                                 “ There are a variety of established use
>>> cases, such as Verifiable Credentials <
>>> https://www.w3.org/TR/vc-data-model> , the publication of biological
>>> and pharmaceutical data, consumption of mission critical RDF vocabularies,
>>> and others, that depend on the ability to verify the authenticity and
>>> integrity of the data being consumed (see the use cases <
>>> https://w3c.github.io/lds-wg-charter/explainer.html#usage>  for more
>>> examples).”
>>>                                 Currently the charter only alludes
>>> wavily to a “variety of established use cases”, and cites its specific “use
>>> cases” for “more”.
>>>                                 ... i.e. those that you're pointing to
>>> are additional to presumed widely known usecases, ... they're "more", not
>>> the core.
>>>                                 The first sentence of the charter
>>> grounds its importance in terms of "The deployment of Linked Data is
>>> increasing at a rapid pace.", and we understand from Ivan that this means
>>> the same as "The deployment of RDF is increasing at a rapid pace". It links
>>> to http://webdatacommons.org/structureddata/#toc3 which is about
>>> "Microdata, RDFa, JSON-LD, and Microformat Data Sets", from public web
>>> crawl extractions by the webdatacommons team.
>>>                                 The charter talks about "Detecting
>>> changes in datasets" as a typical usecase. It would be good to tie that to
>>> any of the "increasing at a rapid pace" adoption reported in
>>> http://webdatacommons.org/structureddata/.
>>>                                 Consider that for the GS1-related /
>>> Product data usecases, Phil seems to see things differently from Manu.
>>>                                 Phil: "Where I think I seem to have more
>>> sympathy than some with Dan's original commentary, is the issue of a
>>> fixed/signed dataset containing links to external sources of data and
>>> definitions that are not under the signee's control. That is, if my signed
>>> RDF dataset includes data expressed using schema:Product, and the
>>> definition of schema:Product changes, what value does my signature have
>>> now? This is an issue that I think the WG will need to address - that is,
>>> we'll need to set a boundary on what should and should not be inferred by
>>> the presence of whatever crypto doo-hickey surrounds the data. IMO, it
>>> seems clear that we cannot sign the meaning. ... And there's the irony. We
>>> can't sign the semantics in a Semantic Web dataset unless we also retrieve
>>> all externally-referenced sources and sign an immutable local copy of those
>>> as well (I'm really hoping no one thinks that's a good idea ☹ )"
>>>                                 Manu: [responding to Dan saying]"> Are
>>> we convinced that there is application-level value in having assurances
>>> over instance data without also having them for the schemas and ontologies
>>> they are underpinned by?"
>>>                                 Manu: "Yes, I am. Much of the work in
>>> Verifiable Credentials utilize schemas that are cached client-side (usually
>>> permanently, and enforced by software). We don't need schemas to adopt the
>>> technology for it to be useful. It would be more useful if schema
>>> publishing used the technologies, but I don't think anyone is placing that
>>> as a MUST along this road (because there is no need to create a dependency
>>> there)."
>>>                                 I am sympathetic to Manu's point that it
>>> might take years to see how signing plays out w.r.t. schemas and remote
>>> dependencies, and hopefully there is at least some usefulness in having
>>> some more building blocks for signed RDF in the meantime. Manu - do you
>>> have more pointers to the "schemas cached client-side" approach that's
>>> emerging? Is it documented anywhere?
>>>                                 As Phil says, " if my signed RDF dataset
>>> includes data expressed using schema:Product, and the definition of
>>> schema:Product changes, what value does my signature have now?".
>>>                                 Given that the charter speaks also of
>>> "the publication of biological and pharmaceutical data", it would be good
>>> to have an explicit usecase from that world, and to work through this issue
>>> in that domain. If schema caching and/or signing isn't a concern, that
>>> would be good to know. If there are emerging practices, that would also be
>>> good to know.  The most obvious topic here would be the application of
>>> Verifiable Claims to Covid-related "passports", with vaccination records
>>> etc. I understand VC is being used in that setting. Is VC for covid
>>> vaccination (etc.) blocked in any way by the absence of the proposed work
>>> items in this group? Can a usecase be articulated?
>>>                                 >
>>>                                 > ------------------------
>>>                                 >
>>>                                 > Speaking as one of the Editors of the
>>> input specifications... As a related
>>>                                 > aside, and at the risk of completely
>>> derailing this thread, it is possible to
>>>                                 > use the Linked Data Signatures
>>> specification to sign data payloads that are
>>>                                 > Linked Data but are not RDF.
>>>                                 Ivan wrote: "I would propose to agree
>>> that, for the purpose of this charter and WG, the terms RDF and Linked Data
>>> are interchangeable; this is certainly the way the WG intends to pursue its
>>> work."
>>>                                 I am glad we're having this
>>> conversation, because it is good to stabilize some terminology (at least in
>>> the purpose of this charter/WG, as Ivan says), rather than have the WG be
>>> launched on the basis of confusions.
>>>                                 I am having a hard time imagining how
>>> "...that are Linked Data but are not RDF" and "the terms RDF and Linked
>>> Data are interchangeable" can be simultaneously true; could we walk through
>>> an example in the context of this charter?
>>>                                 Ivan also wrote, "To further narrow down
>>> the discussion, let us also concentrate on what this charter proposes to
>>> do. It proposes to provide a standard for the canonicalization of, and to
>>> calculate a hash for, an RDF Graph or an RDF Dataset. (There are some
>>> additional, say, "engineering" issues like how to express the algorithms
>>> and their result in RDF, but that is, comparatively, minor.) That is it."
>>>                                 If the "Linked Data Signatures
>>> specification" is expected to create new W3C technology that is likely
>>> applicable outside of RDF, charter reviewers ought to know about it.
>>>                                 Keeping the gap between the RDF world
>>> and everyone else as small as possible makes a lot of sense.
>>>                                 The most obviously applicable "not an
>>> RDF file" artifact we could consider here is out-of-band JSON-LD context
>>> definition files. For example, editing Schema.org's context file
>>> can cause an unchanged installation of Apache Jena to give different RDF
>>> output from byte-for-byte identical input.
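A rough sketch of why an out-of-band context matters, assuming a deliberately simplified stand-in for JSON-LD expansion in which each JSON key is mapped to a property IRI through a context dictionary. The `to_triples` helper, the `ex:p1` identifier, and the edited context are hypothetical; real JSON-LD processing is far more involved.

```python
import json

def to_triples(doc_bytes, context):
    """Toy expansion: map each JSON key to an IRI via `context`.
    Keys missing from the context (including "@id") produce no triple."""
    doc = json.loads(doc_bytes)
    subject = doc["@id"]
    return {(subject, context[k], v) for k, v in doc.items() if k in context}

# The signed bytes never change...
doc = b'{"@id": "ex:p1", "name": "Widget"}'

# ...but the context they are interpreted against does (hypothetical edit).
ctx_v1 = {"name": "http://schema.org/name"}
ctx_v2 = {"name": "http://schema.org/alternateName"}

triples_v1 = to_triples(doc, ctx_v1)
triples_v2 = to_triples(doc, ctx_v2)

print(triples_v1 == triples_v2)  # False
```

The same input bytes yield different triples under the two contexts, which is why a signature over the bytes alone says nothing about which graph a consumer will end up with.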
>>>                                 But there may also be use cases that are
>>> implementable without the RDF content being canonicalized, or with the
>>> canonicalization being at a different level of abstraction (e.g.
>>> RDFa-in-HTML content using HTML-level canonicalization). There may be
>>> important cases where the OWL level of abstraction is seen as important by
>>> some constituencies.
>>>                                 > The Linked Data Signatures signing
>>> algorithm consists of 4 phases:
>>>                                 >
>>>                                 > 1. Canonicalization of input data
>>>                                 > 2. Cryptographic hashing
>>>                                 > 3. Digitally signing
>>>                                 > 4. Expressing the signature
>>>                                 >
>>>                                 > RDF really only comes into play in
>>> steps #1 and #4... and it's possible for it
>>>                                 > to not come into play at all.
>>>                                 >
>>>                                 > For example, you can use JCS[1] to
>>> canonicalize in step #1, and simple
>>>                                 > key-values to express the signature in
>>> #4. Workday and Microsoft do this today
>>>                                 > with one of their Linked Data
>>> Cryptosuites.
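The four phases listed above can be sketched as follows, under loud assumptions: sorted-key compact JSON stands in for JCS (it is not a complete RFC 8785 implementation), an HMAC with a shared demo key stands in for a real public-key digital signature, and the "proof" layout with the type name `DemoHmacSha256` is invented for illustration. None of this is the Linked Data Signatures specification itself.

```python
import hashlib
import hmac
import json

SECRET = b"demo-key"  # hypothetical shared key, for illustration only

def canonicalize(data):
    # Phase 1 stand-in for JCS: sorted keys, no insignificant whitespace.
    # This is NOT a complete RFC 8785 implementation.
    return json.dumps(data, sort_keys=True, separators=(",", ":")).encode("utf-8")

def sign(data):
    canonical = canonicalize(data)                       # 1. canonicalize
    digest = hashlib.sha256(canonical).digest()          # 2. hash
    # 3. "sign": an HMAC is a MAC, used here only as a stand-in
    #    for a real digital signature scheme.
    value = hmac.new(SECRET, digest, hashlib.sha256).hexdigest()
    # 4. express the signature as simple key-values next to the data
    return {**data, "proof": {"type": "DemoHmacSha256", "value": value}}

def verify(signed):
    data = {k: v for k, v in signed.items() if k != "proof"}
    expected = sign(data)["proof"]["value"]
    return hmac.compare_digest(expected, signed["proof"]["value"])

signed = sign({"b": 1, "a": 2})
print(verify(signed))  # True

tampered = {**signed, "a": 3}
print(verify(tampered))  # False
```

Because phase 1 normalizes key order, `sign({"a": 2, "b": 1})` and `sign({"b": 1, "a": 2})` produce the same proof value; swapping in an RDF canonicalization for `canonicalize` would change only that first phase.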
>>>                                 >
>>>                                 > Now, do I think this is a good idea --
>>> no, I'm not too keen on it; but
>>>                                 > enabling others to put forward
>>> alternatives based upon a standard is useful.
>>>                                 >
>>>                                 > Should the WG prioritize this aspect
>>> of Linked Data Signatures -- no, we
>>>                                 > should get the RDF bits right.
>>>                                 >
>>>                                 > This is why we chose the "Linked Data"
>>> moniker... because it's not entirely
>>>                                 > about RDF... we have folks that don't
>>> like RDF that do use JSON-LD (and seem
>>>                                 > to like it).
>>>                                 Are the folks that don't like RDF
>>> expecting to join this WG that is, according to Ivan, entirely devoted to
>>> RDF?
>>>                                 > Saying that the output of the WG
>>> is *only* about RDF would
>>>                                 > alienate a significant part of that
>>> community... and it would also be
>>>                                 > technically incorrect.
>>>                                 >
>>>                                 > Now, all that said -- we should have a
>>> razor sharp focus on getting the RDF
>>>                                 > bits right, because that's what most
>>> of the supporters of the Charter need.
>>>                                 > Simultaneously, we shouldn't do
>>> anything to prevent these non-RDF (but still
>>>                                 > "Linked Data") use cases... and that's
>>> the concern w/ stripping all the
>>>                                 > "Linked Data" language out of the
>>> charter.
>>>                                 +1
>>>                                 > It does feel like we're all on the
>>> same page here wrt. focus -- we don't want
>>>                                 > a perma-WG... we want something
>>> specific that's highly focused.
>>>                                 Yup - totally agree.
>>>                                 > Simultaneously, we don't want the
>>> future non-RDF stuff to suffer just because
>>>                                 > people were under the mistaken
>>> impression that Linked Data Signatures ONLY
>>>                                 > works for RDF inputs.
>>>                                 I am torn --- as an RDF technologist,
>>> absolutely I see value in having common infrastructure around bnode
>>> labeling. And that can be useful without any crypto whatsoever, e.g. as
>>> utility functions in software it would be handy. Mixed with crypto it
>>> absolutely is interesting, but is there perhaps a piece of work that might
>>> be harder because it engages with more groups, which pushes the non-RDF
>>> aspects of what's proposed here into a broader W3C space? How far can an
>>> RDF-agnostic "just sign the bits" approach be made to work for the use cases
>>> W3C cares most about?
>>>                                 I remember you were keeping an eye on
>>> the debates around "Signed HTTP Exchanges" and Web Packaging, for example.
>>> Last I checked in there it wasn't clear there was consensus about
>>> browser-UI aspects, but maybe there could be some other common agendas
>>> worth exploring?
>>> https://github.com/w3c/strategy/issues/171#issuecomment-603280405 etc.
>>>                                 cheers,
>>>                                 Dan
>>>                                 > -- manu
>>>                                 >
>>>                                 > [1]https://tools.ietf.org/html/rfc8785
>>>                                 >
>>>                                 > --
>>>                                 > Manu Sporny -
>>> https://www.linkedin.com/in/manusporny/
>>>                                 > Founder/CEO - Digital Bazaar, Inc.
>>>                                 > blog: Veres One Decentralized
>>> Identifier Blockchain Launches
>>>                                 > https://tinyurl.com/veres-one-launches
>>>                                 >
>>>                         ----
>>>                         Ivan Herman, W3C
>>>                         Home: http://www.w3.org/People/Ivan/
>>>                         mobile: +33 6 52 46 00 43
>>>                         ORCID ID: https://orcid.org/0000-0003-0782-2704
Received on Tuesday, 11 May 2021 18:21:06 UTC

This archive was generated by hypermail 2.4.0 : Tuesday, 11 May 2021 18:21:08 UTC