- From: Andy Seaborne <andy@seaborne.org>
- Date: Mon, 3 May 2021 13:22:31 +0100
- To: semantic-web@w3.org
On 03/05/2021 10:06, Ivan Herman wrote:
...
> In my view the cleanest way would be to make it clear, either in the charter text, or the explainer, that we consider these terms, for the purposes of this Working Group, as synonyms. Additionally, we may also want to list some problems whose solutions are explicitly out of scope (although we have to have a clear set of terms for those). I would be pleased to hear more suggestions. The charter is still in development, i.e., this is the time to do it!

The charter has 2 sections - deliverables 1 and 2 are about single RDF Datasets, 3 and 4 about linkages and "Linked Datasets".

It would be helpful to rename deliverable 2, "Linked Data Hash" => "RDF Dataset Hash", to reflect the value of 1+2 to communities that do not see themselves as "Linked Data".

    Andy

> Thanks
>
> Ivan
>
> [1] https://w3c.github.io/lds-wg-charter/
> [2] https://w3c.github.io/lds-wg-charter/explainer.html
>
>> On 1 May 2021, at 12:27, Dan Brickley <danbri@danbri.org> wrote:
>>
>> I have concerns. If I had had more time I would have written a shorter email.
>>
>> Starting from the top -
>>
>> Is “Linked Data” in the group name serving as a synonym for RDF?
>>
>> Are there in-scope usecases for non-RDF content? eg property graphs? RIF? Microformats? Plain XML, JSON?
>>
>> Does saying “Linked Data” exclude any RDF practices deemed insufficiently “Linked”?
>>
>> The charter cites http://webdatacommons.org/structureddata/#toc3 in support of the vague/ambiguous claim that “The deployment of Linked Data <https://www.w3.org/standards/semanticweb/data> is increasing at a rapid pace <http://webdatacommons.org/structureddata/#toc3>”, yet the citation points to a document focussed on approaches which in various ways go against “Linked Data” orthodoxy, narrowly conceived.
>>
>> The webdatacommons report covers Microdata, RDFa, JSON-LD, and even Microformats; the latter effort has long distanced itself from RDF, Linked Data and so on. The others, as published in the public Web, are very commonly found embedded in containing documents (or even injected via Javascript into a running webplatform document object), and are used as standalone bnode-heavy descriptions rather than fragmentary pieces of hypertext RDF.
>>
>> A particular problem with calling the group “Linked Data” is the expectation that the various (and contested) publishing practices associated with the Linked Data slogan will get tangled up in the technical work.
>>
>> For example, the Linked Data community emphasises public data, often but not always “Linked Open Data”, and has a strong bias towards RDF being published in a form such that all mentioned entities are described with a URI. It also has a bias toward those URIs being http(s)-dereferenceable, with the resulting document containing additional RDF statements pertaining directly or indirectly to the entity the URI is considered to identify. Arcane rules regarding http redirect codes and the use of #-based identifiers for non-webplatform entities are also an important element of the post-2006 Linked Data tradition.
>>
>> By proposing to name the group “Linked Data” W3C risks embedding these contested design preferences in the technical work, while justifying the WG as impactful using the large scale adoption of practices based on json-ld, microdata, rdfa which actively make different design choices from those implicitly endorsed by this naming choice.
>>
>> Specifically, Schema.org using these formats is on millions of sites (eg report led by webdatacommons), in large part by making the explicit choice to make things easier for publishers, e.g. by allowing them to write markup meaning roughly “the Country whose name is Paris” rather than following the supposed Linked Data best practice of simply using a well known URI for the entity, such as http://dbpedia.org/resource/Paris (which would involve publishers finding out the most currently fashionable URI for every entity they mention). Signing data that mostly consists of dangling references to files on other people’s websites may be a solved mathematical problem, but it is new territory in social, policy, workflow, ecosystem and other ways. If W3C values such an endeavour it should be realistic in terms of staff resources assigned, and timelines. This is not a “quick win” project.
>>
>> The chartering issue is that “Linked Data” is a broad marketing euphemism for RDF that emphasises some but not all of its strengths, such as the ease of data merging across loosely coupled systems. But it is not a technical term or a W3C standard as such.
>>
>> If this is effectively an RDF canonicalization WG there are other issues to discuss, such as its impact on expectations around schema evolution, linking, and security.
>>
>> Without being exhaustive, ...
>>
>> Would it apply to schemas published at http: URIs or only https: URIs?
>>
>> Are we convinced that there is application-level value in having assurances over instance data without also having them for the schemas and ontologies they are underpinned by?
>>
>> Is there an expectation that schema/ontology publishing practice would need to change to accommodate these scenarios?
>>
>> Would schema-publishing organizations like Dublin Core, Schema.org, Wikidata, DBpedia, be expected to publish a JSON-LD (1.0? 1.1?) context file? What change management, versioning, etc practices would be required? Would special new schemas be needed instead?
>>
>> For example, if instance data created in 2019 uses a schema ex:Foo type last updated in 2021, but which has since 2018 contained an assertion of owl:equivalentClass to ex2:Bar, and an rdfs:subClassOf ex3:Xyz, are changes to the definitions of these supposed to be relevant to the trustability of the instance data? If so, why does https://w3c.github.io/lds-wg-charter/index.html not discuss the role of schema/ontology definitions in all this?
>>
>> For a concrete example of why 24 months looks ambitious:
>>
>> The examples in https://w3c-ccg.github.io/security-vocab/
>>
>> {
>>   "@context": ["https://w3id.org/security/v1",
>>                "http://json-ld.org/contexts/person.jsonld"],
>>   "@type": "Person",
>>   "name": "Manu Sporny",
>>   "homepage": "http://manu.sporny.org/",
>>   "signature": {
>>     "@type": "GraphSignature2012",
>>     "creator": "http://manu.sporny.org/keys/5",
>>     "signatureValue": "OGQzNGVkMzVmMmQ3ODIyOWM32MzQzNmExMgoYzI4ZDY3NjI4NTIyZTk="
>>   }
>> }
>>
>> This uses the following json-ld context:
>>
>> http://json-ld.org/contexts/person.jsonld
>>
>> ...which currently maps the term “Person” in the instance data to foaf:Person, which is a schema we have published in the FOAF project since ~ May 2000 or so, evolving the definition in place. We used to PGP sign the RDFS RDF/XML files btw; I am not entirely against signing and RDF! Nobody used it though.
>>
>> From person.jsonld above,
>>
>> {
>>   "@context":
>>   {
>>     "Person": "http://xmlns.com/foaf/0.1/Person",...
>>
>> The current English definition of foaf:Person says “The Person class represents people. Something is a Person if it is a person. We don't nitpic about whether they're alive, dead, real, or imaginary”.
>> Its rdf/xml (“Linked Data”) definition says, amongst other things, that it is owl:equivalentClass to schema:Person.
>> Do we want a spec that cares about whether the context file is served over http?
>> That cares if the dependency on FOAF is silently switched out, or whether the FOAF Person type’s “Linked Data” stated equivalence to http://schema.org/Person gets updated, e.g. to use https://schema.org and/or to converge the written definitions which set the meaning of what it is to say that something is a foaf:Person or schema:Person?
>>
>> These are all fascinating issues but I would be astonished if the work gets done on the proposed schedule. The very idea of Linked Data puts these URI-facilitated connections between RDF graphs at its core. To omit discussion of their consequences in the charter is odd. For example, when is the “authenticity and integrity” of one serialized / published graph dependent on that of another that it mentions/references/uses?
>>
>> I am not against this work, but the draft charter feels really off somehow.
>>
>> RDF with lots of blank nodes is known to be a bit annoying to consume, but easier to publish. The general sections of the charter make sweeping and grand claims about the utility of the proposed standards, and justify that with phrases like “authenticity and integrity of the data” and references to the adoption of json-ld, microdata and rdfa in public web content.
>>
>> The usecases most explicitly listed are however largely from a rather different perspective - a lot of blockchainy transactional scenarios, some frankly blueskies but intriguing:
>>
>> “For example, anchoring an RDF Dataset that expresses a land deed to a Distributed Ledger (aka blockchain) can establish a proof of existence in a way that does not depend on a single point of failure, such as a local government office”
>>
>> ...
>> which echoes TimBL’s old https://www.w3.org/Talks/WWW94Tim/
>>
>> I do not want to see a repeat of the JSON-LD 1.0 vs 1.1 debacle, in which the massive success of Schema.org’s use of JSON-LD 1.0 in the public Web was used to persuade the W3C AC to launch a Working Group focussed on just those aspects of the technology (contexts) which don’t work well for web scale search, and which didn’t address the needs of the project that had been used to justify the WG. As discussed elsewhere this week, that effort resulted in W3C marking as superseded/abandoned the very technology (JSON-LD 1.0) that we at Schema.org were proud to have helped to succeed, and which we now can’t even reliably cite as a stable web standard.
>>
>> If this WG is addressing needs around RDF for blockchains, or supporting software to compare, check and maybe diff RDF graphs, the charter should be clearer about this limited scope.
>>
>> The charter opens as follows:
>>
>> “There are a variety of established use cases, such as Verifiable Credentials <https://www.w3.org/TR/vc-data-model>, the publication of biological and pharmaceutical data, consumption of mission critical RDF vocabularies, and others, that depend on the ability to verify the authenticity and integrity of the data being consumed (see the use cases <https://w3c.github.io/lds-wg-charter/explainer.html#usage> for more examples).”
>>
>> Currently the charter only alludes hand-wavily to a “variety of established use cases”, and cites its specific “use cases” for “more”. The established ones also should be explicitly listed and analyzed to make sure they also motivate the proposed specific technical agenda, which is highly focussed on technicalities around bnode-labeling in RDF data.
>>
>> For each of these usecases we should ask, amongst other things, whether signing the raw bits might work, and if not, how much additional surrounding information is needed - eg base URI, referenced schemas/ontologies, json-ld contexts, GRDDL transforms; and whether the reference-tracing recurses or not. And why.
>>
>> Sorry for the long note. I just don’t want to see another RIF-like 5 year slog happen because a cloud of similar ideas was mistaken for a shared standards-making agenda.
>>
>> Cheers,
>>
>> Dan
>>
>> (Sent from my personal account but with a danbri@google.com hat on)
>>
>> On Tue, 6 Apr 2021 at 11:26, Ivan Herman <ivan@w3.org> wrote:
>>
>> Dear all,
>>
>> the W3C has started to work on a Working Group charter for Linked Data Signatures:
>>
>> https://w3c.github.io/lds-wg-charter/index.html
>>
>> The work proposed in this Working Group includes Linked Data Canonicalization, as well as algorithms and vocabularies for encoding digital proofs, such as digital signatures, and with that secure information expressed in serializations such as JSON-LD, TriG, and N-Quads.
>>
>> The need for Linked Data canonicalization, digest, or signature has been known for a very long time, but it is only in recent years that research and development has resulted in mathematical algorithms and related implementations that are on the maturity level for a Web Standard. A separate explainer document:
>>
>> https://w3c.github.io/lds-wg-charter/explainer.html
>>
>> provides some background, as well as a small set of use cases.
>>
>> The W3C Credentials Community Group[1,2] has been instrumental in the work leading to this charter proposal, not the least due to its work on Verifiable Credentials and with recent applications and development on, e.g., vaccination passports using those technologies.
>>
>> It must be emphasized, however, that this work is not bound to a specific application area or serialization. There are numerous use cases in Linked Data, like the publication of biological and pharmaceutical data, consumption of mission critical RDF vocabularies, and others, that depend on the ability to verify the authenticity and integrity of the data being consumed. This Working Group aims at covering all those, and we hope to involve the Linked Data Community at large in the elaboration of the final charter proposal.
>>
>> We welcome your general expressions of interest and support. If you wish to make your comments public, please use GitHub issues:
>>
>> https://github.com/w3c/lds-wg-charter/issues
>>
>> A formal W3C Advisory Committee Review for this charter is expected in about six weeks.
>>
>> [1] https://www.w3.org/community/credentials/
>> [2] https://w3c-ccg.github.io/
>>
>> ----
>> Ivan Herman, W3C
>> Home: http://www.w3.org/People/Ivan/
>> mobile: +33 6 52 46 00 43
>> ORCID ID: https://orcid.org/0000-0003-0782-2704
>
> ----
> Ivan Herman, W3C
> Home: http://www.w3.org/People/Ivan/
> mobile: +33 6 52 46 00 43
> ORCID ID: https://orcid.org/0000-0003-0782-2704
Received on Monday, 3 May 2021 12:22:48 UTC
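
The quoted question of whether "signing the raw bits might work", and the suggested name "RDF Dataset Hash", can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the example.org data, the function names, and in particular the "sort the N-Triples lines" step, which is a naive stand-in and not the canonicalization algorithm the charter proposes. It only shows why a hash over raw bytes depends on serialization order, and why blank node labels are the part that sorting alone cannot handle.

```python
import hashlib


def raw_hash(ntriples: str) -> str:
    """Hash the serialized bytes exactly as given (the 'raw bits')."""
    return hashlib.sha256(ntriples.encode("utf-8")).hexdigest()


def naive_canonical_hash(ntriples: str) -> str:
    """Hash a sorted, de-duplicated list of N-Triples lines.

    Illustrative stand-in only: this ignores blank node relabeling,
    so it is NOT a real RDF canonicalization.
    """
    lines = sorted({ln.strip() for ln in ntriples.splitlines() if ln.strip()})
    canonical = "".join(ln + "\n" for ln in lines)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


NAME = '<http://example.org/alice> <http://xmlns.com/foaf/0.1/name> "Alice" .'
TYPE = (
    "<http://example.org/alice> "
    "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type> "
    "<http://xmlns.com/foaf/0.1/Person> ."
)

# The same two triples, serialized in a different order.
doc_a = NAME + "\n" + TYPE + "\n"
doc_b = TYPE + "\n" + NAME + "\n"

# The same single-triple graph, differing only in the blank node label.
bnode_a = '_:b0 <http://xmlns.com/foaf/0.1/name> "Alice" .\n'
bnode_b = '_:x <http://xmlns.com/foaf/0.1/name> "Alice" .\n'

if __name__ == "__main__":
    print("raw hashes equal:       ", raw_hash(doc_a) == raw_hash(doc_b))
    print("sorted hashes equal:    ",
          naive_canonical_hash(doc_a) == naive_canonical_hash(doc_b))
    print("bnode graphs hash equal:",
          naive_canonical_hash(bnode_a) == naive_canonical_hash(bnode_b))
```

In this sketch, reordering changes the raw hash but not the sorted one, while the two blank node documents still hash differently even though they describe the same graph. That last case is the bnode-labeling problem the quoted message refers to.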