- From: Sebastian Hellmann <hellmann@informatik.uni-leipzig.de>
- Date: Sat, 22 Jan 2022 09:12:08 +0100
- To: Jonas Smedegaard <jonas@jones.dk>, Kingsley Idehen <kidehen@openlinksw.com>, public-webid@w3.org
- Message-ID: <e3f0a022-1ed0-e7b3-5946-19540be9890c@informatik.uni-leipzig.de>
Hi Jonas,

On 22.01.22 01:09, Jonas Smedegaard wrote:
> Oh well.
>
> I understand your desire to simplify, I really do.
>
> Ruben Verborgh also wrote about that desire in his latest blog entry:
> https://ruben.verborgh.org/blog/2021/12/23/reflections-of-knowledge/
>
> He links to a single paragraph by Dan Brickley and Libby Miller, about
> that complexity issue: https://book.validatingrdf.com/bookHtml005.html
>
> Let me quote here the first two sentences of that paragraph:
>
>> People think RDF is a pain because it is complicated. The truth is
>> even worse. RDF is painfully simplistic, but it allows you to work
>> with real-world data and problems that are horribly complicated.

I will try to phrase this diplomatically: it has become a recent trend to talk down the achievements of Linked Data. At the core lies the fact that maybe 50% of the datasets in the LOD Cloud have become stale or unreachable. The LOD Cloud is pretty much manually curated, and the resources are missing to keep it properly updated, so some people have started saying that it is going down. However, 50% is still 95% better than other approaches to putting data on the web. I see huge non-LD "data" repositories that do not have many downloads; if you count them, they amount to 10k downloads over 5 years or so. Basically, Linked Data has already achieved FAIR.

Then some core people of the community keep repeating perseverance slogans (not meaning Kingsley here in particular, he is more educational) while ignoring the fact that there are some problems we would need to address in order to make it fly. Not being able to update the LOD Cloud properly (by automatic crawling) is one of them. Why is that? I see an identity problem, i.e. what is the identity of the bubbles, plus the lack of WebIDs for the people/orgs publishing data, and also no discovery mechanism. The question here is also: is it a lack of infrastructure (nobody doing it) or the lack of a feature/patch to the system?
There are still more things in RDF that are not complicated, but painful. The wiggle room of JSON-LD is one; basically this sentence by Aaron:

> Without a well-defined context, however, the vagaries in
> compact/expanded/flattened JSON-LD serializations provide a high bar
> for data parsing, and you lose a lot of the advantages that JSON-LD
> has to offer in the first place. In fact, when given the choice
> between Turtle (or other RDF serializations) and JSON-LD without a
> structured context, I would always choose Turtle.

That is an insight from somebody who has taken the effort of digging through complex things to find a technical best practice for working with RDF in a simple manner. Simplicity doesn't come up front, but has to be discovered.

Then there are many small-scale issues, besides the JSON issues, that we could avoid:

1. Upgrading tooling to xsd:string as given by RDF 1.1.

2. I don't remember it exactly, but we encountered a ";" problem with Turtle:

   cert:key [ <a> <b> "" ; ] ;

   vs.

   cert:key [ <a> <b> "" ] ;

3. DBpedia's CTO Kontokostas, my PhD student, created SHACL because we wanted to patch a particular gap in RDF. By using more SHACL to define RDF, a lot can be achieved. This issue is also related to the current spec, https://www.w3.org/2005/Incubator/webid/spec/identity/ :

   a) foaf:img -> URI in a plain literal,

   b) foaf:name with xsd:string, with a language tag, or without either,

   c) the datatypes for <http://www.w3.org/ns/auth/cert#modulus> are defined as

      :range <http://www.w3.org/2001/XMLSchema#base64Binary>, <http://www.w3.org/2001/XMLSchema#hexBinary> ;

      which means they are always both per inference, so in the actual WebID document you can put both, one, or none.

4. For https://github.com/dbpedia/databus / databus.dbpedia.org we implemented WebID login at first, but e.g. on Apple devices the keystore kept popping up immediately, so people thought the website was password protected.
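Aaron's point about JSON-LD wiggle room can be made concrete with nothing but Python's standard json module. The IRI and the person below are made up for illustration; the two documents serialize the very same RDF triple, once compacted with a context and once expanded:

```python
import json

# Two JSON-LD serializations of the same triple (hypothetical data):
#   <https://example.org/#me> foaf:name "Alice" .

compacted = json.loads("""
{
  "@context": {"name": "http://xmlns.com/foaf/0.1/name"},
  "@id": "https://example.org/#me",
  "name": "Alice"
}
""")

expanded = json.loads("""
[
  {
    "@id": "https://example.org/#me",
    "http://xmlns.com/foaf/0.1/name": [{"@value": "Alice"}]
  }
]
""")

# A consumer written against the compacted shape with plain JSON tooling...
def naive_name(doc):
    return doc.get("name")

print(naive_name(compacted))    # "Alice"
# ...silently finds nothing in the expanded form of the *same* graph:
print(naive_name(expanded[0]))  # None
```

Without a mandated context, every consumer has to either run a full JSON-LD processor or guess which of these shapes it will receive, which is exactly the high bar for data parsing that Aaron describes.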
   There is definitely no help, guidance, or standard that tells website creators how to implement WebID login properly, which would help adoption and could also push browsers to make user-friendly certificate authentication a core feature. We removed it.

5. Regarding the WebID itself, we tried to have people create one in their own space, but it was a mess. We tried to fix this with SHACL: https://github.com/dbpedia/wall-of-fame/blob/master/src/main/resources/shacl/shapes.ttl . In the end, Databus does the following now: each account comes with a WebID created by appending #me , i.e. https://databus.dbpedia.org/kurzum#me (the feature is not yet deployed online, but it is in the GitHub repo).

Then we thought it would be good to provide some metadata for the Databus itself, and my developer asked me how to do it, e.g.

   <https://databus.dbpedia.org> a dataid:Databus ;
       dct:hasVersion "2.0b" .

Even I am struggling with this, i.e. is it https://databus.dbpedia.org or https://databus.dbpedia.org/ or https://databus.dbpedia.org#this or https://databus.dbpedia.org/#this , or a 303 redirect to https://databus.dbpedia.org/webid.ttl#this ? Or should it go into .well-known or robots.txt ?

My main point here is: it could be simple, and if you have a lot of experience it might become simpler. Beginners are struggling with a plethora of hard micro decisions. This could be avoided by 1. tackling the technical details, e.g. SHACL, a context, and providing an official validator, and 2. maybe not mandating, but giving one simple way that can be adapted without taking micro decisions.

-- Sebastian
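P.S. To give an idea of what "one simple way" could look like as SHACL, here is a minimal sketch of a shape that pins down two of the micro decisions above. It is hypothetical (the shape, cardinalities, and choice of constraints are mine, not from any spec): foaf:name must be a plain xsd:string, and a key's cert:modulus must use exactly one of the two datatypes instead of both-per-inference:

```turtle
@prefix sh:   <http://www.w3.org/ns/shacl#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix cert: <http://www.w3.org/ns/auth/cert#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .

# Hypothetical WebID profile shape, for illustration only.
<#WebIDShape> a sh:NodeShape ;
    sh:targetClass foaf:Person ;
    sh:property [
        sh:path foaf:name ;
        sh:datatype xsd:string ;   # no language tag, no missing datatype
        sh:maxCount 1 ;
    ] ;
    sh:property [
        sh:path ( cert:key cert:modulus ) ;
        sh:minCount 1 ;
        sh:maxCount 1 ;            # exactly one modulus ...
        sh:or (                    # ... in exactly one of the two encodings
            [ sh:datatype xsd:hexBinary ]
            [ sh:datatype xsd:base64Binary ]
        ) ;
    ] .
```

An official validator shipping a shape like this would turn several of the micro decisions into machine-checkable answers.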
Received on Saturday, 22 January 2022 08:12:29 UTC