- From: John McCrae <john@mccr.ae>
- Date: Thu, 22 Jun 2017 14:18:48 +0100
- Cc: Andrejs Abele <andrejs.abele@insight-centre.org>, public-lod@w3.org
- Message-ID: <CAC5njqrb97B2wmXSo0ZrBaVAKKwoNPZB_b2B_ZCdhULpYxO7og@mail.gmail.com>
Hi all,

Thank you for your suggestions, and apologies that we have taken so long to reply; we have been quite busy <http://ldk2017.org>.

Firstly, due to unforeseen circumstances, we have had to delay the next diagram until later in July.

While Datahub is certainly not an ideal tool for collecting metadata about LOD datasets, it functions well enough for most people and has a big advantage in that most of the data we need is already there. We are currently working on improving the metadata procedure behind the LOD Cloud Diagram, and in fact our goal is to eliminate the need for data providers to inform us about their datasets. Instead, we are planning to extract the topic, size and links of a dataset from the available data and to discover new datasets by crawling (this sounds easier than it is!).

As such, our plans for the short term still involve using Datahub as our UI and data collection point, but hopefully to a steadily decreasing extent as we automate more of the process of generating the metadata required for the diagram.

Regards,
John P. McCrae

On Fri, Jun 16, 2017 at 12:57 PM, Sarven Capadisli <info@csarven.ca> wrote:

> On 2017-06-16 13:05, Andrejs Abele wrote:
> > Hi everyone,
> >
> > We want to address some of the comments and suggestions mentioned in
> > this thread.
> >
> > While we are keen to use metadata from resource publishers wherever
> > possible, we still require that your resource is listed in Datahub as we
> > do not have the capability to crawl the Web looking for VoID
> > descriptions at this moment.
> >
> > * In DataHub there is an option to upload VoID descriptions. Please
> >   upload here and we will attempt to extract the metadata from the
> >   VoID file.
> >
> > * For a VoID file to be useful, it would have to contain a
> >   "dcterms:subject" property that describes the topics contained in
> >   the dataset and a "void:Linkset" describing links to other datasets
> >   and the number of triples linking to said dataset. The target of any
> >   linkset must correspond to a VoID file listed in Datahub; e.g., to
> >   link to European Nature Information System (EUNIS) you should link
> >   to the VoID description listed at Datahub:
> >   http://eunis.eea.europa.eu/void.rdf.
> >
> > * For now, if you have a VoID file that contains this information, and
> >   for any reason you don't want to publish it on DataHub, you can send
> >   it to us and we will add it to our system.
> >
> > We are aware that there is a dataset validation tool
> > (http://validator.lod-cloud.net/) available, but we are not currently
> > maintaining it or using it to check the validity of a dataset. If you
> > are unsure as to whether your Datahub record is suitable please email us
> > and we will check that it appears correctly in the diagram.
> >
> > As there have been multiple requests, we will postpone the generation
> > of the diagram till next weekend (24.06.17)
> >
> > Kind regards,
> > Andrejs Abele and John P. McCrae
> >
> > Unit for Natural Language Processing
> > Insight Centre for Data Analytics
> > National University of Ireland Galway
> > https://nuig.insight-centre.org/unlp/people/members/andrejs-abele/
> > http://john.mccr.ae
> >
> > On 14/06/17 10:45, Victor de Boer @ VU wrote:
> >> Dear Andrejs & John,
> >>
> >> Indeed, we would really like to see our GTAA thesaurus
> >> (https://datahub.io/dataset/gemeenschappelijke-thesaurus-audiovisuele-archieven)
> >> in the cloud diagram. The validator tool gives a number of
> >> non-compliance messages.
> >> Missing URL. Please provide an URL for the data set.
> >> Missing authorship. Please provide the name of publishing org and/or
> >> person using the CKAN field Author. It is important to know who
> >> created this data set.
> >> Missing lod tag. Please tag the data set with lod.
> >> Missing contact email. Please provide a contact email using the CKAN
> >> field Author email or Maintainer email. It is important to know who to
> >> contact if there are errors or missing dataset descriptions.
> >>
> >> However, looking at the Datahub metadata, as far as I can see, all the
> >> required fields are there (URL, Author, links, etc)
> >>
> >> Any idea how we can make sure that the GTAA (and other datasets) will
> >> appear in the new diagram?
> >>
> >> thanks!
> >> --victor
> >>
> >> RE:
> >> Dear Andrejs & John,
> >>
> >> Great that you guys are committed to this task.
> >>
> >> Just a remark: before a dataset is considered in the LOD cloud, it
> >> must comply with the guidelines described at
> >> https://www.w3.org/wiki/TaskForces/CommunityProjects/LinkingOpenData/DataSets/CKANmetainformation.
> >> The document refers to the dataset validation tool
> >> (http://validator.lod-cloud.net/) to
> >> figure out the "completeness level", that is why a given dataset
> >> published on datahub.io does or does not comply
> >> with those guidelines.
> >>
> >> As I already mentioned on this list
> >> (https://lists.w3.org/Archives/Public/public-lod/2017Feb/0001.html),
> >> this tool seems buggy: the search form keeps returning the same page
> >> about compliance levels, but does not give data about any specific
> >> dataset, including those already listed.
> >>
> >> As a result, I would assume that existing datasets are not included in
> >> the LOD cloud just because publishers don't know what needs to be fixed.
> >>
> >> Are you aware of this issue? Do you know who we could contact?
> >>
> >> Thx,
> >> Franck.
> >>
> >> On 12/06/2017 at 18:19, Andrejs Abele wrote:
> >>>
> >>> Hi everyone,
> >>>
> >>> The Linked Open Data Cloud Diagram (http://lod-cloud.net)
> >>> is one of the most visible tools in our
> >>> community and we at the Insight Centre for Data Analytics have
> >>> committed to providing regular updates to this diagram.
> >>>
> >>> We are planning to generate the next version of the LOD cloud diagram
> >>> at the end of this week (17.06.17)
> >>>
> >>> In order to help us best reflect the true state of the Linked Open
> >>> Data Cloud, please update your resource description in DataHub.io
> >>> (https://datahub.io) based on guidelines below
> >>> by this Friday.
> >>>
> >>> * Provide tags describing your dataset
> >>>
> >>> * Provide number of triples
> >>>
> >>> * Provide information about links to other datasets in format:
> >>>     o links:<resource id in DataHub>
> >>>     o E.g., links:dbpedia
> >>>
> >>> For more details please see the LOD Cloud Diagram Page or the
> >>> detailed description here:
> >>>
> >>> https://www.w3.org/wiki/TaskForces/CommunityProjects/LinkingOpenData/DataSets/CKANmetainformation
> >>>
> >>> Kind regards,
> >>>
> >>> Andrejs Abele and John P. McCrae
> >>>
> >>> Unit for Natural Language Processing
> >>> Insight Centre for Data Analytics
> >>> National University of Ireland Galway
> >>> https://nuig.insight-centre.org/unlp/people/members/andrejs-abele/
> >>> http://john.mccr.ae
> >>
> >
> > --
> > Unit for Natural Language Processing
> > Insight Centre for Data Analytics
> > National University of Ireland Galway
> > https://nuig.insight-centre.org/unlp/people/members/andrejs-abele/
>
> Just to put this on the table. There is a "follow your nose" approach
> that can be incorporated here that I believe can address a bunch of
> technical and social hurdles for both dataset owners and the consumers
> (like for the preparation of the LOD cloud).
>
> Have a discoverable relation to receive notifications about dataset
> updates. You can decide on the subject and object URL:
>
> <http://lod-cloud.net/> <http://www.w3.org/ns/ldp#inbox>
> <http://lod-cloud.net/inbox/> .
>
> Allow POST requests with JSON-LD (and other RDF syntaxes if you'd like)
> on the Inbox URL.
>
> Dataset owners can send a payload indicating where to discover their
> datasets, and provenance level data that you'd be interested in knowing.
> You can kindly ask what to include in the payload (eg on the homepage)
> or set constraints on the Inbox etc.
>
> In this way, you are not bound to a 3rd party service (no account
> creations, or information which might go stale if not updated). There is
> also no need to have a call where people should suddenly update their
> metadata on some third party service by the end of week. People can
> *notify* you with "hey, I just updated my stuff over here, come and
> check it out!" You have the benefit of also keeping an eye on what's
> actively maintained.
>
> For dataset owners, notifying you can be automated in their tooling or
> done manually with a simple curl -X POST.
>
> You can take the notifications, process and manage them as you like. In
> fact, you can probably programmatically update the LOD cloud (SVG)
> through these notifications. One other benefit here is that other
> applications (from the community) can consume these notifications as
> well if you are inclined to make the inbox/notifications with public
> read access. You can serve the notifications as JSON-LD or if you allow
> content negotiation, have other RDF serialisations.
>
> That would be the Linked Data Notifications [1] approach for this.
> Compare this to the amount of manual intervention that's required of
> everyone (publisher and consumer) via datahub.io.
>
> I'd love to see this sort of "Webby" notifications, discovery, reuse
> going forward.
>
> If this way of working interests you and the community, let's do it.
> Happy to help out where necessary. See also [2] for existing LDN
> implementations where you might want to reuse code from.
>
> There we have it. It is all very simple.
>
> [1] https://www.w3.org/TR/ldn/
> [2] https://linkedresearch.org/ldn/tests/summary
>
> -Sarven
> http://csarven.ca/#i
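
For readers unfamiliar with Linked Data Notifications, the sender side of the flow Sarven describes can be sketched roughly as below. This is only an illustration under stated assumptions, not something the LOD Cloud maintainers have implemented: the inbox URL is the hypothetical one from the example triple in the quoted message, the dataset and VoID URLs are placeholders, and the payload shape (an ActivityStreams "Announce") is one possible choice that a receiver could replace with its own requirements. The script prepares the POST request but does not send it.

```python
import json
import urllib.request

# Hypothetical inbox URL, copied from the example triple in the quoted
# message; no such endpoint is known to exist.
INBOX_URL = "http://lod-cloud.net/inbox/"


def build_notification(dataset_url: str, void_url: str) -> dict:
    """Build a JSON-LD notification announcing a new or updated dataset.

    The ActivityStreams vocabulary is an assumption made for this sketch;
    the receiver is free to ask for a different payload shape.
    """
    return {
        "@context": "https://www.w3.org/ns/activitystreams",
        "type": "Announce",
        "object": {
            "id": dataset_url,
            # Assumed convention: point at the dataset's VoID description.
            "url": void_url,
        },
    }


def build_request(inbox_url: str, notification: dict) -> urllib.request.Request:
    """Prepare (but do not send) the POST an LDN sender would issue."""
    body = json.dumps(notification).encode("utf-8")
    return urllib.request.Request(
        inbox_url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/ld+json"},
    )


if __name__ == "__main__":
    note = build_notification(
        "http://example.org/dataset",   # placeholder dataset URL
        "http://example.org/void.ttl",  # placeholder VoID description URL
    )
    req = build_request(INBOX_URL, note)
    print(req.get_method(), req.full_url)
    print(json.dumps(note, indent=2))
    # Delivering the notification would be: urllib.request.urlopen(req)
```

Keeping request construction separate from delivery means the payload can be inspected or logged before anything is posted, which is handy while the receiver's expectations are still being agreed.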
Received on Thursday, 22 June 2017 13:19:22 UTC