Re: Please update your resource in the LOD Cloud Diagram

Can somebody fix the validation tool? That would be very helpful.

m.

On Thu, Jun 22, 2017 at 3:18 PM, John McCrae <john@mccr.ae> wrote:

> Hi all,
>
> Thank you for your suggestions, and apologies that we have taken so long
> to reply; we have been quite busy <http://ldk2017.org>. Firstly, due to
> unforeseen circumstances, we have had to delay the next diagram until
> later in July.
>
> While Datahub is certainly not an ideal tool for collecting metadata about
> LOD datasets, it functions well enough for most people and has a big
> advantage in that most of the data we need is already there.
>
> We are currently working on improving the metadata procedure behind the
> LOD Cloud Diagram and in fact our goal is to eliminate the need for data
> providers to inform us about their datasets. Instead, we are planning to
> extract the topic, size and links in a dataset from the available data and
> discover new datasets by crawling (this sounds easier than it is!). As
> such, our plans for the short term still involve using Datahub as our UI
> and data collection point, but hopefully to an increasingly lesser extent
> as we automate more of the process of generating the metadata required for
> the diagram.
>
> Regards,
> John P. McCrae
>
> On Fri, Jun 16, 2017 at 12:57 PM, Sarven Capadisli <info@csarven.ca>
> wrote:
>
>> On 2017-06-16 13:05, Andrejs Abele wrote:
>> > Hi everyone,
>> >
>> > We want to address some of the comments and suggestions mentioned in
>> > this thread.
>> >
>> >
>> >
>> > While we are keen to use metadata from resource publishers wherever
>> > possible, we still require that your resource is listed in Datahub as we
>> > do not have the capability to crawl the Web looking for VoID
>> > descriptions at this moment.
>> >
>> >   * In DataHub there is an option to upload VoID descriptions. Please
>> >     upload yours there and we will attempt to extract the metadata from
>> >     the VoID file.
>> >
>> >   * For a VoID file to be useful, it has to contain a
>> >     "dcterms:subject" property that describes the topics contained in
>> >     the dataset and a "void:Linkset" describing the links to other
>> >     datasets and the number of triples linking to each such dataset. The
>> >     target of any linkset must correspond to a VoID file listed in
>> >     Datahub; e.g., to link to the European Nature Information System
>> >     (EUNIS) you should link to the VoID description listed at Datahub:
>> >     http://eunis.eea.europa.eu/void.rdf.
>> >
>> >   * For now, if you have a VoID file that contains this information and
>> >     for any reason you don't want to publish it on DataHub, you can send
>> >     it to us and we will add it to our system.
>> >
>> >
>> >
>> > We are aware that there is a dataset validation tool
>> > (http://validator.lod-cloud.net/) available, but we are not currently
>> > maintaining it or using it to check the validity of a dataset. If you
>> > are unsure as to whether your Datahub record is suitable, please email us
>> > and we will check that it appears correctly in the diagram.
>> >
>> >
>> >
>> > As there have been multiple requests, we will postpone the generation
>> > of the diagram until next weekend (24.06.17).
>> >
>> >
>> >
>> >
>> >
>> > Kind regards,
>> >
>> > Andrejs Abele and John P. McCrae
>> >
>> >
>> >
>> > Unit for Natural Language Processing
>> >
>> > Insight Centre for Data Analytics
>> >
>> > National University of Ireland Galway
>> >
>> > https://nuig.insight-centre.org/unlp/people/members/andrejs-abele/
>> >
>> > http://john.mccr.ae
>> >
>> >
>> > On 14/06/17 10:45, Victor de Boer @ VU wrote:
>> >> Dear Andrejs & John,
>> >>
>> >> Indeed, we would really like to see our GTAA thesaurus
>> >> (https://datahub.io/dataset/gemeenschappelijke-thesaurus-audiovisuele-archieven)
>> >> in the cloud diagram. The validator tool gives a number of
>> >> non-compliance messages.
>> >> Missing URL. Please provide an URL for the data set.
>> >>  Missing authorship. Please provide the name of publishing org and/or
>> >> person using the CKAN field Author. It is important to know who
>> >> created this data set.
>> >>  Missing lod tag. Please tag the data set with lod.
>> >> Missing contact email. Please provide a contact email using the CKAN
>> >> field Author email or Maintainer email. It is important to know who to
>> >> contact if there are errors or missing dataset descriptions.
>> >>
>> >> However, looking at the Datahub metadata, as far as I can see, all the
>> >> required fields are there (URL, Author, links, etc)
>> >>
>> >> Any idea how we can make sure that the GTAA (and other datasets) will
>> >> appear in the new diagram?
>> >>
>> >> thanks!
>> >> --victor
>> >>
>> >> RE:
>> >> Dear Andrejs & John,
>> >>
>> >> Great that you guys are committed to this task.
>> >>
>> >> Just a remark: before a dataset is considered in the LOD cloud, it
>> >> must comply with the guidelines described at
>> >> https://www.w3.org/wiki/TaskForces/CommunityProjects/LinkingOpenData/DataSets/CKANmetainformation.
>> >> The document refers to the dataset validation tool
>> >> (http://validator.lod-cloud.net/) to figure out the "completeness
>> >> level", that is, why a given dataset published on datahub.io does or
>> >> does not comply with those guidelines.
>> >>
>> >> As I already mentioned on this list
>> >> (https://lists.w3.org/Archives/Public/public-lod/2017Feb/0001.html),
>> >> this tool seems buggy: the search form keeps returning the same page
>> >> about compliance levels, but does not give data about any specific
>> >> dataset, including those already listed.
>> >>
>> >> As a result, I would assume that existing datasets are not included in
>> >> the LOD cloud just because publishers don't know what needs to be
>> >> fixed.
>> >>
>> >> Are you aware of this issue? Do you know whom we could contact?
>> >>
>> >> Thx,
>> >>    Franck.
>> >>
>> >> On 12/06/2017 at 18:19, Andrejs Abele wrote:
>> >>>
>> >>> Hi everyone,
>> >>>
>> >>>
>> >>>
>> >>> The Linked Open Data Cloud Diagram (http://lod-cloud.net) is one of
>> >>> the most visible tools in our community and we at the Insight Centre
>> >>> for Data Analytics have committed to providing regular updates to
>> >>> this diagram.
>> >>>
>> >>>
>> >>>
>> >>> We are planning to generate the next version of the LOD cloud diagram
>> >>> at the end of this week (17.06.17).
>> >>>
>> >>>
>> >>>
>> >>> In order to help us best reflect the true state of the Linked Open
>> >>> Data Cloud, please update your resource description in DataHub.io
>> >>> (https://datahub.io) based on the guidelines below by this Friday.
>> >>>
>> >>>  * Provide tags describing your dataset
>> >>>  * Provide number of triples
>> >>>  * Provide information about links to other datasets in format:
>> >>>      o links:<resource id in DataHub>
>> >>>      o E.g., links:dbpedia
>> >>>
>> >>> For more details please see the LOD Cloud Diagram Page or the
>> >>> detailed description here:
>> >>>
>> >>> https://www.w3.org/wiki/TaskForces/CommunityProjects/LinkingOpenData/DataSets/CKANmetainformation
>> >>>
>> >>>
>> >>>
>> >>> Kind regards,
>> >>>
>> >>> Andrejs Abele and John P. McCrae
>> >>>
>> >>>
>> >>>
>> >>> Unit for Natural Language Processing
>> >>>
>> >>> Insight Centre for Data Analytics
>> >>>
>> >>> National University of Ireland Galway
>> >>>
>> >>> https://nuig.insight-centre.org/unlp/people/members/andrejs-abele/
>> >>>
>> >>> http://john.mccr.ae
>> >>
>> >
>> > --
>> > Unit for Natural Language Processing
>> > Insight Centre for Data Analytics
>> > National University of Ireland Galway
>> > https://nuig.insight-centre.org/unlp/people/members/andrejs-abele/
>> >
>>
>> Just to put this on the table. There is a "follow your nose" approach
>> that can be incorporated here that I believe can address a bunch of
>> technical and social hurdles for both dataset owners and the consumers
>> (like for the preparation of the LOD cloud).
>>
>> Have a discoverable relation to receive notifications about dataset
>> updates. You can decide on the subject and object URL:
>>
>> <http://lod-cloud.net/> <http://www.w3.org/ns/ldp#inbox>
>> <http://lod-cloud.net/inbox/> .
>>
>> Allow POST requests with JSON-LD (and other RDF syntaxes if you'd like)
>> on the Inbox URL.
>>
>> Dataset owners can send a payload indicating where to discover their
>> datasets, and provenance-level data that you'd be interested in knowing.
>> You can kindly ask what to include in the payload (e.g., on the homepage)
>> or set constraints on the Inbox etc.
>>
>> In this way, you are not bound to a 3rd party service (no account
>> creations, or information which might go stale if not updated). There is
>> also no need to have a call where people should suddenly update their
>> metadata on some third party service by the end of week. People can
>> *notify* you with "hey, I just updated my stuff over here, come and
>> check it out!" You have the benefit of also keeping an eye on what's
>> actively maintained.
>>
>> For dataset owners, notifying you can be automated in their tooling or
>> done manually with a simple curl -X POST.
>>
>> You can take the notifications, process and manage them as you like. In
>> fact, you can probably programmatically update the LOD cloud (SVG)
>> through these notifications. One other benefit here is that other
>> applications (from the community) can consume these notifications as
>> well if you are inclined to make the inbox/notifications with public
>> read access. You can serve the notifications as JSON-LD or if you allow
>> content negotiation, have other RDF serialisations.
>>
>> That would be the Linked Data Notifications [1] approach for this.
>> Compare this to the amount of manual intervention that's required of
>> everyone (publisher and consumer) via datahub.io.
>>
>> I'd love to see this sort of "Webby" notifications, discovery, reuse
>> going forward.
>>
>> If this way of working interests you and the community, let's do it.
>> Happy to help out where necessary. See also [2] for existing LDN
>> implementations whose code you might want to reuse.
>>
>> There we have it. It is all very simple.
>>
>> [1] https://www.w3.org/TR/ldn/
>> [2] https://linkedresearch.org/ldn/tests/summary
>>
>> -Sarven
>> http://csarven.ca/#i
>>
>>
>

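For reference, the Linked Data Notifications flow Sarven describes above
(discover the inbox, then POST JSON-LD to it) could be sketched in Python
roughly as follows. This is only an illustration under assumptions: the
inbox URL is the hypothetical one from his example, the payload shape (an
ActivityStreams "Update") is just one possible choice since LDN does not
prescribe a vocabulary, and the example.org URLs are placeholders. The
request is built but deliberately not sent.

```python
import json
import re
import urllib.request

LDP_INBOX = "http://www.w3.org/ns/ldp#inbox"


def discover_inbox(link_header):
    """Return the first Link target whose rel includes ldp:inbox, else None."""
    # Each link-value looks like: <target>; param="value"; ...
    for target, params in re.findall(r'<([^>]*)>((?:\s*;\s*[^,;]+)*)', link_header):
        rel = re.search(r'rel\s*=\s*"([^"]*)"', params)
        if rel and LDP_INBOX in rel.group(1).split():
            return target
    return None


def build_notification(actor, void_url):
    """Assumed payload: 'actor updated the dataset description at void_url'."""
    return {
        "@context": "https://www.w3.org/ns/activitystreams",
        "type": "Update",
        "actor": actor,
        "object": void_url,
    }


def build_request(inbox, notification):
    """Prepare (but do not send) the POST carrying the JSON-LD body."""
    body = json.dumps(notification).encode("utf-8")
    return urllib.request.Request(
        inbox,
        data=body,
        method="POST",
        headers={"Content-Type": "application/ld+json"},
    )


# Hypothetical Link header, using the inbox URL from the example above.
link = '<http://lod-cloud.net/inbox/>; rel="http://www.w3.org/ns/ldp#inbox"'
inbox = discover_inbox(link)
request = build_request(
    inbox,
    build_notification(
        actor="https://example.org/#me",         # placeholder sender
        void_url="https://example.org/void.ttl",  # placeholder VoID file
    ),
)
```

Passing `request` to `urllib.request.urlopen` would actually deliver it;
per the LDN specification a receiver that accepts the notification answers
with 201 Created.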

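The four validator messages Victor quotes above, together with the
links:<id> convention from the original call, suggest checks along the
following lines. This is a guess at the intent, not the code behind
validator.lod-cloud.net: the field names follow the usual CKAN dataset
record (url, author, author_email, maintainer_email, tags, extras),
storing links:<id> as an extra whose value is a link count is an
assumption, and the sample record is made up.

```python
def check_record(record):
    """Return the problems found in a CKAN-style dataset record (a dict)."""
    problems = []
    if not record.get("url"):
        problems.append("Missing URL")
    if not record.get("author"):
        problems.append("Missing authorship")
    # CKAN lists tags as dicts with a "name" key.
    tag_names = {tag.get("name") for tag in record.get("tags", [])}
    if "lod" not in tag_names:
        problems.append("Missing lod tag")
    if not (record.get("author_email") or record.get("maintainer_email")):
        problems.append("Missing contact email")
    return problems


def outgoing_links(record):
    """Collect links:<id> extras as {target dataset id: link count}."""
    links = {}
    for extra in record.get("extras", []):
        key = extra.get("key", "")
        if key.startswith("links:"):
            # Assumed: the extra's value holds the number of linking triples.
            links[key[len("links:"):]] = int(extra.get("value", 0))
    return links


# Made-up record for illustration only.
sample = {
    "url": "https://example.org/dataset",
    "author": "Example Org",
    "maintainer_email": "data@example.org",
    "tags": [{"name": "lod"}, {"name": "media"}],
    "extras": [
        {"key": "triples", "value": "50000"},
        {"key": "links:dbpedia", "value": "1200"},
    ],
}

problems = check_record(sample)
links = outgoing_links(sample)
```

An empty `problems` list means none of the four messages would apply to
the record, which at least narrows down whether the record or the tool is
at fault.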
-- 
Michel Dumontier
Distinguished Professor of Data Science
Maastricht University
http://dumontierlab.com

Received on Thursday, 22 June 2017 13:33:47 UTC