Re: Please update your resource in the LOD Cloud Diagram

On 2017-06-16 13:05, Andrejs Abele wrote:
> Hi everyone,
> 
> We want to address some of the comments and  suggestions mentioned in
> this thread.
> 
>  
> 
> While we are keen to use metadata from resource publishers wherever
> possible, we still require that your resource is listed in Datahub as we
> do not have the capability to crawl the Web looking for VoID
> descriptions at this moment.
> 
>   * In DataHub there is an option to upload VoID descriptions. Please
>     upload here and we will attempt to extract the metadata from the
>     VoID file.
> 
>   * For a VoID file to be useful, it would have to contain a
>     "dcterms:subject" property that describes the topics contained in
>     the dataset and a "void:Linkset" describing links to other datasets
>     and the number of triples linking to said dataset. The target of any
>     linkset must correspond to a VoID file listed in Datahub; e.g., to
>     link to European Nature Information System (EUNIS) you should link
>     to the VoID description listed at Datahub:
>     http://eunis.eea.europa.eu/void.rdf.
> 
>   * For now, if you have a VoID file that contains this information, and
>     for any reason you don't want to publish it on DataHub, you can send
>     it to us and we will add it to our system.
> 
>  
> 
> We are aware that there is a dataset validation tool
> (http://validator.lod-cloud.net/) available, but we are not currently
> maintaining it or using it to check the validity of a dataset. If you
> are unsure as to whether your Datahub record is suitable, please email us
> and we will check that it appears correctly in the diagram.
> 
>  
> 
> As there have been multiple requests, we will postpone the  generation
> of the diagram till next weekend (24.06.17)
> 
>  
> 
>  
> 
> Kind regards,
> 
> Andrejs Abele and John P. McCrae
> 
>  
> 
> Unit for Natural Language Processing
> 
> Insight Centre for Data Analytics
> 
> National University of Ireland Galway
> 
> https://nuig.insight-centre.org/unlp/people/members/andrejs-abele/
> 
> http://john.mccr.ae
> 
> 
> On 14/06/17 10:45, Victor de Boer @ VU wrote:
>> Dear Andrejs & John,
>>
>> Indeed, we would really like to see our GTAA thesaurus
>> (https://datahub.io/dataset/gemeenschappelijke-thesaurus-audiovisuele-archieven)
>> in the cloud diagram. The validator tool gives a  number of
>> non-compliance messages. 
>> Missing URL. Please provide an URL for the data set.
>>  Missing authorship. Please provide the name of publishing org and/or
>> person using the CKAN field Author. It is important to know who
>> created this data set.
>>  Missing lod tag. Please tag the data set with lod.
>> Missing contact email. Please provide a contact email using the CKAN
>> field Author email or Maintainer email. It is important to know who to
>> contact if there are errors or missing dataset descriptions.
>>
>> However, looking at the Datahub metadata, as far as I can see, all the
>> required fields are there (URL, Author, links, etc) 
>>
>> Any idea how we can make sure that the GTAA (and other datasets) will
>> appear in the new diagram?
>>
>> thanks!
>> --victor
>>
>> RE:
>> Dear Andrejs & John,
>>
>> Great that you guys are committed in this task.
>>
>> Just a remark: before a dataset is considered in the LOD cloud, it
>> must comply with the guidelines described
>> at https://www.w3.org/wiki/TaskForces/CommunityProjects/LinkingOpenData/DataSets/CKANmetainformation
>> <https://www.w3.org/wiki/TaskForces/CommunityProjects/LinkingOpenData/DataSets/CKANmetainformation>.
>> The document refers to the dataset validation tool
>> (http://validator.lod-cloud.net/ <http://validator.lod-cloud.net/>) to
>> figure out the "completeness level", that is why a given dataset
>> published on datahub.io <http://datahub.io/> does or does not comply
>> with those guidelines.
>>
>> As I already mentioned on this list
>> (https://lists.w3.org/Archives/Public/public-lod/2017Feb/0001.html
>> <https://lists.w3.org/Archives/Public/public-lod/2017Feb/0001.html>),
>> this tool seems buggy: the search form keeps returning the same page
>> about compliance levels, but does not give data about any specific
>> dataset, including those already listed.
>>
>> As a result, I would assume that existing datasets are not included in
>> the LOD cloud just because publishers don't know what needs to be fixed.
>>
>> Are you aware of this issue? Do you know whom we could contact?
>>
>> Thx,
>>    Franck.
>>
>> Le 12/06/2017 à 18:19, Andrejs Abele a écrit :
>>>
>>> Hi everyone,
>>>
>>>  
>>>
>>> The Linked Open Data Cloud Diagram (http://lod-cloud.net
>>> <http://lod-cloud.net/>) is one of the most visible tools in our
>>> community and we at the Insight Centre for Data Analytics have
>>> committed to providing regular updates to this diagram.
>>>
>>>  
>>>
>>> We are planning to generate the next version of the LOD cloud diagram
>>> at the end of this week (17.06.17).
>>>
>>>  
>>>
>>> In order to help us best reflect the true state of the Linked Open
>>> Data Cloud, please update your resource description in DataHub.io
>>> (https://datahub.io <https://datahub.io/>) based on guidelines below
>>> by this Friday.
>>>
>>>  * Provide tags describing your dataset
>>>
>>>  * Provide number of triples
>>>
>>>  * Provide information about links to other datasets in format:
>>>
>>>      o links:<resource id in DataHub>
>>>
>>>      o E.g., links:dbpedia
>>>
>>> For more details please see the LOD Cloud Diagram Page or the
>>> detailed description here:
>>>
>>> https://www.w3.org/wiki/TaskForces/CommunityProjects/LinkingOpenData/DataSets/CKANmetainformation
>>> <https://www.w3.org/wiki/TaskForces/CommunityProjects/LinkingOpenData/DataSets/CKANmetainformation>
>>>
>>>  
>>>
>>> Kind regards,
>>>
>>> Andrejs Abele and John P. McCrae
>>>
>>>  
>>>
>>> Unit for Natural Language Processing
>>>
>>> Insight Centre for Data Analytics
>>>
>>> National University of Ireland Galway
>>>
>>> https://nuig.insight-centre.org/unlp/people/members/andrejs-abele/
>>> <https://nuig.insight-centre.org/unlp/people/members/andrejs-abele/>
>>>
>>> http://john.mccr.ae <http://john.mccr.ae/>
>>
> 
> -- 
> Unit for Natural Language Processing
> Insight Centre for Data Analytics
> National University of Ireland Galway
> https://nuig.insight-centre.org/unlp/people/members/andrejs-abele/
> 

Just to put this on the table. There is a "follow your nose" approach
that can be incorporated here that I believe can address a bunch of
technical and social hurdles for both dataset owners and the consumers
(like for the preparation of the LOD cloud).

Advertise a discoverable relation for receiving notifications about
dataset updates. You can decide on the subject and object URLs, for
example:

<http://lod-cloud.net/> <http://www.w3.org/ns/ldp#inbox>
<http://lod-cloud.net/inbox/> .
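
A sender first has to discover the Inbox. LDN lets the target advertise
it either in its RDF body (as in the triple above) or in an HTTP Link
header. Below is a minimal sketch of the Link header case in Python.
The helper name find_inbox_in_link_header is hypothetical, and the
regex-based parsing is deliberately simplified; it is not a complete
Link header parser.

```python
import re
from typing import Optional

LDP_INBOX = "http://www.w3.org/ns/ldp#inbox"


def find_inbox_in_link_header(link_header: str) -> Optional[str]:
    """Return the first link target whose rel includes ldp#inbox, else None.

    Simplified: assumes each link-value looks like <url>; rel="..." and
    that parameter values contain no '<' characters.
    """
    # Split the header into (<url>, trailing parameters) pairs.
    for match in re.finditer(r"<([^>]*)>([^<]*)", link_header):
        url, params = match.group(1), match.group(2)
        rel = re.search(r'\brel\s*=\s*"?([^";,]*)"?', params)
        # A rel value may hold several space-separated relation types.
        if rel and LDP_INBOX in rel.group(1).split():
            return url
    return None


if __name__ == "__main__":
    header = (
        '<http://www.w3.org/ns/ldp#Resource>; rel="type", '
        '<http://lod-cloud.net/inbox/>; rel="http://www.w3.org/ns/ldp#inbox"'
    )
    print(find_inbox_in_link_header(header))
```

In practice the header value would come from a HEAD or GET request on
the target URL (here, http://lod-cloud.net/).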

Allow POST requests with JSON-LD (and other RDF syntaxes if you'd like)
on the Inbox URL.

Dataset owners can send a payload indicating where to discover their
datasets, along with any provenance-level data that you'd be interested
in knowing. You can kindly ask what to include in the payload (e.g., on
the homepage) or set constraints on the Inbox, etc.
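
To make the payload idea concrete, here is a rough sketch of what a
dataset owner might build, in Python. LDN does not mandate a
vocabulary; the use of ActivityStreams 2.0 ("Update", "actor",
"object", "published") is my assumption, and the function name and all
example.org URLs are hypothetical placeholders.

```python
import json
from datetime import datetime, timezone
from typing import Optional


def build_notification(dataset_url: str, void_url: str, actor_url: str,
                       when: Optional[datetime] = None) -> dict:
    """Build a JSON-LD notification saying that a dataset was updated.

    `when` should be a timezone-aware datetime; it defaults to now (UTC).
    """
    when = (when or datetime.now(timezone.utc)).astimezone(timezone.utc)
    return {
        "@context": "https://www.w3.org/ns/activitystreams",
        "type": "Update",
        "actor": actor_url,
        # Where the dataset lives and where its VoID description is.
        "object": {"id": dataset_url, "url": void_url},
        "published": when.strftime("%Y-%m-%dT%H:%M:%SZ"),
    }


if __name__ == "__main__":
    payload = build_notification(
        "https://example.org/dataset",    # hypothetical dataset URL
        "https://example.org/void.ttl",   # hypothetical VoID description
        "https://example.org/#me",        # hypothetical publisher
    )
    print(json.dumps(payload, indent=2))
```

The receiver is free to ask for a different shape; the only point is
that the payload carries the URLs needed to go and look.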

In this way, you are not bound to a third-party service (no account
creation, and no information that might go stale if not updated). There
is also no need for a call asking people to suddenly update their
metadata on some third-party service by the end of the week. People can
*notify* you with "hey, I just updated my stuff over here, come and
check it out!" You also have the benefit of keeping an eye on what's
actively maintained.

For dataset owners, notifying you can be automated in their tooling or
done manually with a simple curl -X POST.
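
For the automated case, a sketch of the same POST using only the Python
standard library. The function names are hypothetical, and the Inbox
URL is the example one from above, which does not exist yet. Per LDN, a
receiver answers a successful POST with 201 Created (plus a Location
header) or 202 Accepted.

```python
import json
import urllib.request
from typing import Optional, Tuple

JSON_LD = "application/ld+json"


def build_request(inbox_url: str, notification: dict) -> urllib.request.Request:
    """Prepare (but do not send) a POST carrying a JSON-LD notification."""
    body = json.dumps(notification).encode("utf-8")
    return urllib.request.Request(
        inbox_url,
        data=body,
        headers={"Content-Type": JSON_LD},
        method="POST",
    )


def send(request: urllib.request.Request,
         timeout: float = 10.0) -> Tuple[int, Optional[str]]:
    """Send the request; return (status code, Location header or None).

    urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, response.headers.get("Location")


if __name__ == "__main__":
    request = build_request(
        "http://lod-cloud.net/inbox/",  # example Inbox from this message
        {"@context": "https://www.w3.org/ns/activitystreams", "type": "Update"},
    )
    print(request.get_method(), request.full_url)
    # send(request) would perform the actual network call.
```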

You can take the notifications, process and manage them as you like. In
fact, you can probably programmatically update the LOD cloud (SVG)
through these notifications. Another benefit is that other applications
(from the community) can consume these notifications as well, if you
choose to give the inbox and its notifications public read access. You
can serve the notifications as JSON-LD or, if you support content
negotiation, in other RDF serialisations as well.
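
On the consuming side, a GET on the Inbox returns a listing whose
ldp:contains values are the notification URLs. A small sketch in Python
of pulling those URLs out of a JSON-LD listing follows. It is only an
approximation: it handles a few common compacted shapes by hand rather
than doing real JSON-LD processing, and the helper names are
hypothetical.

```python
import json
from typing import Any, List

# Forms the ldp:contains key may take, depending on the JSON-LD context.
CONTAINS_KEYS = ("http://www.w3.org/ns/ldp#contains", "ldp:contains", "contains")


def _as_ids(value: Any) -> List[str]:
    """Flatten a JSON-LD value (string, node object, or list) into IRIs."""
    if isinstance(value, list):
        ids: List[str] = []
        for item in value:
            ids.extend(_as_ids(item))
        return ids
    if isinstance(value, dict):
        ident = value.get("@id") or value.get("id")
        return [ident] if ident else []
    if isinstance(value, str):
        return [value]
    return []


def notification_urls(inbox_json: str) -> List[str]:
    """Return the notification URLs listed in an Inbox document."""
    doc = json.loads(inbox_json)
    nodes = doc if isinstance(doc, list) else [doc]
    urls: List[str] = []
    for node in nodes:
        if not isinstance(node, dict):
            continue
        for key in CONTAINS_KEYS:
            if key in node:
                urls.extend(_as_ids(node[key]))
    return urls


if __name__ == "__main__":
    listing = json.dumps({
        "@context": "http://www.w3.org/ns/ldp",
        "@id": "http://lod-cloud.net/inbox/",
        "contains": ["http://lod-cloud.net/inbox/1",
                     {"@id": "http://lod-cloud.net/inbox/2"}],
    })
    print(notification_urls(listing))
```

Each returned URL can then be fetched to read the notification itself.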

That would be the Linked Data Notifications [1] approach for this.
Compare this to the amount of manual intervention that's required of
everyone (publisher and consumer) via datahub.io.

I'd love to see this sort of "Webby" notifications, discovery, reuse
going forward.

If this way of working interests you and the community, let's do it.
Happy to help out where necessary. See also [2] for existing LDN
implementations whose code you might want to reuse.

There we have it. It is all very simple.

[1] https://www.w3.org/TR/ldn/
[2] https://linkedresearch.org/ldn/tests/summary

-Sarven
http://csarven.ca/#i

Received on Friday, 16 June 2017 12:00:25 UTC