Re: Updated LOD Cloud Diagram - Please enter your linked datasets into the datahub.io catalog for inclusion.

On 2014-07-24 14:18, Christian Bizer wrote:
> Hi all,
>
> Max Schmachtenberg, Heiko Paulheim and I have crawled the Web of
> Linked Data and have drawn an updated LOD Cloud diagram based on the
> results of the crawl.
>
> This diagram showing all linked datasets that our crawler managed to
> discover in April 2014 is found here:
>
> http://data.dws.informatik.uni-mannheim.de/lodcloud/2014/ISWC-RDB/LODCloudDiagram.png
>
> We also analyzed the compliance of the different datasets with the
> Linked Data best practices and a paper presenting the results of the
> analysis is found below. The paper will appear at ISWC 2014 in the
> Replication, Benchmark, Data and Software Track.
>
> http://dws.informatik.uni-mannheim.de/fileadmin/lehrstuehle/ki/pub/SchmachtenbergBizerPaulheim-AdoptionOfLinkedDataBestPractices.pdf
>
> The raw data used for our analysis is found on this page:
>
> http://data.dws.informatik.uni-mannheim.de/lodcloud/2014/ISWC-RDB/
>
> Our crawler discovered 77 datasets that do not allow crawling via their
> robots.txt files. These datasets were not included in our analysis
> and are also not included in the current version of the LOD Cloud diagram.
>
> A list of these datasets is found at
> http://data.dws.informatik.uni-mannheim.de/lodcloud/2014/ISWC-RDB/tables/notCrawlableDatasets.tsv
>
> In order to give a comprehensive overview of all Linked Data sets that
> are currently online, we would like to draw another version of the LOD
> Cloud diagram including the datasets that our crawler has missed as well
> as the datasets that do not allow crawling.
>
> Thus, if you publish or know about linked datasets that are not in the
> diagram or in the list of not crawlable datasets yet, please:
>
> 1. Enter them into the datahub.io data catalog by August 8th.
>
> 2. Tag them in the catalog with the tag ‘lod’
> (http://datahub.io/dataset?tags=lod)
>
> 3. Send an email to Max and Chris pointing us at the entry in the catalog.
>
> We will include in the updated version of the cloud diagram all
> datasets that fulfill the following requirements:
>
> 1. Data items are accessible via dereferenceable URIs.
>
> 2. The dataset sets at least 50 RDF links pointing at other datasets, or
> at least one other dataset sets 50 RDF links pointing at your dataset.
>
> Instructions on how to describe your dataset in the catalog are found here:
>
> https://www.w3.org/wiki/TaskForces/CommunityProjects/LinkingOpenData/DataSets/CKANmetainformation
>
> Please make sure that you include information about the RDF links
> pointing from your dataset into other datasets (field links: ) as well
> as a tag indicating the topical category of your dataset, so that we
> know how to include it in the diagram.
>
> Please also include an example URI from your dataset in the catalog.
>
> We will start to review the new datasets and to draw the updated version
> of the LOD cloud diagram after August 8th.
>
> So please point us at datasets to be included before this date.
>
> Cheers,
>
> Max, Heiko, and Chris
>
> --
>
> Prof. Dr. Christian Bizer
>
> Data and Web Science Research Group
>
> Universität Mannheim, Germany
> chris@informatik.uni-mannheim.de
>
> www.bizer.de
>

Thank you Chris, Max, Heiko, Andreas, Tobias, et al.,

I find that a diagram based on the crawlable LOD better captures the
essence of "LOD" than alternative methods, e.g., curation based on
catalog groups.

What this diagram reveals is that the LOD landscape is very dynamic and,
dare I say, not so pretty. I suspect that people are going to update
their presentations and articles based on what comes out of this
effort. And since the LOD Cloud diagrams of the past helped tremendously
to put a face on the L(O)D effort, I humbly suggest that the next
version of this diagram should give a bit more attention to its
presentation:

* The visualisation should speak for itself: "This is LOD crawled". The
problem with this diagram is exactly that it falls short of this: it
sticks to the rules of the previous diagrams while trying to
communicate something completely different.

* Consider: Are node sizes relevant? Arc density? Which domains should
be captured? Must all nodes be labeled (cut-off point)? Should the
clusters be based on their linkage as opposed to their domain?

* Provide it in SVG, legible at ~640px width in portrait (people will
want to put it in (nearly) fixed views: slides, papers). As a user of
the LOD diagram, I don't want people to squint at the "magnificence" of
LOD and not see anything more than "dbpedia" and a bunch of circles in
pastel colours.

* Use more of the available rectangular space. It doesn't have to be a 
perfectly shaped ellipse.

In summary: the visualisation should be created from scratch.


Now, having said all that, can I have my dataspaces in cornflower blue?

-Sarven
http://csarven.ca/#i

Received on Friday, 25 July 2014 08:00:45 UTC