- From: Hugh Glaser <hugh@glasers.org>
- Date: Fri, 25 Jul 2014 17:05:42 +0100
- To: Christian Bizer <chris@bizer.de>
- Cc: Mike Liebhold <mnl@well.com>, public-lod@w3.org
Thanks Chris,

Great stuff.
Maybe I’ll change the robots.txt - but I may need to buy more disk space for caching before I do :-), or flush the cache more aggressively when I know spidering is happening.

It is an awesome picture!!
Previously I was doubtful whether the next version would give much added value, but it really does.

Very best
Hugh

On 25 Jul 2014, at 11:12, Christian Bizer <chris@bizer.de> wrote:

> Hi Hugh,
>
> thank you very much for your feedback :-)
>
> Yes, your data sources and all data sources in this list
>
> http://data.dws.informatik.uni-mannheim.de/lodcloud/2014/ISWC-RDB/tables/notCrawlableDatasets.tsv
>
> will reappear in the final version.
>
> Freebase is heavily interlinked from DBpedia and also gives you something back if you dereference their URIs like http://rdf.freebase.com/ns/m.0156q
> We will check why LDspider did not manage to retrieve data from Freebase (Andreas: Thank you for your explanation on the topic)
>
> Does anybody know if :baseKB is served via dereferenceable URIs and if they set any links pointing at other data sets?
>
> If yes, we would love to include them in the final version of the diagram.
>
> Cheers,
>
> Chris
>
>
> -----Original Message-----
> From: Hugh Glaser [mailto:hugh@glasers.org]
> Sent: Friday, 25 July 2014 01:07
> To: Mike Liebhold
> Cc: Christian Bizer; public-lod@w3.org
> Subject: Re: Updated LOD Cloud Diagram - Please enter your linked datasets into the datahub.io catalog for inclusion.
>
> Awesome achievement, Chris and team!
>
> Yes Mike, there is quite a lot missing from the LOD Cloud we have grown to know and love.
> Some of that is, I understand, because it says it only has stuff that allowed spidering (that is, robots.txt permitted it, etc.).
> (I notice this because it means everything I used to have in the LOD Cloud has disappeared!) However, the announcement message says that these sets will re-appear, so that is good.
> I don’t know if that applies to Freebase; and I think :baseKB is not there either, but maybe that doesn’t have any links.
>
> I have to say that it is not clear to me that it is good practice to refer to this image as the current/updated "version of the LOD Cloud diagram".
> It seems that you didn’t understand the significance of this from Chris’ message, and I suspect that you will not be alone.
>
> Best
> Hugh
>
> On 24 Jul 2014, at 23:39, Mike Liebhold <mnl@well.com> wrote:
>
>> I recall earlier versions of the LOD Cloud diagram included Freebase - I don't see it here - or the Google Knowledge Graph either.
>>
>> am I missing something?
>>
>> ??
>>
>>
>> On 7/24/14, 5:18 AM, Christian Bizer wrote:
>>> Hi all,
>>>
>>> Max Schmachtenberg, Heiko Paulheim and I have crawled the Web of Linked Data and have drawn an updated LOD Cloud diagram based on the results of the crawl.
>>>
>>> This diagram showing all linked datasets that our crawler managed to discover in April 2014 is found here:
>>>
>>> http://data.dws.informatik.uni-mannheim.de/lodcloud/2014/ISWC-RDB/LODCloudDiagram.png
>>>
>>> We also analyzed the compliance of the different datasets with the Linked Data best practices and a paper presenting the results of the analysis is found below. The paper will appear at ISWC 2014 in the Replication, Benchmark, Data and Software Track.
>>>
>>> http://dws.informatik.uni-mannheim.de/fileadmin/lehrstuehle/ki/pub/SchmachtenbergBizerPaulheim-AdoptionOfLinkedDataBestPractices.pdf
>>>
>>> The raw data used for our analysis is found on this page:
>>>
>>> http://data.dws.informatik.uni-mannheim.de/lodcloud/2014/ISWC-RDB/
>>>
>>> Our crawler did discover 77 datasets that do not allow crawling via their robots.txt files and these datasets were not included in our analysis and are also not included in the current version of the LOD Cloud diagram.
>>>
>>> A list of these datasets is found at
>>> http://data.dws.informatik.uni-mannheim.de/lodcloud/2014/ISWC-RDB/tables/notCrawlableDatasets.tsv
>>>
>>> In order to give a comprehensive overview of all Linked Data sets that are currently online, we would like to draw another version of the LOD Cloud diagram including the datasets that our crawler has missed as well as the datasets that do not allow crawling.
>>>
>>> Thus, if you publish or know about linked datasets that are not in the diagram or in the list of not crawlable datasets yet, please:
>>>
>>> 1. Enter them into the datahub.io data catalog by August 8th.
>>> 2. Tag them in the catalog with the tag ‘lod’ (http://datahub.io/dataset?tags=lod)
>>> 3. Send an email to Max and Chris pointing us at the entry in the catalog.
>>>
>>> We will include in the updated version of the cloud diagram all datasets that fulfill the following requirements:
>>>
>>> 1. Data items are accessible via dereferenceable URIs.
>>> 2. The dataset sets at least 50 RDF links pointing at other datasets or at least one other dataset is setting 50 RDF links pointing at your dataset.
>>>
>>> Instructions on how to describe your dataset in the catalog are found here:
>>>
>>> https://www.w3.org/wiki/TaskForces/CommunityProjects/LinkingOpenData/DataSets/CKANmetainformation
>>>
>>> Please make sure that you include information about the RDF links pointing from your dataset into other datasets (field links: ) as well as a tag indicating the topical category of your dataset, so that we know how to include it in the diagram.
>>> Please also include an example URI from your dataset in the catalog.
>>>
>>> We will start to review the new datasets and to draw the updated version of the LOD cloud diagram after August 8th.
>>> So please point us at datasets to be included before this date.
>>>
>>> Cheers,
>>>
>>> Max, Heiko, and Chris
>>>
>>>
>>> --
>>> Prof. Dr. Christian Bizer
>>> Data and Web Science Research Group
>>> Universität Mannheim, Germany
>>> chris@informatik.uni-mannheim.de
>>> www.bizer.de
>>>
>>
>>
>
> --
> Hugh Glaser
> 20 Portchester Rise
> Eastleigh
> SO50 4QS
> Mobile: +44 75 9533 4155, Home: +44 23 8061 5652
>
>

--
Hugh Glaser
20 Portchester Rise
Eastleigh
SO50 4QS
Mobile: +44 75 9533 4155, Home: +44 23 8061 5652
Received on Friday, 25 July 2014 16:07:06 UTC