Re: Updated LOD Cloud Diagram - Freebase and :baseKB

Thanks Chris,
Great stuff.
Maybe I’ll change the robots.txt - but I may need to buy more disk space for caching before I do :-), or flush the cache more aggressively when I know spidering is happening.

It is an awesome picture!!
Previously I was doubtful whether the next version would give much added value, but it really does.

Very best
Hugh


On 25 Jul 2014, at 11:12, Christian Bizer <chris@bizer.de> wrote:

> Hi Hugh,
> 
> thank you very much for your feedback :-)
> 
> Yes, your data sources and all data sources in this list
> 
> http://data.dws.informatik.uni-mannheim.de/lodcloud/2014/ISWC-RDB/tables/notCrawlableDatasets.tsv
> 
> will reappear in the final version.
> 
> Freebase is heavily interlinked from DBpedia and also gives you something
> back if you dereference their URIs, for example
> http://rdf.freebase.com/ns/m.0156q
> We will check why LDspider did not manage to retrieve data from Freebase
> (Andreas: Thank you for your explanation on the topic).
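
For readers unfamiliar with the term, dereferencing a URI here means sending an HTTP GET and asking for RDF through the Accept header. The following is a minimal sketch in Python, not a description of how LDspider works. The function names and the Accept value are assumptions made for illustration.

```python
from urllib.request import Request, urlopen

# Assumed preference order of RDF serializations; not taken from the thread.
RDF_ACCEPT = "text/turtle, application/rdf+xml;q=0.9, */*;q=0.1"


def build_rdf_request(uri: str) -> Request:
    """Prepare a GET request that asks the server for an RDF representation."""
    return Request(uri, headers={"Accept": RDF_ACCEPT})


def dereference(uri: str, timeout: float = 10.0) -> tuple[str, bytes]:
    """Fetch the URI and return (content type, body). Needs network access."""
    with urlopen(build_rdf_request(uri), timeout=timeout) as response:
        return response.headers.get("Content-Type", ""), response.read()
```

A URI would count as dereferenceable in this sense if `dereference` returns an RDF content type and a non-empty body. Whether a given server still answers is a separate question.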
> 
> Does anybody know if :baseKB is served via dereferenceable URIs and if they
> set any links pointing at other data sets?
> 
> If yes, we would love to include them in the final version of the diagram.
> 
> Cheers,
> 
> Chris
> 
> 
> -----Original Message-----
> From: Hugh Glaser [mailto:hugh@glasers.org] 
> Sent: Friday, 25 July 2014 01:07
> To: Mike Liebhold
> Cc: Christian Bizer; public-lod@w3.org
> Subject: Re: Updated LOD Cloud Diagram - Please enter your linked datasets
> into the datahub.io catalog for inclusion.
> 
> Awesome achievement, Chris and team!
> 
> Yes Mike, there is quite a lot missing from the LOD Cloud we have grown to
> know and love.
> Some of that, I understand, is because it only has stuff that allowed
> spidering (that is, robots.txt permitted it, etc.).
> (I notice this because it means everything I used to have in the LOD Cloud
> has disappeared!) However, the announcement message says that these sets
> will re-appear, so that is good.
> I don’t know if that applies to Freebase; and I think :baseKB is not there
> either, but maybe that doesn’t have any links.
> 
> I have to say that it is not clear to me that it is good practice to refer
> to this image as the current/updated "version of the LOD Cloud diagram".
> It seems that you didn’t understand the significance of this from Chris’
> message, and I suspect that you will not be alone.
> 
> Best
> Hugh
> 
> On 24 Jul 2014, at 23:39, Mike Liebhold <mnl@well.com> wrote:
> 
>> I recall earlier versions of the LOD Cloud diagram included Freebase - I
>> don't see it here, or the Google Knowledge Graph either.
>> 
>> Am I missing something?
>> 
>> ??
>> 
>> 
>> On 7/24/14, 5:18 AM, Christian Bizer wrote:
>>> Hi all,
>>> 
>>> Max Schmachtenberg, Heiko Paulheim and I have crawled the Web of Linked
>>> Data and have drawn an updated LOD Cloud diagram based on the results of
>>> the crawl.
>>> 
>>> This diagram, showing all linked datasets that our crawler managed to
>>> discover in April 2014, is found here:
>>> 
>>> http://data.dws.informatik.uni-mannheim.de/lodcloud/2014/ISWC-RDB/LODCloudDiagram.png
>>> 
>>> We also analyzed the compliance of the different datasets with the Linked
>>> Data best practices, and a paper presenting the results of the analysis is
>>> found below. The paper will appear at ISWC 2014 in the Replication,
>>> Benchmark, Data and Software Track.
>>> 
>>> http://dws.informatik.uni-mannheim.de/fileadmin/lehrstuehle/ki/pub/SchmachtenbergBizerPaulheim-AdoptionOfLinkedDataBestPractices.pdf
>>> 
>>> The raw data used for our analysis is found on this page:
>>> 
>>> http://data.dws.informatik.uni-mannheim.de/lodcloud/2014/ISWC-RDB/
>>> 
>>> Our crawler discovered 77 datasets that do not allow crawling via their
>>> robots.txt files. These datasets were not included in our analysis and
>>> are also not included in the current version of the LOD Cloud diagram.
>>> 
>>> A list of these datasets is found at
>>> http://data.dws.informatik.uni-mannheim.de/lodcloud/2014/ISWC-RDB/tables/notCrawlableDatasets.tsv
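
To see how a robots.txt file can keep a dataset out of such a crawl, here is a minimal sketch using Python's standard `urllib.robotparser`. It works on the text of a robots.txt file, so it needs no network access. The user-agent token "ldspider", the example URL and the two sample files are assumptions for illustration, not values taken from the crawl.

```python
from urllib.robotparser import RobotFileParser


def allows_crawling(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt bodies: one blocks every crawler, one allows all.
BLOCK_ALL = "User-agent: *\nDisallow: /\n"
ALLOW_ALL = "User-agent: *\nAllow: /\n"

if __name__ == "__main__":
    url = "http://example.org/resource/1"
    print(allows_crawling(BLOCK_ALL, "ldspider", url))
    print(allows_crawling(ALLOW_ALL, "ldspider", url))
```

A publisher who wants to be crawled but is worried about load could also disallow only the expensive paths rather than the whole site.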
>>> 
>>> In order to give a comprehensive overview of all Linked Data sets that are
>>> currently online, we would like to draw another version of the LOD Cloud
>>> diagram including the datasets that our crawler has missed as well as the
>>> datasets that do not allow crawling.
>>> 
>>> Thus, if you publish or know about linked datasets that are not yet in the
>>> diagram or in the list of non-crawlable datasets, please:
>>> 
>>> 1.       Enter them into the datahub.io data catalog by August 8th.
>>> 2.       Tag them in the catalog with the tag ‘lod’
>>>          (http://datahub.io/dataset?tags=lod)
>>> 3.       Send an email to Max and Chris pointing us at the entry in the
>>>          catalog.
>>> 
>>> We will include in the updated version of the cloud diagram all datasets
>>> that fulfill the following requirements:
>>> 
>>> 1.       Data items are accessible via dereferenceable URIs.
>>> 2.       The dataset sets at least 50 RDF links pointing at other datasets,
>>>          or at least one other dataset sets 50 RDF links pointing at your
>>>          dataset.
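
The two requirements above can be read as a simple predicate. The sketch below is one possible reading, not the organisers' actual procedure. It assumes that outgoing links are counted in total across all target datasets, while incoming links must reach the threshold from a single source dataset. The function name and the dictionary inputs are hypothetical.

```python
def qualifies(
    dereferenceable: bool,
    outgoing_links: dict[str, int],
    incoming_links: dict[str, int],
    threshold: int = 50,
) -> bool:
    """Check the two stated inclusion requirements for one dataset.

    outgoing_links maps target dataset name -> number of RDF links set by us.
    incoming_links maps source dataset name -> number of RDF links set to us.
    """
    if not dereferenceable:
        return False
    # Assumption: the 50 outgoing links may be spread over several datasets.
    sets_enough = sum(outgoing_links.values()) >= threshold
    # Assumption: a single other dataset must contribute the 50 incoming links.
    receives_enough = any(n >= threshold for n in incoming_links.values())
    return sets_enough or receives_enough
```

For example, a dataset with no outgoing links still qualifies under this reading if one other dataset sets 50 or more links to it.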
>>> 
>>> Instructions on how to describe your dataset in the catalog are found
>>> here:
>>> 
>>> https://www.w3.org/wiki/TaskForces/CommunityProjects/LinkingOpenData/DataSets/CKANmetainformation
>>> 
>>> Please make sure that you include information about the RDF links pointing
>>> from your dataset into other datasets (field links: ) as well as a tag
>>> indicating the topical category of your dataset, so that we know how to
>>> include it in the diagram.
>>> Please also include an example URI from your dataset in the catalog.
>>> 
>>> We will start to review the new datasets and to draw the updated version
>>> of the LOD cloud diagram after August 8th.
>>> So please point us at datasets to be included before this date.
>>> 
>>> Cheers,
>>> 
>>> Max, Heiko, and Chris
>>> 
>>> 
>>> --
>>> Prof. Dr. Christian Bizer
>>> Data and Web Science Research Group
>>> Universität Mannheim, Germany
>>> chris@informatik.uni-mannheim.de
>>> www.bizer.de
>>> 
>> 
>> 
> 
> --
> Hugh Glaser
>   20 Portchester Rise
>   Eastleigh
>   SO50 4QS
> Mobile: +44 75 9533 4155, Home: +44 23 8061 5652

Received on Friday, 25 July 2014 16:07:06 UTC