- From: Kingsley Idehen <kidehen@openlinksw.com>
- Date: Tue, 08 Jun 2010 10:27:08 -0400
- To: Robert Fuller <robert.fuller@deri.org>
- CC: public-lod@w3.org
Robert Fuller wrote:
> Kingsley Idehen wrote:
>
>> The LOD Cloud Cache at DERI is a live Virtuoso instance with 15
>> Billion+ Triples loaded. It covers as much of the LOD Cloud as we've
>> been able to get our hands on plus 6.4 Billion Triples from the
>> Data.Gov effort.
>>
>> I'll drop a more detailed note about this instance (via blog post)
>> once we are done with data loading (there's a massive collection of
>> eCommerce oriented Products & Services data to be loaded amongst
>> others).
>
> I wonder is this data load the culprit responsible for the "massive
> crawling"?
>

I don't understand how it can be. That said, there might be services
out there crawling the instance (as they do DBpedia), which then leads
them to the actual original data space (even though all the data is
actually in the lod.openlinksw.com instance) :-(

We'll double check to see that robots.txt is crystal clear re. crawl
paths.

--
Regards,

Kingsley Idehen
President & CEO
OpenLink Software
Web: http://www.openlinksw.com
Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca: kidehen
Received on Tuesday, 8 June 2010 14:28:06 UTC