- From: Kingsley Idehen <kidehen@openlinksw.com>
- Date: Wed, 14 Apr 2010 09:01:09 -0400
- To: Leigh Dodds <leigh.dodds@talis.com>
- CC: Ivan Mikhailov <imikhailov@openlinksw.com>, baran <baran@goldmail.de>, semanticweb <semanticweb@yahoogroups.com>, public-lod <public-lod@w3.org>, SW-forum <semantic-web@w3.org>, dbpedia-discussion <dbpedia-discussion@lists.sourceforge.net>, dbpedia-announcements <dbpedia-announcements@lists.sourceforge.net>, Chris Bizer <chris@bizer.de>
Leigh Dodds wrote:
> Hi,
>
> 2010/4/14 Ivan Mikhailov <imikhailov@openlinksw.com>:
>
>> Similarly, growing database size and growing hit rate and growing
>> complexity of queries are not obviously visible from outside, but turn
>> the hosting into a race. We're improving the underlaying RDBMS as fast
>> as we only can just to prevent the service from total halt. One might
>> wish to provide a better service on their own RDBMS and thus to make a
>> good advertisement, but nobody else want to do that _and_ can do that,
>> so we're alone under this load.
>>
>
> Out of interest, do you actually share any metrics on usage levels,
> common sparql queries, etc?
>
> We have a copy of the dbpedia data loaded into the Talis Platform, but
> its not yet up to date with 3.5. So there's more than one option
> already. Although the service characteristics/features are different
> (different software)
>
> Cheers,
>
> L.
>
>

Leigh,

When we refer to an "option" we are talking about a mirror rather than an alternative place where DBpedia data sets have been loaded.

As for usage levels, the issues have very little to do with sane SPARQL queries and everything to do with crawlers that actually attempt to perform wholesale imports of the entire data set (many attempt this, as we can see from the HTTP logs and the payload sizes). In addition, remember, we are serving up actual RDF based descriptor resources, and these too are crawled wholesale with the intent of populating other data spaces (these are also crawled aggressively via LOD and non LOD crawlers).

We are not just providing a SPARQL endpoint; we are also serving RDF descriptor resources in a variety of representation formats. And as I've stated above, the dominant use pattern is crawling the RDF descriptor resources, which (without protection) simply obliterates "across the wire bandwidth", as is the case with any document server on a public network such as the World Wide Web.

If you want to offer a mirror (i.e. one that mirrors what we are offering) then simply let us know, and we can then spell out what that entails etc.

--

Regards,

Kingsley Idehen
President & CEO
OpenLink Software
Web: http://www.openlinksw.com
Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca: kidehen
Received on Wednesday, 14 April 2010 13:01:52 UTC