Re: DBpedia hosting burden from Kingsley Idehen on 2010-04-14 (public-lod@w3.org from April 2010)

From: Kingsley Idehen <kidehen@openlinksw.com>
Date: Wed, 14 Apr 2010 14:41:34 -0400
To: Ross Singer <rossfsinger@gmail.com>
CC: public-lod <public-lod@w3.org>, dbpedia-discussion <dbpedia-discussion@lists.sourceforge.net>
Message-ID: <4BC60C5E.4060600@openlinksw.com>

Ross Singer wrote:
> On Wed, Apr 14, 2010 at 1:58 PM, Dan Brickley <danbri@danbri.org> wrote:
>   
>> (trimming cc: list to LOD and DBPedia)
>>     
>
> Using Dan's trimmed list to continue...
>   
>> On Wed, Apr 14, 2010 at 7:09 PM, Kingsley Idehen <kidehen@openlinksw.com> wrote:
>>     
>>> When people aren't crawling, they are executing CONSTRUCTs or DESCRIBEs
>>> via SPARQL, which is still ultimately Export from DBpedia and Import to
>>> my data space mindset.
>>>       
>
> Is this necessarily true?  Couldn't the CONSTRUCT and/or DESCRIBE
> queries be used to find resources and view the whole graph (or
> specialized subsets) to determine if it's actually what is being
> sought?
>   

I meant: they are sending a series of these query patterns with the same 
goal in mind: an export from DBpedia for import into their own Data Spaces.
> Is it better for DBpedia to do SELECTs and then retrieve the resource
> URIs individually?
>   
You can, and should, use the full gamut of SPARQL queries; the issue is 
how they are used.
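
To make the distinction concrete, here is a minimal sketch (an 
illustration only, not a description of any DBpedia policy) that builds 
request URLs for two narrowly scoped query patterns: a DESCRIBE of a 
single resource, and one bounded page of a SELECT. The helper names, the 
default page size, and the example class URI are all assumptions made 
for this example; only the "query" parameter comes from the SPARQL 
protocol. Nothing is sent over the network.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Public DBpedia SPARQL endpoint, used here purely for illustration.
ENDPOINT = "http://dbpedia.org/sparql"


def describe_url(resource_uri, endpoint=ENDPOINT):
    """Build a GET URL that DESCRIBEs one resource (hypothetical helper)."""
    query = "DESCRIBE <%s>" % resource_uri
    return endpoint + "?" + urlencode({"query": query})


def select_page_url(class_uri, limit=100, offset=0, endpoint=ENDPOINT):
    """Build a GET URL for one bounded page of instance URIs (hypothetical helper)."""
    query = (
        "SELECT DISTINCT ?s WHERE { ?s a <%s> } "
        "ORDER BY ?s LIMIT %d OFFSET %d" % (class_uri, limit, offset)
    )
    return endpoint + "?" + urlencode({"query": query})


def query_of(url):
    """Recover the decoded SPARQL text from a URL built above."""
    return parse_qs(urlsplit(url).query)["query"][0]


if __name__ == "__main__":
    print(describe_url("http://dbpedia.org/resource/Berlin"))
    print(query_of(select_page_url("http://dbpedia.org/ontology/City", limit=50)))
```

Either pattern is fine for looking something up. Looping the same 
requests over every resource in the data set is the export-and-import 
usage described above.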

On our side, we've always had the ability to protect the server. 
Recently, we have simply upped the ante on protection against 
problematic behavior.

My only concern is that the tightening of control is sometimes 
misconstrued as a problem with the instance itself.

> I suppose rather than assume that the data is all being exported into
> another space (which, I would think, is definitely happening -- having
> data locally aids tremendously in indexing, for example) it could be a
> case of people just using SPARQL the way it seems that SPARQL should
> work?
>   
Hence the onus is on us to make a smart server, which we've had since 
day one. Again, the issue is: when the server protects itself, the 
behavior is being misconstrued as an instance problem.

If you make a local instance of Virtuoso + DBpedia, you will see what I 
mean, and basically it would come down to what Nathan explained in this 
recent post [1]. Key excerpt:

"...The public lod and dbpedia endpoints really do no justice as to just 
how powerful and fast Virtuoso is, queries which take a few seconds on 
the public endpoint return in hundredths of a second on my local (low 
spec) server..."

Links:

1. http://webr3.org/blog/experiments/linked-data-extractor-prototype-details/

Kingsley
> -Ross.
>
>   


-- 

Regards,

Kingsley Idehen	      
President & CEO 
OpenLink Software     
Web: http://www.openlinksw.com
Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca: kidehen

Received on Wednesday, 14 April 2010 18:42:03 UTC