Re: DBpedia hosting burden

From: Kingsley Idehen <kidehen@openlinksw.com>
Date: Thu, 15 Apr 2010 16:59:56 -0400
Message-ID: <4BC77E4C.1070103@openlinksw.com>
To: Dan Brickley <danbri@danbri.org>
CC: Ian Davis <lists@iandavis.com>, public-lod <public-lod@w3.org>, dbpedia-discussion <dbpedia-discussion@lists.sourceforge.net>
Dan Brickley wrote:
> On Thu, Apr 15, 2010 at 9:57 PM, Kingsley Idehen <kidehen@openlinksw.com> wrote:
>> Ian Davis wrote:
>> When you use the term: SPARQL Mirror (note: Leigh's comments yesterday re.
>> not orienting towards this), you open up a different set of issues. I don't
>> want to revisit the SPARQL and SPARQL extensions debate etc., esp. as Virtuoso's
>> SPARQL extensions are an integral part of what makes the DBpedia SPARQL
>> endpoint viable, amongst other things.
> Having the same dataset available via different implementations of
> SPARQL can only be healthy. If certain extensions are necessary, this
> will only highlight their importance. If there are public services
> offering SPARQL-based access to the DBpedia datasets (or subsets) out
> there on the Web, it would be rather useful if we could have them
> linked from a single easy to find page, along with information about
> any restrictions, quirks, subsetting, or value-adding features special
> to that service. I suggest using a section in
> http://en.wikipedia.org/wiki/DBpedia for this, unless someone cares to
> handle that on dbpedia.org.

>> The burden issue is basically veering away from the key points, which are:
>> 1. Use the DBpedia instance properly
>> 2. When the instance enforces restrictions, understand that this is a
>> Virtuoso *feature* not a bug or server shortcoming.
> Yes, the showcase implementation needs to be used properly if it is
> going to survive the increasing developer attention LOD is getting. It
> is perfectly reasonable of you to make clear when there are limits
> they are for everyone's benefit.

Yep, and as promised we will publish a document; this is certainly a
missing piece of the puzzle right now.
>> Beyond the dbpedia.org instance, there are other locations for:
>> 1. Data Sets
>> 2. SPARQL endpoints (like yours and a few others, where functionality
>> mirroring isn't an expectation).
> Is there a list somewhere of related SPARQL endpoints? (also other
> Wikipedia-derived datasets in RDF)

See: http://delicious.com/kidehen/sparql_endpoint, that's how I track 
SPARQL endpoints, at the current time.

>> Descriptor Resource handling via mirrors, BitTorrents, Reverse Proxies,
>> Cache directives, and some 303 heuristics etc. are the real issues of
>> interest.
> (am chatting with Daniel Koller in Skype now re the BitTorrent experiments...)

Yes, seeing progress.
>> Note: I can send wild SPARQL CONSTRUCTs, DESCRIBES, and HTTP GETs for
>> Resource Descriptors to a zillion mirrors (maybe next year's April Fool's
>> joke re. beauty of Linked Data crawling) and it will only broaden the
>> scope of my dysfunctional behavior. The behavior itself has to be handled
>> (one or a zillion mirrors).
> Sure. But on balance, more mirrors rather than fewer should benefit
> everyone, particularly if 'good behaviour' is documented and
> enforced...

Yes, LinkedData DNS remains a personal aspiration of mine, but no matter 
what we build, enforcement needs to be understood as a *feature* rather 
than a bug or deficiency, etc.
>> Anyway, we will publish our guide for working with DBpedia very soon. I
>> believe this will add immense clarity to this matter.
> Great!
> cheers,
> Dan



Kingsley Idehen	      
President & CEO 
OpenLink Software     
Web: http://www.openlinksw.com
Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca: kidehen 
Received on Thursday, 15 April 2010 21:00:27 UTC
