Re: DBpedia hosting burden from Dan Brickley on 2010-04-15 (public-lod@w3.org from April 2010)

From: Dan Brickley <danbri@danbri.org>
Date: Thu, 15 Apr 2010 22:34:55 +0200
To: Kingsley Idehen <kidehen@openlinksw.com>
Cc: Ian Davis <lists@iandavis.com>, public-lod <public-lod@w3.org>, dbpedia-discussion <dbpedia-discussion@lists.sourceforge.net>
Message-ID: <o2neb19f3361004151334sf965edf0y8e4db2dab36c525a@mail.gmail.com>

On Thu, Apr 15, 2010 at 9:57 PM, Kingsley Idehen <kidehen@openlinksw.com> wrote:
> Ian Davis wrote:
>
> When you use the term: SPARQL Mirror (note: Leigh's comments yesterday re.
> not orienting towards this), you open up a different set of issues. I don't
> want to revisit the SPARQL and SPARQL extensions debate etc., esp. as Virtuoso's
> SPARQL extensions are an integral part of what makes the DBpedia SPARQL
> endpoint viable, amongst other things.

Having the same dataset available via different implementations of
SPARQL can only be healthy. If certain extensions are necessary, this
will only highlight their importance. If there are public services
offering SPARQL-based access to the DBpedia datasets (or subsets) out
there on the Web, it would be rather useful if we could have them
linked from a single easy-to-find page, along with information about
any restrictions, quirks, subsetting, or value-adding features special
to that service. I suggest using a section in
http://en.wikipedia.org/wiki/DBpedia for this, unless someone cares to
handle that on dbpedia.org.

> The burden issue is basically veering away from the key points, which are:
>
> 1. Use the DBpedia instance properly
> 2. When the instance enforces restrictions, understand that this is a
> Virtuoso *feature* not a bug or server shortcoming.

Yes, the showcase implementation needs to be used properly if it is
going to survive the increasing developer attention LOD is getting. It
is perfectly reasonable of you to make clear that when there are limits,
they are for everyone's benefit.

> Beyond the dbpedia.org instance, there are other locations for:
>
> 1. Data Sets
> 2. SPARQL endpoints (like yours and a few others, where functionality
> mirroring isn't an expectation).

Is there a list somewhere of related SPARQL endpoints? (also other
Wikipedia-derived datasets in RDF)

> Descriptor Resource handling via mirrors, BitTorrents, Reverse Proxies,
> Cache directives, and some 303 heuristics etc. are the real issues of
> interest.

(am chatting with Daniel Koller in Skype now re the BitTorrent experiments...)

> Note: I can send wild SPARQL CONSTRUCTs, DESCRIBES, and HTTP GETs for
> Resource Descriptors to a zillion mirrors (maybe next year's April Fool's
> joke re. beauty of Linked Data crawling) and it will only broaden the
> scope of my dysfunctional behavior. The behavior itself has to be handled
> (one or a zillion mirrors).

Sure. But on balance, more mirrors rather than fewer should benefit
everyone, particularly if 'good behaviour' is documented and
enforced...

> Anyway, we will publish our guide for working with DBpedia very soon. I
> believe this will add immense clarity to this matter.

Great!

cheers,

Dan

Received on Thursday, 15 April 2010 20:35:30 UTC