Re: DBpedia hosting burden

From: Kingsley Idehen <kidehen@openlinksw.com>
Date: Thu, 15 Apr 2010 10:14:34 -0400
Message-ID: <4BC71F4A.2010601@openlinksw.com>
To: Andy Seaborne <andy.seaborne@talis.com>
CC: public-lod@w3.org, dbpedia-discussion <dbpedia-discussion@lists.sourceforge.net>

Andy Seaborne wrote:
> On 15/04/2010 2:44 PM, Kingsley Idehen wrote:
>> Andy,
>> Great stuff. This is also why we are going to leave the current DBpedia
>> 3.5 instance to stew for a while (until the end of this week or a little
>> later).
>> DBpedia users:
>> Now is the time to identify problems with the DBpedia 3.5 dataset dumps.
>> We don't want to keep reloading DBpedia (the Static Edition, followed by
>> recalibrating DBpedia-Live) because of faulty datasets; we do have
>> other operational priorities.
> "Faulty" is a bit strong.

Imperfect then, however subjective that might be :-)

> Many of the warnings are legal RDF, but bad lexical forms for the
> datatype, or IRIs that trigger some of the standard warnings (but they 
> are still legal IRIs).  Should they be included or not? Seems to me 
> you can argue both for and against.
> external_links_en.nt.bz2 is the largest source of broken IRIs.
> DBpedia is a wonderful and important dataset, and being derived from 
> elsewhere is unlikely to ever be "perfect" (for some definition of 
> "perfect").  Better to have the data than to wait for perfection.

That's been the approach thus far.

Anyway, as I said, we have a window of opportunity to identify current
issues prior to performing a 3.5.1 reload. I just want to keep the number
of reload cycles down, given the other items on our to-do list.

>     Andy



Kingsley Idehen	      
President & CEO 
OpenLink Software     
Web: http://www.openlinksw.com
Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca: kidehen 
Received on Thursday, 15 April 2010 14:15:03 UTC