
Re: DBpedia hosting burden

From: Kingsley Idehen <kidehen@openlinksw.com>
Date: Thu, 15 Apr 2010 10:32:13 -0400
Message-ID: <4BC7236D.5060005@openlinksw.com>
To: public-lod@w3.org
CC: dbpedia-discussion <dbpedia-discussion@lists.sourceforge.net>
Kingsley Idehen wrote:
> Andy Seaborne wrote:
>>
>>
>> On 15/04/2010 2:44 PM, Kingsley Idehen wrote:
>>> Andy,
>>>
>>> Great stuff, this is also why we are going to leave the current DBpedia
>>> 3.5 instance to stew for a while (until end of this week or a little
>>> later).
>>>
>>> DBpedia users:
>>> Now is the time to identify problems with the DBpedia 3.5 dataset dumps.
>>> We don't want to continue reloading DBpedia (Static Edition and then
>>> recalibrating DBpedia-Live) based on faulty datasets related matters, we
>>> do have other operational priorities etc..
>>
>> "Faulty" is a bit strong.
>
> Imperfect then, however subjective that might be :-)
>>
>> Many of the warnings are legal RDF, but bad lexical forms for the datatype,
>> or IRIs that trigger some of the standard warnings (but they are still
>> legal IRIs).  Should they be included or not? Seems to me you can argue
>> both for and against.
>>
>> external_links_en.nt.bz2  is the largest source of broken IRIs.
>>
>> DBpedia is a wonderful and important dataset, and being derived from 
>> elsewhere is unlikely to ever be "perfect" (for some definition of 
>> "perfect").  Better to have the data than to wait for perfection.
> That's been the approach thus far.
>


Actually meant to say:


Anyway, as I said, we have a window of opportunity to identify current 
issues prior to performing a 3.5.1 reload. ** I want to reduce the 
reload cycles due to other items on our todo list, etc. **

:-)
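
For anyone who wants to spot-check a dump such as external_links_en.nt.bz2 
before the reload, here is a rough sketch of one possible check. It is not 
the tool that produced the warnings discussed above. The function names are 
made up for illustration, and the set of "suspect" characters is an 
assumption taken from the IRIREF production of the current N-Triples 
grammar. The line handling is heuristic (regular expressions, not a real 
parser), so treat hits as candidates to review rather than confirmed errors.

```python
import bz2
import re

# Characters the N-Triples IRIREF production excludes from an IRI:
# controls and space (0x00-0x20), " { } | ^ ` and a backslash that
# does not start a \u or \U escape. ('<' and '>' cannot be seen here
# because the IRI_TOKEN pattern below stops at the first '>'.)
SUSPECT_CHARS = re.compile(r'[\x00-\x20"{}|^`]|\\(?![uU])')

# Heuristic patterns: a quoted literal, and anything between '<' and '>'.
LITERAL = re.compile(r'"(?:[^"\\]|\\.)*"')
IRI_TOKEN = re.compile(r'<([^>]*)>')


def suspect_iris(line):
    """Return the IRIs on one N-Triples line that contain suspect characters."""
    # Blank out literals first so '<...>' inside a string is not mistaken
    # for an IRI.
    without_literals = LITERAL.sub('""', line)
    return [iri for iri in IRI_TOKEN.findall(without_literals)
            if SUSPECT_CHARS.search(iri)]


def scan_dump(path, limit=20):
    """Yield (line_number, iri) for the first `limit` suspect IRIs in a .bz2 dump."""
    found = 0
    with bz2.open(path, "rt", encoding="utf-8", errors="replace") as dump:
        for line_number, line in enumerate(dump, start=1):
            for iri in suspect_iris(line):
                yield line_number, iri
                found += 1
                if found >= limit:
                    return
```

Usage would be along the lines of 
`for n, iri in scan_dump("external_links_en.nt.bz2"): print(n, iri)`, 
assuming the dump file is in the current directory.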

-- 

Regards,

Kingsley Idehen	      
President & CEO 
OpenLink Software     
Web: http://www.openlinksw.com
Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca: kidehen 
Received on Thursday, 15 April 2010 14:32:42 UTC
