- From: Kingsley Idehen <kidehen@openlinksw.com>
- Date: Thu, 15 Apr 2010 10:14:34 -0400
- To: Andy Seaborne <andy.seaborne@talis.com>
- CC: public-lod@w3.org, dbpedia-discussion <dbpedia-discussion@lists.sourceforge.net>
Andy Seaborne wrote:
>
> On 15/04/2010 2:44 PM, Kingsley Idehen wrote:
>> Andy,
>>
>> Great stuff, this is also why we are going to leave the current DBpedia
>> 3.5 instance to stew for a while (until end of this week or a little
>> later).
>>
>> DBpedia users:
>> Now is the time to identify problems with the DBpedia 3.5 dataset dumps.
>> We don't want to continue reloading DBpedia (Static Edition and then
>> recalibrating DBpedia-Live) based on faulty datasets related matters, we
>> do have other operational priorities etc..
>
> "Faulty" is a bit strong.

Imperfect then, however subjective that might be :-)

>
> Many of the warnings are legal RDF, but bad lexical forms for the
> datatype, or IRIs that trigger some of the standard warnings (but they
> are still legal IRIs). Should they be included or not? Seems to me
> you can argue both for and against.
>
> external_links_en.nt.bz2 is the largest source of broken IRIs.
>
> DBpedia is a wonderful and important dataset, and being derived from
> elsewhere is unlikely to ever be "perfect" (for some definition of
> "perfect"). Better to have the data than to wait for perfection.

That's been the approach thus far.

Anyway, as I said, we have a window of opportunity to identify current
issues prior to performing a 3.5.1 reload. I just don't want to reduce
the reload cycles due to other items on our todo etc..

>
> Andy
>

--

Regards,

Kingsley Idehen
President & CEO
OpenLink Software
Web: http://www.openlinksw.com
Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca: kidehen
Received on Thursday, 15 April 2010 14:15:03 UTC