Re: Please report bugs to be fixed for the DBpedia 3.5.1 release

From: Kingsley Idehen <kidehen@openlinksw.com>
Date: Thu, 15 Apr 2010 10:56:50 -0400
Message-ID: <4BC72932.8020109@openlinksw.com>
To: Chris Bizer <chris@bizer.de>
CC: public-lod@w3.org, 'dbpedia-discussion' <dbpedia-discussion@lists.sourceforge.net>, 'Andy Seaborne' <andy.seaborne@talis.com>
Chris Bizer wrote:
> Hi all,
>
>   
>> Great stuff, this is also why we are going to leave the current DBpedia
>> 3.5 instance to stew for a while (until end of this week or a little later).
>>
>> DBpedia users:
>> Now is the time to identify problems with the DBpedia 3.5 dataset dumps.
>> We don't want to continue reloading DBpedia (Static Edition and then
>> recalibrating DBpedia-Live) because of faulty datasets; we do have
>> other operational priorities.
>>     
>
> Yes, the testing by the community has exposed enough small and medium bugs in the datasets that we are going to extract a new, fixed 3.5.1 release next week.
>
> In my opinion the bugs do not diminish Robert's and Anja's great achievement of porting the extraction framework from PHP to Scala.

Oh! Certainly not!

That is a major contribution.
> If you rewrite more than 10,000 lines of code for something as complex as a multilingual Wikipedia extraction, I think it is normal that some minor bugs remain even after their tough testing.
>   

Of course.
> So, if you have discovered additional bugs and want them fixed, please report them to the DBpedia bug tracker by Friday EOB.
>
> http://sourceforge.net/tracker/?group_id=190976
>   

Yes, and then we can schedule a reload such that 3.5.1 is live come 
Monday (maybe even earlier).


Kingsley
>
> Cheers,
>
> Chris
>  
>
>   
>> -----Original Message-----
>> From: public-lod-request@w3.org [mailto:public-lod-request@w3.org] On Behalf
>> Of Kingsley Idehen
>> Sent: Thursday, 15 April 2010 15:44
>> To: Andy Seaborne
>> Cc: public-lod@w3.org; dbpedia-discussion
>> Subject: Re: DBpedia hosting burden
>>
>> Andy Seaborne wrote:
>>     
>>> I ran the files from
>>> http://www.openjena.org/~afs/DBPedia35-parse-log-2010-04-15.txt
>>> through an N-Triples parser with checking:
>>>
>>> The report is here (it's 25K lines long):
>>>
>>> http://www.openjena.org/~afs/DBPedia35-parse-log-2010-04-15.txt
>>>
>>> It covers both strict errors and warnings of ill-advised forms.
>>>
>>> A few examples:
>>>
>>> Bad IRI: <=?(''[[Nepenthes>
>>> Bad IRI: <http://www.european-athletics.orgâ€>
>>>
>>> Bad lexical forms for the value space:
>>> "1967-02-31"^^http://www.w3.org/2001/XMLSchema#date
>>> (there is no February the 31st)
>>>
>>>
>>> Warning of well known ports of other protocols:
>>> http://stream1.securenetsystems.net:443
>>>
>>> Warning about explicit port 80:
>>>
>>> http://bibliotecadigitalhispanica.bne.es:80/
>>>
>>> and use of . and .. in absolute URIs which are all from the standard
>>> list of IRI warnings.
>>>
>>> Bad IRI: <http://dbpedia.org/resource/..> Code:
>>> 8/NON_INITIAL_DOT_SEGMENT in PATH: The path contains a segment /../
>>> not at the beginning of a relative reference, or it contains a /./
>>> These should be removed.
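
The port and dot-segment warnings quoted above can be approximated with a few lines of Python. This is only a rough sketch using the standard library, not the Jena IRI checker used for the report. The helper name `iri_warnings`, the `DEFAULT_PORTS` table, and the warning texts are hypothetical and cover just the three warning classes from the examples.

```python
from urllib.parse import urlsplit

# Assumed table of default ports for a few schemes (illustrative, not exhaustive).
DEFAULT_PORTS = {"http": 80, "https": 443, "ftp": 21}


def iri_warnings(iri: str) -> list:
    """Return warning strings for one absolute IRI (hypothetical helper)."""
    warnings = []
    parts = urlsplit(iri)
    scheme = parts.scheme.lower()

    try:
        port = parts.port  # raises ValueError for a non-numeric or out-of-range port
    except ValueError:
        return ["bad port"]

    if port is not None:
        if DEFAULT_PORTS.get(scheme) == port:
            # e.g. http://host:80/ -- the port is redundant
            warnings.append(f"explicit default port {port}")
        elif port in DEFAULT_PORTS.values():
            # e.g. http://host:443 -- port normally used by another protocol
            warnings.append(f"port {port} is the well-known port of another protocol")

    # Dot segments should already have been removed from an absolute IRI.
    segments = parts.path.split("/")
    if "." in segments or ".." in segments:
        warnings.append("path contains a . or .. segment")

    return warnings
```

Calling `iri_warnings` on each subject, predicate, and object IRI of a dump would flag the same kinds of entries as the examples above, although a real IRI checker covers many more cases.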
>>>
>>>     Andy
>>>
>>> Software used:
>>>
>>> The IRI checker, by Jeremy Carroll, is available from
>>> http://www.openjena.org/iri/ and Maven.
>>>
>>> The lexical form checking is done by Apache Xerces.
>>>
>>> The N-triples parser is the one from TDB v0.8.5 which bundles the
>>> above two together.
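
For readers without that toolchain, the lexical-form check on xsd:date values can be sketched in plain Python. This is an assumption-laden approximation, not what Apache Xerces does internally: the pattern `XSD_DATE` and the helper `is_valid_xsd_date` are hypothetical, and negative years and years with more than four digits are deliberately ignored.

```python
import re
from datetime import date

# Simplified xsd:date lexical pattern: YYYY-MM-DD plus an optional timezone.
# Assumption: negative years and years longer than four digits are not handled.
XSD_DATE = re.compile(r"([0-9]{4})-([0-9]{2})-([0-9]{2})(Z|[+-][0-9]{2}:[0-9]{2})?")


def is_valid_xsd_date(lexical: str) -> bool:
    """Check the shape of the literal and that the calendar day exists."""
    match = XSD_DATE.fullmatch(lexical)
    if match is None:
        return False
    year, month, day = (int(group) for group in match.groups()[:3])
    try:
        date(year, month, day)  # raises ValueError for days such as 31 February
    except ValueError:
        return False
    return True
```

With this sketch, `is_valid_xsd_date("1967-02-31")` is rejected because the shape is right but the day does not exist, which is the case reported above.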
>>>
>>>
>>> On 15/04/2010 9:54 AM, Malte Kiesel wrote:
>>>       
>>>> Ivan Mikhailov wrote:
>>>>
>>>>         
>>>>> If I were The Emperor of LOD I'd ask all grand dukes of datasources to
>>>>> put fresh dumps at some torrent with control of UL/DL ratio :)
>>>>>           
>>>> Last time I checked (which was quite a while ago though), loading
>>>> DBpedia in a normal triple store such as Jena TDB didn't work very well
>>>> due to many issues with the DBpedia RDF (e.g., problems with the URIs of
>>>> external links scraped from Wikipedia).
>>>>
>>>> I don't know whether this is a bug in TDB or DBpedia but I guess this is
>>>> one of the problems causing people to use DBpedia online only - even if,
>>>> due to performance reasons, running it locally would be far better.
>>>>
>>>> Regards
>>>> Malte
>>>>
>>>>         
>>>       
>> Andy,
>>
>> Great stuff, this is also why we are going to leave the current DBpedia
>> 3.5 instance to stew for a while (until end of this week or a little later).
>>
>> DBpedia users:
>> Now is the time to identify problems with the DBpedia 3.5 dataset dumps.
>> We don't want to continue reloading DBpedia (Static Edition and then
>> recalibrating DBpedia-Live) because of faulty datasets; we do have
>> other operational priorities.
>>
>>
>> --
>>
>> Regards,
>>
>> Kingsley Idehen
>> President & CEO
>> OpenLink Software
>> Web: http://www.openlinksw.com
>> Weblog: http://www.openlinksw.com/blog/~kidehen
>> Twitter/Identi.ca: kidehen
>>


-- 

Regards,

Kingsley Idehen	      
President & CEO 
OpenLink Software     
Web: http://www.openlinksw.com
Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca: kidehen 
Received on Thursday, 15 April 2010 14:57:19 UTC
