Re: ANN: DBpedia Version 2015-10 released

From: Dimitris Kontokostas <jimkont@gmail.com>
Date: Tue, 5 Apr 2016 15:23:18 +0300
Message-ID: <CA+u4+a2LT46KXZrHw3AMQR9WL+uRWjm4KDZ391bktanVh60mXw@mail.gmail.com>
To: Michael Brunnbauer <brunni@netestate.de>
Cc: "semantic-web@w3.org" <semantic-web@w3.org>
Hi Michael,

On Tue, Apr 5, 2016 at 3:08 PM, Michael Brunnbauer <brunni@netestate.de> wrote:

> Hello Dimitris,
> I got DBpedia 2015-04 from
>  http://downloads.dbpedia.org/2015-04/core/
> It seems that
>  http://downloads.dbpedia.org/2015-10/core/
> also contains a reasonable and current subset of DBpedia. Is this correct?

Yes, there is always a large overlap with the previous release.
The differences between releases come from the Wikipedia changes we
capture and from the post-processing / quality checks we introduce.

> What is the difference to
>  http://downloads.dbpedia.org/2015-10/core-i18n/en/

core is the data we host in the public SPARQL / LDF endpoints.
Compared to core-i18n/en, we additionally include textual data (labels,
comments) in other languages and external links, and we exclude some
datasets from core-i18n/en that (we think) are not useful to all users
(e.g. page-links).

> As a result of your switch to ttl, there is now a mixture of .nt and .ttl
> files in
>  http://downloads.dbpedia.org/2015-10/core/

You are right, the external links are still in N-Triples (.nt) format. We
will make all formats uniform in the next release.

> but the .ttl files do not seem to use full Turtle syntax at first glance.

Yes, we do not use any Turtle abbreviations and write one full triple per
line (N-Triples-like).

> Can they be parsed as N-Triple? I ask because I usually had to fix some
> syntax errors before Jena would parse your N-Triples files.

This version should have those syntax errors fixed. Let us know if there is
still a problem.
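
A rough way to check whether a file really is in the one-triple-per-line form before handing it to an N-Triples parser might look like the sketch below. It is only an approximation: the regular expressions are a simplified stand-in for the N-Triples grammar, the function name non_ntriples_lines is hypothetical, and the sample lines are made up for illustration rather than taken from the dumps.

```python
import re

# Simplified building blocks; this is NOT the full N-Triples grammar.
IRI = r'<[^<>\s]*>'
BNODE = r'_:[A-Za-z0-9_][A-Za-z0-9_.-]*'
LITERAL = (
    r'"(?:[^"\\\n\r]|\\.)*"'                      # quoted string with escapes
    r'(?:@[A-Za-z]+(?:-[A-Za-z0-9]+)*|\^\^' + IRI + r')?'  # optional lang tag or datatype
)

# One statement per line: subject, predicate, object, final dot, optional comment.
TRIPLE = re.compile(
    rf'^\s*(?:{IRI}|{BNODE})\s+{IRI}\s+(?:{IRI}|{BNODE}|{LITERAL})\s*\.\s*(?:#.*)?$'
)


def non_ntriples_lines(lines):
    """Return (line_number, text) pairs for lines that do not look like N-Triples."""
    problems = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\n")
        stripped = text.strip()
        if not stripped or stripped.startswith("#"):
            continue  # blank lines and comments are allowed
        if not TRIPLE.match(text):
            problems.append((number, text))
    return problems


# Illustrative input: one full triple, then two lines using Turtle abbreviations.
sample = [
    '<http://dbpedia.org/resource/Berlin> <http://www.w3.org/2000/01/rdf-schema#label> "Berlin"@en .',
    '@prefix dbo: <http://dbpedia.org/ontology/> .',
    'dbr:Berlin dbo:country dbr:Germany .',
]

for number, text in non_ntriples_lines(sample):
    print(f"line {number} is not N-Triples-like: {text}")
```

With this sample, the second and third lines are reported because they rely on a prefix declaration and prefixed names. To scan a real file, an open file object can be passed in place of the list, since the function only iterates over lines. A file that passes this check can still be rejected by a strict parser such as Jena, so it is a quick pre-filter rather than a validator.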


Kontokostas Dimitris
Received on Tuesday, 5 April 2016 12:24:11 UTC