- From: Svensson, Lars <L.Svensson@dnb.de>
- Date: Thu, 6 Jul 2017 07:24:43 +0000
- To: Markus Freudenberg <markus.freudenberg@gmail.com>
- CC: "public-lod@w3.org" <public-lod@w3.org>
Hello Markus,

On Tuesday, July 04, 2017 10:33 PM, Markus Freudenberg [mailto:markus.freudenberg@gmail.com] wrote:

> This release took us longer than expected. We had to deal with multiple issues and
> included new data. Most notable is the addition of the NIF annotation datasets for each
> language, recording the whole wiki text, its basic structure (sections, titles, paragraphs,
> etc.) and the included text links. We hope that researchers and developers working on
> NLP-related tasks will find this addition most rewarding. The DBpedia Open Text
> Extraction Challenge (next deadline Mon 17 July for SEMANTiCS 2017) was introduced
> to instigate new fact extraction based on these datasets.
>
> We want to thank everyone who has contributed to this release by adding mappings,
> new datasets, extractors or issue reports, helping us to increase the coverage and
> correctness of the released data, and the European Commission and the ALIGNED H2020
> project for funding and general support.
>
> This release is based on updated Wikipedia dumps dating from October 2016.
>
> You can download the new DBpedia datasets in N3 / Turtle serialisation from
> http://wiki.dbpedia.org/downloads-2016-10 or directly here:
> http://downloads.dbpedia.org/2016-10/.

Impressive work, thanks for making this available! Some minor questions and comments:

[..]

> • We added a new extractor for citation data that provides two files:
>   • citation links: linking resources to citations
>   • citation data: trying to get additional data from citations. This is quite an
>     interesting dataset, but we need help to clean it up.

These are really interesting data! Is there a chance you could provide those files not only for the English Wikipedia but also for other languages (e.g. German)?

[...]

> Credits to [...]
> • Ruben Verborgh from Ghent University – imec for publishing the dataset as Triple
>   Pattern Fragments, and imec for sponsoring DBpedia’s Triple Pattern Fragments
>   server.
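As an aside, for anyone wanting a quick local look at the citation-links file: the DBpedia dumps use line-based triple serialisations, so a few lines of standard-library Python suffice for the simple all-IRI case. A minimal sketch — the sample lines and the `citationLink` predicate URI below are illustrative assumptions, not the exact vocabulary of the released file:

```python
import re

# Illustrative N-Triples lines in the shape of the citation-links dataset;
# the predicate URI here is an assumption, not the one used in the dump.
sample = """\
<http://dbpedia.org/resource/Berlin> <http://dbpedia.org/property/citationLink> <http://example.org/cite/1> .
<http://dbpedia.org/resource/Berlin> <http://dbpedia.org/property/citationLink> <http://example.org/cite/2> .
"""

# Matches simple triples where subject, predicate and object are all IRIs.
NT_LINE = re.compile(r'<([^>]+)>\s+<([^>]+)>\s+<([^>]+)>\s+\.')

def parse_nt(text):
    """Yield (subject, predicate, object) tuples for simple all-IRI triples."""
    for line in text.splitlines():
        m = NT_LINE.match(line)
        if m:
            yield m.groups()

triples = list(parse_nt(sample))
print(len(triples))  # 2
```

For the real dumps (with literals, language tags, etc.) a proper RDF library such as rdflib would of course be the better choice; the regex above only covers the all-IRI pattern.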
Do you see a possibility of publishing the new 2016-10 dataset on http://fragments.dbpedia.org/, too? Or, even better, of setting up a TPF server with live data, similar to the SPARQL endpoint?

Thanks,

Lars

*** Lesen. Hören. Wissen. Deutsche Nationalbibliothek ***

--
Dr. Lars G. Svensson
Deutsche Nationalbibliothek
Informationsinfrastruktur
Adickesallee 1
60322 Frankfurt am Main
Telefon: +49 69 1525-1752
Telefax: +49 69 1525-1799
mailto:l.svensson@dnb.de
http://www.dnb.de
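P.S. On the TPF question above, for context: a Triple Pattern Fragments server exposes exactly one triple pattern per HTTP request, passed as `subject`, `predicate` and `object` query parameters, which is why publishing a new dataset version there amounts to hosting it under a new path. A minimal sketch of building such a request URL — the `2016-10` dataset path is hypothetical until such a dataset is actually published on fragments.dbpedia.org:

```python
from urllib.parse import urlencode

# Hypothetical base URL for a 2016-10 dataset on DBpedia's TPF server;
# at the time of writing, no such path has been announced.
BASE = "http://fragments.dbpedia.org/2016-10/en"

def fragment_url(subject=None, predicate=None, obj=None):
    """Build a TPF request URL for one triple pattern (None = variable)."""
    params = {}
    if subject is not None:
        params["subject"] = subject
    if predicate is not None:
        params["predicate"] = predicate
    if obj is not None:
        params["object"] = obj
    return BASE + ("?" + urlencode(params) if params else "")

url = fragment_url(subject="http://dbpedia.org/resource/Berlin")
print(url)
```

A client would then GET this URL with an RDF `Accept` header (e.g. `text/turtle`) and page through the results via the hydra paging controls included in each fragment.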
Received on Thursday, 6 July 2017 07:25:19 UTC