- From: Dimitris Kontokostas <kontokostas@informatik.uni-leipzig.de>
- Date: Tue, 7 Jun 2016 12:51:32 +0300
- To: "DBpedia Discussion (ML)" <dbpedia-discussion@lists.sourceforge.net>, "DBpedia Developers (ML)" <dbpedia-developers@lists.sourceforge.net>, Linked Data community <public-lod@w3.org>, "Discussion list for the Wikidata project." <wikidata@lists.wikimedia.org>
- Message-ID: <CA+u4+a1Xpi5JUKWUF7eP27WaUPS=ONK2gmaD7gg_b+8kbyZWBQ@mail.gmail.com>
In the latest release (2015-10) DBpedia started exploring the citation and reference data from Wikipedia and we were pleasantly surprised by the rich data <http://downloads.dbpedia.org/preview.php?file=2015-10_sl_core-i18n_sl_en_sl_citation_data_en.ttl.bz2> we managed to extract.

- citation_data_en.ttl.bz2 <http://downloads.dbpedia.org/2015-10/core-i18n/en/citation_data_en.ttl.bz2> (sample <http://downloads.dbpedia.org/preview.php?file=2015-10_sl_core-i18n_sl_en_sl_citation_data_en.ttl.bz2>)
- citation_links_en.ttl.bz2 <http://downloads.dbpedia.org/2015-10/core-i18n/en/citation_links_en.ttl.bz2> (sample <http://downloads.dbpedia.org/preview.php?file=2015-10_sl_core-i18n_sl_en_sl_citation_links_en.ttl.bz2>)

This data holds huge potential, especially for the Wikidata challenge of providing a reference source for every statement. It describes not only a lot of bibliographical data, but also a lot of web pages and many other sources around the web.

The data we extract at the moment is quite raw and can be improved in many different ways. Some of the potential improvements are:

- Extend the citation extractor to handle other Wikipedia language editions <https://github.com/dbpedia/extraction-framework/issues/451>; currently only English Wikipedia is supported.
- Map the data to a relevant bibliographic ontology <https://github.com/dbpedia/mappings-tracker/issues/79> (there are many candidates and, although BIBO got the most votes, we are open to other ontologies).
- Map the data to existing bibliographic LOD (e.g. TEL has 100M records, Worldcat 300M) or online books (e.g. Google Books). See the citationIri issue <https://github.com/dbpedia/extraction-framework/issues/452>.
- Find ways to merge or fuse identical citations from multiple articles.
- Use the citation data in the Wikidata primary sources tool <https://www.wikidata.org/wiki/Wikidata:Primary_sources_tool>.
- Surprise us with your ideas!

We welcome contributions that improve the existing citation dataset in any way, and we are open to collaboration and happy to help. Results will be presented at the next DBpedia meeting on 15 September 2016 in Leipzig, co-located with SEMANTiCS 2016. Each participant should submit a short description of their contribution by Monday 12 September 2016 and present their work at the meeting.

Comments and questions can be posted on the DBpedia discussion and developer lists or on our new DBpedia ideas page <http://wiki.dbpedia.org/ideas/idea/261/dbpedia-citations-reference-challenge/>.

Submissions will be judged by the Organizing Committee and the best two will receive a prize.

Organizing Committee

- Vladimir Alexiev, Ontotext and DBpedia BG
- Anastasia Dimou, Ghent University, iMinds
- Dimitris Kontokostas, KILT/AKSW, DBpedia Association

--
Dimitris Kontokostas
Department of Computer Science, University of Leipzig & DBpedia Association
Projects: http://dbpedia.org, http://rdfunit.aksw.org, http://aligned-project.eu
Homepage: http://aksw.org/DimitrisKontokostas
Research Group: AKSW/KILT http://aksw.org/Groups/KILT

Received on Tuesday, 7 June 2016 09:52:27 UTC