- From: Kingsley Idehen <kidehen@openlinksw.com>
- Date: Sat, 15 Oct 2016 11:43:48 -0400
- To: "public-lod@w3.org" <public-lod@w3.org>
- Cc: 'W3C Web Schemas Task Force' <public-vocabs@w3.org>, business-of-linked-data-bold <business-of-linked-data-bold@googlegroups.com>, Virtuoso-users <Virtuoso-users@lists.sourceforge.net>, bio2rdf <bio2rdf@googlegroups.com>
- Message-ID: <a67cfc5f-ef04-d387-0a5a-9a1996d46dc1@openlinksw.com>
FYI

On 10/15/16 9:14 AM, Markus Freudenberg wrote:
>
> Hereby we announce the release of DBpedia 2016-04. The new release is
> based on updated Wikipedia dumps dating from March/April 2016,
> featuring a significantly expanded base of information as well as
> richer and (hopefully) cleaner data based on the DBpedia ontology.
>
> You can download the new DBpedia datasets in a variety of RDF-document
> formats from: http://wiki.dbpedia.org/downloads-2016-04 or directly
> here: http://downloads.dbpedia.org/2016-04/
>
>
> Support DBpedia
>
> During the latest DBpedia meeting in Leipzig we discussed ways
> to support DBpedia <http://blog.dbpedia.org/?p=210> and what benefits
> this support would bring
> <http://wiki.dbpedia.org/why-is-dbpedia-so-important>. For the next
> two months, we are aiming to raise money to support the hosting of the
> main services and the next DBpedia release (especially to shorten
> release intervals). On top of that, we need to buy a new server to host
> DBpedia Spotlight, which has so far been generously hosted by third
> parties. If you use DBpedia and want us to keep going forward, we
> kindly invite you to donate here <http://wiki.dbpedia.org/donate> or
> become a member of the DBpedia association
> <http://wiki.dbpedia.org/membership>.
>
>
> Statistics
>
> The English version of the DBpedia knowledge base currently describes
> 6.0M entities of which 4.6M have abstracts, 1.53M have geo coordinates
> and 1.6M have depictions. In total, 5.2M resources are classified in a
> consistent ontology, consisting of 1.5M persons, 810K places
> (including 505K populated places), 490K works (including 135K music
> albums, 106K films and 20K video games), 275K organizations (including
> 67K companies and 53K educational institutions), 301K species and 5K
> diseases.
> The total number of resources in English DBpedia is 16.9M
> which, besides the 6.0M resources, includes 1.7M SKOS concepts
> (categories), 7.3M redirect pages, 260K disambiguation pages and 1.7M
> intermediate nodes.
>
> Altogether the DBpedia 2016-04 release consists of 9.5 billion
> (2015-10: 8.8 billion) pieces of information (RDF triples) out of
> which 1.3 billion (2015-10: 1.1 billion) were extracted from the
> English edition of Wikipedia, 5.0 billion (2015-04: 4.4 billion) were
> extracted from other language editions and 3.2 billion (2015-10: 3.2
> billion) from DBpedia Commons and Wikidata. In general, we observed a
> growth in mapping-based statements of about 2%.
>
> Thorough statistics can be found on the DBpedia website
> <http://wiki.dbpedia.org/dbpedia-2016-04-statisticsdatasets/dataset-2015-10/dataset-2015-10-statistics> and
> general information on the DBpedia datasets here
> <http://wiki.dbpedia.org/services-resources/datasets/dbpedia-datasets>.
>
>
> Community
>
> The DBpedia community added new classes and properties to the DBpedia
> ontology via the mappings wiki. The DBpedia 2016-04 ontology encompasses:
>
>   * 754 classes (DBpedia 2015-10: 739)
>   * 1,103 object properties (DBpedia 2015-10: 1,099)
>   * 1,608 datatype properties (DBpedia 2015-10: 1,596)
>   * 132 specialized datatype properties (DBpedia 2015-10: 132)
>   * 410 owl:equivalentClass and 221 owl:equivalentProperty mappings
>     to external vocabularies (DBpedia 2015-04: 407 - 221)
>
> The editor community of the mappings wiki also defined many new
> mappings from Wikipedia templates to DBpedia classes. For the DBpedia
> 2016-04 extraction, we used a total of 5800 template mappings (DBpedia
> 2015-10: 5553 mappings). For the second time the top language, gauged
> by the number of mappings, is Dutch (646 mappings), followed by the
> English community (604 mappings).
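
As a quick sanity check on the triple counts quoted in the Statistics section, the three per-source figures should add up to the overall total. The sketch below uses only the numbers from the announcement; the variable names are illustrative and not part of any DBpedia tooling.

```python
# Triple counts in billions, as quoted in the announcement.
counts_2016_04 = {
    "english_wikipedia": 1.3,
    "other_language_editions": 5.0,
    "commons_and_wikidata": 3.2,
}
total_2016_04 = 9.5
total_2015_10 = 8.8

# Sum of the per-source counts, rounded to the precision used in the text.
summed = round(sum(counts_2016_04.values()), 1)

# Relative growth of the overall total between the two releases, in percent.
growth_pct = round((total_2016_04 - total_2015_10) / total_2015_10 * 100, 1)

print(f"sum of parts: {summed} billion, growth: {growth_pct}%")
```

The parts match the stated 9.5 billion total. The overall growth figure is derived here and is not stated in the announcement, which only quotes about 2% growth for mapping-based statements.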
>
>
> (Breaking) Changes
>
>   * In addition to the datasets normalized to English DBpedia (en-uris),
>     we now provide normalized datasets based on the DBpedia
>     Wikidata (DBw) datasets (wkd-uris). These sorted datasets will be
>     the foundation for the upcoming fusion process with Wikidata. The
>     DBw-based URIs will be the only ones provided from the following
>     releases on.
>
>   * We now filter out triples from the Raw Infobox Extractor that are
>     already mapped. E.g. no more “<x> dbo:birthPlace <z>” and “<x>
>     dbp:birthPlace|dbp:placeOfBirth|... <z>” in the same resource.
>     These triples are now moved to the “infobox-properties-mapped”
>     datasets and not loaded on the main endpoint. See issue 22
>     <https://github.com/dbpedia/extraction-framework/issues/22> for
>     more details.
>
>   * Major improvements in our citation extraction. See here
>     <http://www.mail-archive.com/dbpedia-discussion@lists.sourceforge.net/msg07762.html> for
>     more details.
>
>   * We incorporated the statistical distribution approach
>     <http://www.heikopaulheim.com/docs/iswc2013.pdf> of Heiko Paulheim
>     to create type statements automatically and provide them as an
>     additional dataset (instance_types_sdtyped_dbo).
>
>
> In case you missed it, what we changed in the previous release (2015-10)
>
>   * English DBpedia switched to IRIs. This can be a breaking change to
>     some applications that need to change their stored DBpedia
>     resource URIs / links. We provide the “uri-same-as-iri” dataset
>     for English to ease the transition.
>
>   * The instance-types dataset is now split into two files:
>     instance-types (containing only direct types) and
>     instance-types-transitive (containing the transitive types of a
>     resource based on the DBpedia ontology).
>
>   * The mappingbased-properties file is now split into three (3) files:
>
>       o “geo-coordinates-mappingbased” that contains the coordinates
>         originating from the mappings wiki.
>         The “geo-coordinates” dataset continues to provide the
>         coordinates originating from the GeoExtractor
>
>       o “mappingbased-literals” that contains mapping-based facts with
>         literal values
>
>       o “mappingbased-objects” that contains mapping-based facts with
>         object values
>
>       o the “mappingbased-objects-disjoint-[domain|range]” are facts
>         that are filtered out from the “mappingbased-objects” datasets
>         as errors but are still provided
>
>   * We added a new extractor for citation data that provides two files:
>
>       o citation links: linking resources to citations
>
>       o citation data: trying to get additional data from citations.
>         This is a quite interesting dataset but we need help to clean
>         it up
>
>   * All datasets are available in .ttl and .tql serialization (the nt
>     and nq datasets were dropped for reasons of redundancy and server
>     capacity).
>
>
> Upcoming Changes
>
>   * Dataset normalization: We are going to normalize datasets based on
>     Wikidata URIs and no longer on the English language edition, as a
>     prerequisite to finally start the fusion process with Wikidata.
>
>   * RML Integration: Wouter Maroy has already provided the necessary
>     groundwork for switching the mappings wiki to an RML-based approach
>     <https://drive.google.com/file/d/0B7je1jgVmCgISXBPOHc3NDktblU/view?usp=sharing> on
>     GitHub. We are not there yet but this is at the top of our list of
>     changes.
>
>   * Starting with the next release we are adding datasets with NIF
>     annotations
>     <http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core/nif-core.html> of
>     the abstracts (as we already provided those for the 2015-04
>     release
>     <http://downloads.dbpedia.org/2015-04/ext/nlp/abstracts/>). We
>     will eventually extend the NIF annotation dataset to cover the
>     whole Wikipedia article of a resource.
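
The raw-infobox filtering described under (Breaking) Changes can be pictured roughly as follows: a `dbp:` triple is set aside when a `dbo:` triple already links the same subject and object. This is only a minimal sketch on plain tuples. The function name `split_raw_infobox` and the matching rule (same subject and same object) are assumptions made for illustration and are not taken from the extraction framework's code.

```python
DBO = "http://dbpedia.org/ontology/"   # mapped (ontology) properties
DBP = "http://dbpedia.org/property/"   # raw infobox properties

def split_raw_infobox(triples):
    """Split triples into (kept, moved).

    'moved' holds raw infobox (dbp:) triples whose subject/object pair is
    already covered by a mapped (dbo:) triple; everything else is 'kept'.
    Hypothetical helper, not the extraction framework's implementation.
    """
    mapped_pairs = {(s, o) for s, p, o in triples if p.startswith(DBO)}
    kept, moved = [], []
    for s, p, o in triples:
        if p.startswith(DBP) and (s, o) in mapped_pairs:
            moved.append((s, p, o))
        else:
            kept.append((s, p, o))
    return kept, moved

# Usage: the birthPlace example from the announcement, plus one unmapped fact.
example = [
    ("x", DBO + "birthPlace", "z"),
    ("x", DBP + "placeOfBirth", "z"),   # duplicate of the mapped fact
    ("x", DBP + "nickname", "n"),       # no mapped counterpart, so it stays
]
kept, moved = split_raw_infobox(example)
print(len(kept), len(moved))
```

In this picture, `moved` corresponds to what the announcement calls the “infobox-properties-mapped” datasets, and `kept` to what remains on the main endpoint.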
>
>
> New Datasets
>
>   * SDTypes: We extended the coverage of the automatically created type
>     statements (instance_types_sdtyped_dbo) to English, German and
>     Dutch (see above).
>
>   * Extensions: In the extension folder (2016-04/ext
>     <http://downloads.dbpedia.org/2016-04/ext/>) we provide two new
>     datasets; both are to be considered experimental:
>
>       o DBpedia World Facts: This dataset is authored by the DBpedia
>         association itself. It lists all countries, all currencies in
>         use and (most) languages spoken in the world as well as how
>         these concepts relate to each other (spoken in, primary
>         language etc.) and useful properties like ISO codes (ontology
>         diagram
>         <https://raw.githubusercontent.com/dbpedia/WorldFacts/master/DBpediaWorldFactsOntology.png>).
>         This dataset extends the very useful LEXVO
>         <http://www.lexvo.org> dataset with facts from DBpedia and the
>         CIA Factbook
>         <https://www.cia.gov/library/publications/the-world-factbook/>.
>         Please report any errors or suggestions regarding this
>         dataset to Markus <mailto:markus.freudenberg@gmail.com>.
>
>       o Lector Facts: This experimental dataset was provided by Matteo
>         Cannaviccio and demonstrates his approach
>         <http://dl.acm.org/citation.cfm?id=2932203> to generating facts
>         by using common sequences of words (i.e. phrases) that are
>         frequently used to describe instances of binary relations in a
>         text. We are looking into using this approach as a regular
>         extraction step. It would be helpful to get some feedback from
>         you.
>
>
> Credits
>
> Lots of thanks to
>
>   * Markus Freudenberg (University of Leipzig / DBpedia Association)
>     for taking over the whole release process and creating the
>     revamped download & statistics pages.
>
>   * Dimitris Kontokostas (University of Leipzig / DBpedia Association)
>     for conveying his considerable knowledge of the extraction and
>     release process.
>
>   * All editors that contributed to the DBpedia ontology mappings via
>     the Mappings Wiki.
>
>   * The whole DBpedia Internationalization Committee for pushing the
>     DBpedia internationalization forward.
>
>   * Heiko Paulheim (University of Mannheim) for providing the
>     necessary code for his algorithm to generate additional type
>     statements for formerly untyped resources and to identify and
>     remove wrong statements, which is now part of the DIEF.
>
>   * Václav Zeman, Thomas Klieger and the whole LHD team (University of
>     Prague) for their contribution of additional DBpedia types
>
>   * Marco Fossati (FBK) for contributing the DBTax types
>
>   * Alan Meehan (TCD) for performing a big external link cleanup
>
>   * Aldo Gangemi (LIPN University, France & ISTC-CNR, Italy) for
>     providing the links from DOLCE to the DBpedia ontology.
>
>   * Kingsley Idehen, Patrick van Kleef, and Mitko Iliev (all OpenLink
>     Software) for loading the new data set into the Virtuoso instance
>     that provides 5-Star Linked Open Data publication and SPARQL Query
>     Services.
>
>   * OpenLink Software (http://www.openlinksw.com/) collectively for
>     providing the SPARQL Query Services and Linked Open Data
>     publishing infrastructure for DBpedia in addition to their
>     continuous infrastructure support.
>
>   * Ruben Verborgh from Ghent University – iMinds for publishing the
>     dataset as Triple Pattern Fragments
>     <http://fragments.dbpedia.org/>, and iMinds for sponsoring
>     DBpedia’s Triple Pattern Fragments server.
>
>   * Ali Ismayilov (University of Bonn) for extending the DBpedia
>     Wikidata dataset.
>
>   * Vladimir Alexiev (Ontotext) for leading a successful mapping and
>     ontology clean up effort.
>
>   * All the GSoC students and mentors who directly or indirectly
>     influenced the DBpedia release
>
>   * Special thanks to members of the DBpedia Association
>     <http://dbpedia.org/dbpedia-association>, the AKSW
>     <http://aksw.org/About.html> and the department for Business
>     Information Systems
>     <http://bis.informatik.uni-leipzig.de/en/Welcome> of the University
>     of Leipzig.
>
>
> The work on the DBpedia 2016-04 release was financially supported by
> the European Commission through the project ALIGNED – quality-centric,
> software and data engineering (http://aligned-project.eu/).
>
> More information about DBpedia is found at http://dbpedia.org
> <http://dbpedia.org/> as well as in the new overview article about the
> project available at http://wiki.dbpedia.org/Publications
> <http://wiki.dbpedia.org/Publications>.
>
> Have fun with the new DBpedia 2016-04 release!
>
> Cheers,
>
> Markus Freudenberg, Dimitris Kontokostas, Sebastian Hellmann

-- 
Regards,

Kingsley Idehen
Founder & CEO
OpenLink Software (Home Page: http://www.openlinksw.com)

Weblogs (Blogs):
Legacy Blog: http://www.openlinksw.com/blog/~kidehen/
Blogspot Blog: http://kidehen.blogspot.com
Medium Blog: https://medium.com/@kidehen

Profile Pages:
Pinterest: https://www.pinterest.com/kidehen/
Quora: https://www.quora.com/profile/Kingsley-Uyi-Idehen
Twitter: https://twitter.com/kidehen
Google+: https://plus.google.com/+KingsleyIdehen/about
LinkedIn: http://www.linkedin.com/in/kidehen

Web Identities (WebID):
Personal: http://kingsley.idehen.net/dataspace/person/kidehen#this
        : http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this
Received on Saturday, 15 October 2016 15:44:15 UTC