Re: [Dbpedia-discussion] ANN: DBpedia 3.9 released, including wider infobox coverage, additional type statements, and new YAGO and Wikidata links

Hi Christian,

thank you for the notice! :)

Regarding your work on inferring missing types, I notice that your datasets
also include resources which are pure redirects.
While I understand this follows directly from how the Wikipedia pages are
linked together, I am wondering how not resolving redirects impacts the
probability calculations and the overall precision/recall of the algorithm.
Do you have any thoughts to share?
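
To make the question concrete, here is a minimal sketch in Python. It is not
taken from your implementation: it only assumes that the evidence for a
resource is the set of properties pointing at it, and every name and triple
in it (resolve, ingoing_property_counts, the dbr:/dbo: examples) is
hypothetical. It shows how the evidence for one entity is split across two
URIs when redirects are left unresolved.

```python
from collections import Counter, defaultdict


def resolve(resource, redirects):
    """Follow a redirect chain to its final target, stopping on cycles."""
    seen = set()
    while resource in redirects and resource not in seen:
        seen.add(resource)
        resource = redirects[resource]
    return resource


def ingoing_property_counts(triples, redirects=None):
    """Count, per object resource, how often each property points at it.

    If a redirect map is given, every object is resolved first, so links to
    a redirect page are credited to the page it redirects to.
    """
    counts = defaultdict(Counter)
    for _subject, prop, obj in triples:
        if redirects is not None:
            obj = resolve(obj, redirects)
        counts[obj][prop] += 1
    return counts


# Toy data, invented for illustration only.
redirects = {"dbr:NYC": "dbr:New_York_City"}
triples = [
    ("dbr:A", "dbo:birthPlace", "dbr:NYC"),
    ("dbr:B", "dbo:birthPlace", "dbr:New_York_City"),
    ("dbr:C", "dbo:location", "dbr:NYC"),
]

raw = ingoing_property_counts(triples)               # redirects left as-is
merged = ingoing_property_counts(triples, redirects)  # redirects resolved
```

In the unresolved case the redirect gets its own, smaller evidence set and
may receive its own type statement, while the target loses two of its three
ingoing links. Whether this changes the inferred types in practice is
exactly what I am curious about.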

Cheers
Andrea


2013/9/23 Christian Bizer <chris@bizer.de>

> Hi all,
>
> we are happy to announce the release of DBpedia 3.9.
>
> The most important improvements of the new release compared to DBpedia 3.8
> are:
>
> 1. the new release is based on updated Wikipedia dumps dating from March /
> April 2013 (the 3.8 release was based on dumps from June 2012), leading to
> an overall increase in the number of concepts in the English edition from
> 3.7 to 4.0 million things.
>
> 2. the DBpedia ontology is enlarged and the number of infobox to ontology
> mappings has risen, leading to richer and cleaner concept descriptions.
>
> 3. we extended the DBpedia type system to also cover Wikipedia articles
> that do not contain an infobox.
>
> 4. we provide links pointing from DBpedia concepts to Wikidata concepts and
> updated the links pointing at YAGO concepts and classes, making it easier
> to integrate knowledge from these sources.
>
> The English version of the DBpedia knowledge base currently describes 4.0
> million things, out of which 3.22 million are classified in a consistent
> Ontology, including 832,000 persons, 639,000 places (including 427,000
> populated places), 372,000 creative works (including 116,000 music albums,
> 78,000 films and 18,500 video games), 209,000 organizations (including
> 49,000 companies and 45,000 educational institutions), 226,000 species and
> 5,600 diseases.
>
> We provide localized versions of DBpedia in 119 languages. All these
> versions together describe 24.9 million things, out of which 16.8 million
> overlap (are interlinked) with the concepts from the English DBpedia. The
> full DBpedia data set features labels and abstracts for 12.6 million unique
> things in 119 different languages; 24.6 million links to images and 27.6
> million links to external web pages; 45.0 million external links into other
> RDF datasets, 67.0 million links to Wikipedia categories, and 41.2 million
> links to YAGO categories.
>
> Altogether the DBpedia 3.9 release consists of 2.46 billion pieces of
> information (RDF triples) out of which 470 million were extracted from the
> English edition of Wikipedia, 1.98 billion were extracted from other
> language editions, and about 45 million are links to external data sets.
>
> Detailed statistics about the DBpedia data sets in 24 popular languages are
> provided at http://wiki.dbpedia.org/Datasets39/DatasetStatistics
>
> The main changes between DBpedia 3.8 and 3.9 are described below. For
> additional, more detailed information please refer to the Change Log
> (http://wiki.dbpedia.org/Changelog)
>
>
> 1. Enlarged Ontology
>
> The DBpedia community added new classes and properties to the DBpedia
> ontology via the mappings wiki. The DBpedia 3.9 ontology encompasses
>
> 529 classes (DBpedia 3.8: 359)
> 927 object properties (DBpedia 3.8: 800)
> 1290 datatype properties (DBpedia 3.8: 859)
> 116 specialized datatype properties (DBpedia 3.8: 116)
> 46 owl:equivalentClass and 31 owl:equivalentProperty mappings to
> http://schema.org
>
>
> 2. Additional Infobox to Ontology Mappings
>
> The editors of the mappings wiki also defined many new mappings from
> Wikipedia templates to DBpedia classes. For the DBpedia 3.9 extraction, we
> used 3177 mappings (DBpedia 3.8: 2347 mappings), which are distributed as
> follows over the languages covered in the release:
>
> English: 431 mappings
> Polish: 382 mappings
> Dutch: 335 mappings
> German: 219 mappings
> Greek: 215 mappings
> Portuguese: 211 mappings
> Slovenian: 170 mappings
> French: 165 mappings
> Korean: 148 mappings
> Spanish: 137 mappings
> Hungarian: 111 mappings
> Turkish: 91 mappings
> Japanese: 72 mappings
> Czech: 66 mappings
> Italian: 62 mappings
> Bulgarian: 61 mappings
> Indonesian: 59 mappings
> Catalan: 52 mappings
> Arabic: 51 mappings
> Russian: 48 mappings
> Croatian: 36 mappings
> Basque: 32 mappings
> Irish: 17 mappings
> Bengali: 6 mappings
>
>
> 3. Extended Type System to cover Articles without Infobox
>
> Until the DBpedia 3.8 release, a concept was only assigned a type (like
> person or place) if the corresponding Wikipedia article contained an infobox
> indicating this type. The new 3.9 release now also contains type statements
> for articles without infobox that were inferred based on the link structure
> within the DBpedia knowledge base using the algorithm described in
> Paulheim/Bizer 2013 [1]. Applying the algorithm allowed us to provide type
> information for 440,000 concepts that were formerly not typed. A similar
> algorithm was also used to identify and remove potentially wrong links from
> the knowledge base.
>
>
> 4. New and updated RDF Links into External Data Sources
>
> We added RDF links to Wikidata and updated the following RDF link sets
> pointing at other Linked Data sources: YAGO, Freebase, Geonames, GADM and
> EUNIS. For an overview of all data sets that are interlinked from DBpedia
> please refer to http://wiki.dbpedia.org/Interlinking
>
>
> 5. New Find Related Concepts Service
>
> We offer a new service for finding resources that are related to a given
> DBpedia seed resource. More information about the service is found at
> http://wiki.dbpedia.org/FindRelated
>
>
>
> Accessing the DBpedia 3.9 Release:
>
> You can download the new DBpedia datasets from
> http://wiki.dbpedia.org/Downloads39
>
> As usual, the dataset is also available as Linked Data and via the DBpedia
> SPARQL endpoint at http://dbpedia.org/sparql
>
>
> Lots of thanks to:
>
> * Jona Christopher Sahnwaldt (Freelancer funded by the University of
> Mannheim, Germany) for improving the DBpedia extraction framework, for
> extracting the DBpedia 3.9 data sets for all 119 languages, and for
> generating the updated RDF links to external data sets.
> * All editors that contributed to the DBpedia ontology mappings via the
> Mappings Wiki.
> * Heiko Paulheim (University of Mannheim, Germany) for inventing and
> implementing the algorithm to generate additional type statements for
> formerly untyped resources.
> * The whole Internationalization Committee for pushing the DBpedia
> internationalization forward.
> * Dimitris Kontokostas (University of Leipzig) for improving the DBpedia
> extraction framework and loading the new release onto the DBpedia download
> server in Leipzig.
> * Volha Bryl (University of Mannheim, Germany) for generating the
> statistics about the new release.
> * Petar Ristoski (University of Mannheim, Germany) for generating the
> updated links pointing at the GADM database of Global Administrative Areas.
> * Kingsley Idehen, Patrick van Kleef, and Mitko Iliev (all OpenLink
> Software) for loading the new data set into the Virtuoso instance that
> serves the Linked Data view and SPARQL endpoint.
> * OpenLink Software (http://www.openlinksw.com/) altogether for providing
> the server infrastructure for DBpedia.
> * Julien Cojan, Andrea Di Menna, Ahmed Ktob, Julien Plu, Jim Regan and
> others who contributed improvements to the DBpedia extraction framework via
> the source code repository on GitHub.
>
> The work on the DBpedia 3.9 release was financially supported by the
> European Commission through the project LOD2 - Creating Knowledge out of
> Linked Data (http://lod2.eu/).
>
>
> More information about DBpedia is found at http://dbpedia.org/About as
> well as in the new overview article [2] about the project.
>
> Have fun with the new DBpedia release!
>
> Cheers,
>
> Christian Bizer and Christopher Sahnwaldt
>
>
>
> [1] http://www.heikopaulheim.com/docs/iswc2013.pdf
> [2] http://svn.aksw.org/papers/2013/SWJ_DBpedia/public.pdf

Received on Monday, 23 September 2013 12:32:22 UTC