
Re: ANN: DBpedia 3.6 released

From: Chris Bizer <chris@bizer.de>
Date: Wed, 19 Jan 2011 16:58:20 +0100
To: "'Antoine Zimmermann'" <antoine.zimmermann@insa-lyon.fr>
Cc: <dbpedia-announcements@lists.sourceforge.net>, <dbpedia-discussion@lists.sourceforge.net>, "'Semantic Web'" <semantic-web@w3.org>, "'public-lod'" <public-lod@w3.org>
Message-ID: <00fc01cbb7f1$b1cffb70$156ff250$@bizer.de>
Hi Antoine,

> I was wondering: do you keep older versions of the DBpedia datasets?

We do, and they are all reachable from the DBpedia download page at

http://wiki.dbpedia.org/Downloads36

Just click on the "Older Versions" links.

Cheers,

Chris


> -----Original Message-----
> From: public-lod-request@w3.org [mailto:public-lod-request@w3.org] On
> Behalf Of Antoine Zimmermann
> Sent: Wednesday, 19 January 2011 16:49
> To: Chris Bizer
> Cc: dbpedia-announcements@lists.sourceforge.net;
> dbpedia-discussion@lists.sourceforge.net; 'Semantic Web'; 'public-lod'
> Subject: Re: ANN: DBpedia 3.6 released
> 
> Dear Chris and the DBpedia crew,
> 
> 
> As always, a new version of DBpedia is very good news for the Semantic
> Web and Linked Data.
> 
> I was wondering: do you keep older versions of the DBpedia datasets?
> If yes, would you allow people to download older versions for research
> purposes?
> 
> This would be very useful in order to study the dynamics of RDF data, or
> the dynamics of DBpedia itself. There are already papers on the dynamics
> of Wikipedia, but I am not aware of corresponding work for DBpedia.
> 
> 
> Regards,
> AZ.
> 
> On 17/01/2011 14:10, Chris Bizer wrote:
>  > Hi all,
>  >
>  > we are happy to announce the release of DBpedia 3.6. The new release is
>  > based on Wikipedia dumps dating from October/November 2010.
>  >
>  > The new DBpedia dataset describes more than 3.5 million things, of which
>  > 1.67 million are classified in a consistent ontology, including 364,000
>  > persons, 462,000 places, 99,000 music albums, 54,000 films, 16,500 video
>  > games, 148,000 organizations, 148,000 species and 5,200 diseases.
>  >
>  > The DBpedia dataset features labels and abstracts for 3.5 million things in
>  > up to 97 different languages; 1,850,000 links to images and 5,900,000 links
>  > to external web pages; 6,500,000 external links into other RDF datasets, and
>  > 632,000 Wikipedia categories.
>  >
>  > The dataset consists of 672 million pieces of information (RDF triples) out
>  > of which 286 million were extracted from the English edition of Wikipedia
>  > and 386 million were extracted from other language editions and links to
>  > external datasets.
>  >
>  > Along with the release of the new datasets, we are happy to announce the
>  > initial release of the DBpedia MappingTool
>  > (http://mappings.dbpedia.org/index.php/MappingTool): a graphical user
>  > interface to support the community in creating and editing mappings as well
>  > as the ontology.
>  >
>  > The new release provides the following improvements and changes compared to
>  > the DBpedia 3.5.1 release:
>  >
>  > 1. Improved DBpedia Ontology as well as improved Infobox mappings using
>  > http://mappings.dbpedia.org/.
>  >
>  > Furthermore, there are now also mappings in languages other than English.
>  > These improvements are largely due to collective work by the community.
>  > There are 13.8 million RDF statements based on mappings (11.1 million in
>  > version 3.5.1). All this data is in the /ontology/ namespace. Note that this
>  > data is of much higher quality than the Raw Infobox data in the /property/
>  > namespace.
>  >
>  > Statistics of the mappings wiki on the date of release 3.6:
>  >
>  > + Mappings:
>  >       + English: 315 Infobox mappings (covers 1124 templates including
>  > redirects)
>  >       + Greek: 137 Infobox mappings (covers 192 templates including
>  > redirects)
>  >       + Hungarian: 111 Infobox mappings (covers 151 templates including
>  > redirects)
>  >       + Croatian: 36 Infobox mappings (covers 67 templates including
>  > redirects)
>  >       + German: 9 Infobox mappings
>  >       + Slovenian: 4 Infobox mappings
>  > + Ontology:
>  >       +  272 classes
>  > +  Properties:
>  >       + 629 object properties
>  >       + 706 datatype properties (they are all in the /datatype/ namespace)
>  >
>  > 2.  Some commonly used property names changed.
>  >
>  > + Please see http://dbpedia.org/ChangeLog and
>  > http://dbpedia.org/Datasets/Properties to find out which relations changed,
>  > and update your applications accordingly!
>  >
>  > 3. New Datatypes for increased quality in mapping-based properties
>  >
>  > + xsd:positiveInteger, xsd:nonNegativeInteger, xsd:nonPositiveInteger,
>  > xsd:negativeInteger
>  >
>  > 4. Improved parsing coverage.
>  >
>  > + Parsing of lists of elements in Infobox property values that improves the
>  > completeness of extracted facts.
>  > + Method to deal with missing repeated links in Infoboxes that do appear
>  > somewhere else on the page.
>  > + Flag templates are parsed.
>  > + Various improvements on internationalization.
>  >
>  > 5. Improved recognition of
>  >
>  > + Wikipedia namespace identifiers.
>  > + Wikipedia language codes.
>  > + Category hierarchies.
>  >
>  > 6. Disambiguation links for acronyms (all upper-case title) are now
>  > extracted (for example, Kilobyte and Knowledge_base for "KB"):
>  >
>  > + Wikilinks consisting of multiple words: If the starting letters of the
>  > words appear in correct order (with possible gaps) and cover all acronym
>  > letters.
>  > + Wikilinks consisting of a single word: If the case-insensitive longest
>  > common subsequence with the acronym is equal to the acronym.
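
The two acronym rules above can be sketched in Python as follows. This is only an illustration of the rules as worded in the announcement, not the extraction framework's actual code. The function names are hypothetical, and two points are assumptions: that underscores separate words in a title, and that the multi-word rule is also case-insensitive. The single-word rule relies on the fact that the longest common subsequence of a word and an acronym equals the acronym exactly when the acronym is a subsequence of the word.

```python
def is_subsequence(needle, haystack):
    """True if the items of needle occur in haystack in order (gaps allowed)."""
    remaining = iter(haystack)
    # "item in remaining" consumes the iterator up to the first match,
    # so later items must be found after earlier ones.
    return all(item in remaining for item in needle)


def matches_acronym(title, acronym):
    """Hypothetical check of a wikilink title against an acronym."""
    acro = acronym.lower()
    # Assumption: words in a title are separated by underscores or spaces.
    words = title.replace("_", " ").split()
    if len(words) > 1:
        # Multi-word rule: the initials, in order and with possible gaps,
        # must cover every letter of the acronym.
        initials = [word[0].lower() for word in words]
        return is_subsequence(acro, initials)
    # Single-word rule: LCS(word, acronym) == acronym, which is equivalent
    # to the acronym being a subsequence of the word.
    return is_subsequence(acro, title.lower())
```

With the announcement's own example, both matches_acronym("Kilobyte", "KB") and matches_acronym("Knowledge_base", "KB") hold.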
>  >
>  > 7. Encoding (bugfixes):
>  >
>  > + The new datasets support the complete range of Unicode code points (up to
>  > 0x10ffff). 16-bit code points start with '\u', code points larger than
>  > 16 bits start with '\U'.
>  > + Commas and ampersands do not get encoded anymore in URIs. Please see
>  > http://dbpedia.org/URIencoding for an explanation regarding the DBpedia URI
>  > encoding scheme.
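
The '\u' and '\U' convention described above can be illustrated with a short Python sketch. It assumes the N-Triples style of escaping, where '\u' is followed by four hexadecimal digits and '\U' by eight. The function name is hypothetical, and the sketch is deliberately simplified: it leaves printable ASCII untouched and does not handle the separate escapes for backslashes, quotes or line breaks that a real serializer needs.

```python
def escape_code_points(text):
    """Escape non-ASCII characters using \\uXXXX or \\UXXXXXXXX (simplified)."""
    out = []
    for ch in text:
        cp = ord(ch)
        if 0x20 <= cp <= 0x7E:
            # Printable ASCII is kept as-is in this sketch.
            out.append(ch)
        elif cp <= 0xFFFF:
            # Fits in 16 bits: lower-case 'u' plus four hex digits.
            out.append("\\u%04X" % cp)
        else:
            # Above 16 bits (up to 0x10FFFF): upper-case 'U' plus eight hex digits.
            out.append("\\U%08X" % cp)
    return "".join(out)
```

For instance, the character U+00E9 comes out in the '\u' form, while a character such as U+1D11E, which lies outside the 16-bit range, comes out in the '\U' form.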
>  >
>  > 8. Extended Datasets:
>  >
>  > + Thanks to Johannes Hoffart (Max-Planck-Institut für Informatik) for
>  > contributing links to YAGO2.
>  > + Freebase links have been updated. They now refer to mids
>  > (http://wiki.freebase.com/wiki/Machine_ID) because guids have been
>  > deprecated.
>  >
>  > You can download the new DBpedia dataset from http://dbpedia.org/Downloads36
>  >
>  > As usual, the dataset is also available as Linked Data and via the DBpedia
>  > SPARQL endpoint at http://dbpedia.org/sparql
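
As a rough illustration of using the SPARQL endpoint mentioned above, the following Python sketch builds a GET request URL with the standard SPARQL protocol "query" parameter. It makes no network call. The example query and the class URI http://dbpedia.org/ontology/Person are assumptions chosen for illustration (the announcement only says that mapping-based data lives in the /ontology/ namespace), and the function name is hypothetical.

```python
from urllib.parse import urlencode

# Endpoint named in the announcement.
ENDPOINT = "http://dbpedia.org/sparql"

# Hypothetical example query; the class URI is an assumption.
QUERY = (
    "SELECT ?s WHERE { "
    "?s a <http://dbpedia.org/ontology/Person> "
    "} LIMIT 10"
)


def build_query_url(endpoint, query):
    """Return a GET URL carrying the query in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query})


url = build_query_url(ENDPOINT, QUERY)
```

The resulting URL can be passed to any HTTP client. The result format that comes back depends on the server and on the Accept header sent, which this sketch leaves out.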
>  >
>  > Lots of thanks to:
>  >
>  > + All editors that contributed to the DBpedia ontology mappings via the
>  > Mappings Wiki.
>  > + Max Jakob (Freie Universität Berlin, Germany) for improving the DBpedia
>  > extraction framework and for extracting the new datasets.
>  > + Robert Isele and Anja Jentzsch (both Freie Universität Berlin, Germany)
>  > for helping Max with their expertise on the extraction framework.
>  > + Paul Kreis (Freie Universität Berlin, Germany) for analyzing the DBpedia
>  > data of the previous release and suggesting ways to increase quality and
>  > quantity. Some results of his work were implemented in this release.
>  > + Dimitris Kontokostas (Aristotle University of Thessaloniki, Greece), Jimmy
>  > O'Regan (Eolaistriu Technologies, Ireland), José Paulo Leal (University of
>  > Porto, Portugal) for providing patches to improve the extraction framework.
>  > + Jens Lehmann and Sören Auer (both Universität Leipzig, Germany) for
>  > providing the new dataset via the DBpedia download server at Universität
>  > Leipzig.
>  > + Kingsley Idehen and Mitko Iliev (both OpenLink Software) for loading the
>  > dataset into the Virtuoso instance that serves the Linked Data view and
>  > SPARQL endpoint. OpenLink Software (http://www.openlinksw.com/) altogether
>  > for providing the server infrastructure for DBpedia.
>  >
>  > The work on the new release was financially supported by:
>  >
>  > + Neofonie GmbH, a Berlin-based company offering leading technologies in the
>  > area of Web search, social media and mobile applications
>  > (http://www.neofonie.de/).
>  > + The European Commission through the project LOD2 - Creating Knowledge out
>  > of Linked Data (http://lod2.eu/).
>  > + Vulcan Inc. as part of its Project Halo (http://www.projecthalo.com/).
>  > Vulcan Inc. creates and advances a variety of world-class endeavors and high
>  > impact initiatives that change and improve the way we live, learn, do
>  > business (http://www.vulcan.com/).
>  >
>  > More information about DBpedia is found at http://dbpedia.org/About
>  >
>  > Have fun with the new dataset!
>  >
>  > The whole DBpedia team also congratulates Wikipedia on its 10th birthday,
>  > which was this weekend!
>  >
>  > Cheers,
>  >
>  > Chris Bizer
>  >
>  >
>  > --
>  > Prof. Dr. Christian Bizer
>  > Web-based Systems Group
>  > Freie Universität Berlin
>  > +49 30 838 55509
>  > http://www.bizer.de
>  > chris@bizer.de
>  >
>  >
>  >
> 
> 
> --
> Antoine Zimmermann
> Researcher at:
> Laboratoire d'InfoRmatique en Image et Systèmes d'information
> Database Group
> 7 Avenue Jean Capelle
> 69621 Villeurbanne Cedex
> France
> Lecturer at:
> Institut National des Sciences Appliquées de Lyon
> 20 Avenue Albert Einstein
> 69621 Villeurbanne Cedex
> France
> antoine.zimmermann@insa-lyon.fr
> http://zimmer.aprilfoolsreview.com/
Received on Wednesday, 19 January 2011 15:57:48 UTC
