- From: Georgi Kobilarov <georgi.kobilarov@gmx.de>
- Date: Mon, 6 Oct 2008 18:34:10 +0200
- To: <dbpedia-discussion@lists.sourceforge.net>, <public-lod@w3.org>, <semantic-web@w3c.org>
Hi all,

in the past, DBpedia data extracted from Wikipedia infoboxes always lacked some structure. We had no ontology of our own to describe and structure our data, and no class or property definitions of our own. That made it quite difficult to query for, say, "all people born in Berlin". There were many different RDF predicates for "born in", such as dbpedia:birthplace, dbpedia:placebirth, dbpedia:placeofbirth, etc. And there was no canonical class hierarchy with working inference to query for "all people".

I'm glad to announce that we've made an important first step towards solving these problems by creating a canonical ontology for DBpedia and mappings for Wikipedia infobox templates. We've created a flat class hierarchy, mapped Wikipedia templates to DBpedia classes, and rewritten the infobox extraction code to be configurable at a very granular level.

I wrote a blog post with details and some extraction result statistics [1]. A preview of the class hierarchy is here [2] (a fully browsable version will follow soon).

The new infobox dataset is available at [3], and the corresponding dataset with rdf:type statements at [4]. That data will soon be available in our DBpedia SPARQL endpoint. I'll post some demo queries and make the ontology available as RDFS as well. Until then, have a look at the new data and let us know your thoughts.

Many thanks to Anja Jentzsch for her great help in building the ontology.

Any comments are highly appreciated.

Cheers,
Georgi

[1] http://blog.georgikobilarov.com/2008/10/dbpedia-rethinking-wikipedia-infobox-extraction/
[2] http://www4.wiwiss.fu-berlin.de/dbpedia/georgi/dataset/stats.htm
[3] http://www4.wiwiss.fu-berlin.de/dbpedia/dev/infobox/infobox.zip
[4] http://www4.wiwiss.fu-berlin.de/dbpedia/dev/infobox/types.zip

--
Georgi Kobilarov
Freie Universität Berlin
www.georgikobilarov.com
Received on Monday, 6 October 2008 16:34:10 UTC