- From: Lee Feigenbaum <lee@thefigtrees.net>
- Date: Mon, 12 Apr 2010 09:19:43 -0400
- To: Luis Criado Fernández <lcriadof@yahoo.es>
- CC: Chris Bizer <chris@bizer.de>, dbpedia-discussion@lists.sourceforge.net, dbpedia-announcements@lists.sourceforge.net, public-lod@w3.org, SW-forum <semantic-web@w3.org>, semanticweb@yahoogroups.com
Hello Luis,

You can accomplish this with SPARQL. Please see
http://www.cambridgesemantics.com/2008/09/sparql-by-example/#%2825%29
for an example.

You can also download language-specific subsets of the data from the
DBpedia downloads page at: http://wiki.dbpedia.org/Downloads35

Hope this helps,
Lee

On 4/12/2010 8:42 AM, Luis Criado Fernández wrote:
> A great job, congratulations!
>
> I did not know about the existence of DBpedia. I am very interested in
> studying it.
>
> If you allow me a question, I would like to know whether there is any
> way to distinguish the language of the value of the property
> "dbpedia-owl:abstract".
>
> Please excuse my English.
>
> Cheers,
> Luis Criado
> http://lcriadof.blogspot.com/
>
> ----- Original message -----
> From: Chris Bizer <chris@bizer.de>
> To: dbpedia-discussion@lists.sourceforge.net; dbpedia-announcements@lists.sourceforge.net
> CC: public-lod@w3.org; SW-forum <semantic-web@w3.org>; semanticweb@yahoogroups.com
> Sent: Mon, 12 April 2010 11:06
> Subject: ANN: DBpedia 3.5 released
>
> Hi all,
>
> We are happy to announce the release of DBpedia 3.5.
>
> The new release is based on Wikipedia dumps dating from March 2010.
> Compared to the 3.4 release, we were able to increase the quality of the
> DBpedia knowledge base by employing a new data extraction framework which
> applies various data cleansing heuristics as well as by extending the
> infobox-to-ontology mappings that guide the data extraction process.
>
> The new DBpedia knowledge base describes more than 3.4 million things, out
> of which 1.47 million are classified in a consistent ontology, including
> 312,000 persons, 413,000 places, 94,000 music albums, 49,000 films, 15,000
> video games, 140,000 organizations, 146,000 species and 4,600 diseases.
> The DBpedia data set features labels and abstracts for these 3.2 million
> things in up to 92 different languages; 1,460,000 links to images and
> 5,543,000 links to external web pages; 4,887,000 external links into other
> RDF datasets, 565,000 Wikipedia categories, and 75,000 YAGO categories. The
> DBpedia knowledge base altogether consists of over 1 billion pieces of
> information (RDF triples) out of which 257 million were extracted from the
> English edition of Wikipedia and 766 million were extracted from other
> language editions.
>
> The new release provides the following improvements and changes compared to
> the DBpedia 3.4 release:
>
> 1. The DBpedia extraction framework has been completely rewritten in Scala.
> The new framework dramatically reduces the extraction time of a single
> Wikipedia article from over 200 to about 13 milliseconds. All features of
> the previous PHP framework have been ported. In addition, the new framework
> can extract data from Wikipedia tables based on table-to-ontology mappings
> and is able to extract multiple infoboxes out of a single Wikipedia article.
> The data from each infobox is represented as a separate RDF resource. All
> resources that are extracted from a single page can be connected using
> custom RDF properties which are also defined in the mappings. A lot of work
> also went into the value parsers and the DBpedia 3.5 dataset should
> therefore be much cleaner than its predecessors. In addition, units of
> measurement are normalized to their respective SI unit, which makes querying
> DBpedia easier.
>
> 2. The mapping language that is used to map Wikipedia infoboxes to the
> DBpedia Ontology has been redesigned. The documentation of the new mapping
> language is found at
> http://dbpedia.svn.sourceforge.net/viewvc/dbpedia/trunk/extraction/core/doc/mapping%20language/
>
> 3. In order to enable the DBpedia user community to extend and refine the
> infobox-to-ontology mappings, the mappings can be edited on the newly
> created wiki hosted on http://mappings.dbpedia.org. At the moment, 303
> template mappings are defined, which cover (including redirects) 1055
> templates. On the wiki, the DBpedia Ontology can be edited by the community
> as well. At the moment, the ontology consists of 259 classes and about 1,200
> properties.
>
> 4. The ontology properties extracted from infoboxes are now split into two
> data sets:
>    1. The Ontology Infobox Properties dataset contains the properties as
>       they are defined in the ontology (e.g. length). The range of a
>       property is either an xsd schema type or a dimension of measurement,
>       in which case the value is normalized to the respective SI unit.
>    2. The Ontology Infobox Properties (Specific) dataset contains properties
>       which have been specialized for a specific class using a specific
>       unit. For example, the property height is specialized on the class
>       Person using the unit centimeters instead of meters.
> For further details please refer to http://wiki.dbpedia.org/Datasets#h18-11.
>
> 5. The framework now resolves template redirects, making it possible to
> cover all redirects to an infobox on Wikipedia with a single mapping.
>
> 6. Three new extractors have been implemented:
>    1. PageIdExtractor, which extracts the Wikipedia page ID of each page.
>    2. RevisionExtractor, which extracts the latest revision of a page.
>    3. PNDExtractor, which extracts PND (Personennamendatei) identifiers.
>
> 7. The data set now provides labels, abstracts, page links and infobox data
> in 92 different languages, which have been extracted from recent Wikipedia
> dumps as of March 2010.
>
> 8. In addition to the N-Triples datasets, N-Quads datasets are provided
> which include a provenance URI for each statement.
> The provenance URI denotes the origin of the extracted triple in Wikipedia
> (for details see: http://wiki.dbpedia.org/Datasets#h18-18).
>
> You can download the new DBpedia dataset from
> http://wiki.dbpedia.org/Downloads35. As usual, the data set is also
> available as Linked Data and via the DBpedia SPARQL endpoint.
>
> Lots of thanks to:
>
> * Robert Isele, Anja Jentzsch, Christopher Sahnwaldt, and Paul Kreis (all
>   Freie Universität Berlin) for reimplementing the DBpedia extraction
>   framework in Scala, for extending the infobox-to-ontology mappings and
>   for extracting the new DBpedia 3.5 knowledge base.
>
> * Jens Lehmann and Sören Auer (both Universität Leipzig) for providing the
>   knowledge base via the DBpedia download server at Universität Leipzig.
>
> * Kingsley Idehen and Mitko Iliev (both OpenLink Software) for loading the
>   knowledge base into the Virtuoso instance that serves the Linked Data
>   view and SPARQL endpoint.
>
> The whole DBpedia team is very thankful to three companies which enabled us
> to do all this by supporting and sponsoring the DBpedia project:
>
> * Neofonie GmbH (http://www.neofonie.de/index.jsp), a Berlin-based company
>   offering leading technologies in the area of Web search, social media and
>   mobile applications.
>
> * Vulcan Inc. as part of its Project Halo (www.projecthalo.com). Vulcan
>   Inc. creates and advances a variety of world-class endeavors and high
>   impact initiatives that change and improve the way we live, learn, do
>   business (http://www.vulcan.com/).
>
> * OpenLink Software (http://www.openlinksw.com/). OpenLink Software
>   develops the Virtuoso Universal Server, an innovative enterprise grade
>   server that cost-effectively delivers an unrivaled platform for Data
>   Access, Integration and Management.
>
> More information about DBpedia is found at http://dbpedia.org/About
>
> Have fun with the new DBpedia knowledge base!
>
> Cheers,
>
> Chris Bizer
>
> --
> Prof. Dr. Christian Bizer
> Web-based Systems Group
> Freie Universität Berlin
> +49 30 838 55509
> http://www.bizer.de
> chris@bizer.de
Received on Monday, 12 April 2010 13:20:22 UTC