Re: Wikipedia and Geonames. was: AW: ANN: RDF Book Mashup - Integrating Web 2.0 data sources like Amazon and Google into the Semantic Web

From: Marc <marc@geonames.org>
Date: Tue, 05 Dec 2006 23:11:01 +0100
Message-ID: <4575EE75.2060904@geonames.org>
To: Richard Cyganiak <richard@cyganiak.de>
CC: Chris Bizer <chris@bizer.de>, Semantic Web <semantic-web@w3.org>

Richard
>> Around 100,000 geonames place names now have wikipedia links.
> Very cool. I wonder how you link the articles? Can't be simple word 
> matching, no? 
Simple word matching would lead to an incredible mess. For example, there 
are 53 places named London and 58 places named Paris in the geonames 
database. Place name disambiguation is a rather hard problem, so to match 
geonames places with Wikipedia articles we use semantic information from 
the Wikipedia dump together with the article title. The semantic 
information is primarily latitude and longitude, but also country, 
administrative division, feature type, population and categories [1]. We 
only consider articles from which we are able to parse semantic 
information [2]. Unfortunately there is a proliferation of templates, and 
a lot of Wikipedia users have fun inventing new ones instead of reusing 
existing ones.
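
To illustrate the idea of combining the article title with coordinates, 
here is a minimal sketch in Python. It is not the actual geonames matching 
code. The Place and Article classes, the match_article function, the 25 km 
threshold and the rule of comparing only the part of the title before the 
first comma are all assumptions made for this example, and it uses only 
name, coordinates and country out of the signals listed above.

```python
import math
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Place:
    """Hypothetical stand-in for a gazetteer record."""
    name: str
    lat: float
    lon: float
    country: str


@dataclass
class Article:
    """Hypothetical stand-in for an article with parsed coordinates."""
    title: str
    lat: float
    lon: float
    country: Optional[str] = None


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two points."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


def match_article(article: Article, places: List[Place],
                  max_km: float = 25.0) -> Optional[Place]:
    """Return the nearest same-named place within max_km, or None.

    The threshold and the title normalisation are assumptions of this
    sketch, not values taken from the geonames matcher.
    """
    # "London, Ontario" -> "london": keep only the part before the comma.
    wanted = article.title.split(",")[0].strip().lower()
    best: Optional[Place] = None
    best_km = max_km
    for place in places:
        if place.name.lower() != wanted:
            continue  # the name must match first
        if article.country and place.country != article.country:
            continue  # skip candidates in a different country, if known
        dist = haversine_km(article.lat, article.lon, place.lat, place.lon)
        if dist <= best_km:
            best, best_km = place, dist
    return best


places = [
    Place("London", 51.5074, -0.1278, "GB"),
    Place("London", 42.9849, -81.2453, "CA"),
]
hit = match_article(Article("London, Ontario", 42.98, -81.25), places)
print(hit.country if hit else None)
```

With only the name, both candidate places would match. The coordinates 
parsed from the article are what separate the two, and an article whose 
coordinates are far from every candidate is left unmatched.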

Cheers

Marc

[1] http://www.geonames.org/maps/wikipedia.html
[2] http://www.geonames.org/wikipedia.html
Received on Tuesday, 5 December 2006 22:11:16 UTC
