Re: Wikipedia and Geonames. was: AW: ANN: RDF Book Mashup - Integrating Web 2.0 data sources like Amazon and Google into the Semantic Web

Richard
>> Around 100,000 geonames place names now have wikipedia links.
> Very cool. I wonder how you link the articles? Can't be simple word 
> matching, no? 
Simple word matching would lead to an incredible mess. The GeoNames 
database contains, for example, 53 places named London and 58 places 
named Paris. Place name disambiguation is a rather hard problem, so to 
match GeoNames places with Wikipedia articles we use the semantic 
information in the Wikipedia dump together with the article title. That 
semantic information is primarily latitude and longitude, but also 
country, administrative division, feature type, population and 
categories [1]. We only consider articles from which we are able to 
parse semantic information [2]. Unfortunately there is a proliferation 
of templates, and a lot of Wikipedia users have fun inventing new ones 
instead of reusing existing ones.
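
To illustrate why the coordinates matter, here is a minimal sketch in 
Python. It is not the actual matching code: the names Place, 
haversine_km, match_article and SAMPLE_PLACES, the 25 km cutoff and the 
approximate sample coordinates are all assumptions made up for this 
example, and it only looks at title, country and distance, not at 
administrative division, feature type, population or categories.

```python
import math
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Place:
    """Hypothetical, simplified gazetteer record (not the real schema)."""
    name: str
    lat: float
    lon: float
    country: str  # two-letter country code


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two lat/lon points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def match_article(title: str, lat: float, lon: float,
                  country: Optional[str], places: Sequence[Place],
                  max_km: float = 25.0) -> Optional[Place]:
    """Return the closest same-named place within max_km, or None.

    The 25 km default is an arbitrary assumption for this sketch.
    """
    best: Optional[Place] = None
    best_dist = max_km
    for place in places:
        # The title alone is ambiguous, so it only narrows the candidates.
        if place.name.casefold() != title.casefold():
            continue
        # If the article states a country, it must agree with the place.
        if country is not None and place.country != country:
            continue
        dist = haversine_km(lat, lon, place.lat, place.lon)
        if dist <= best_dist:
            best, best_dist = place, dist
    return best


# Approximate coordinates, for illustration only.
SAMPLE_PLACES = [
    Place("London", 51.5074, -0.1278, "GB"),
    Place("London", 42.9849, -81.2453, "CA"),
    Place("London", 37.1290, -84.0833, "US"),
    Place("Paris", 48.8566, 2.3522, "FR"),
    Place("Paris", 33.6609, -95.5555, "US"),
]

# An article titled "London" with coordinates near Ontario resolves to
# the Canadian entry even though three candidates share the name.
match = match_article("London", 42.98, -81.25, None, SAMPLE_PLACES)
```

An article without parseable coordinates has nothing to pass in here, 
which is the reason for skipping such articles rather than guessing 
from the title.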

Cheers

Marc

[1] http://www.geonames.org/maps/wikipedia.html
[2] http://www.geonames.org/wikipedia.html

Received on Tuesday, 5 December 2006 22:11:16 UTC