- From: Marc <marc@geonames.org>
- Date: Tue, 05 Dec 2006 23:11:01 +0100
- To: Richard Cyganiak <richard@cyganiak.de>
- CC: Chris Bizer <chris@bizer.de>, Semantic Web <semantic-web@w3.org>
Richard

>> Around 100,000 geonames place names now have wikipedia links.
> Very cool. I wonder how you link the articles? Can't be simple word
> matching, no?

Simple word matching would lead to an incredible mess. There are, for example, 53 places with the name London and 58 places with the name Paris in the geonames database.

Place name disambiguation is a rather hard problem. To match geonames places with wikipedia articles, we use the semantic information in the wikipedia dump together with the article title. The semantic information is primarily latitude and longitude, but also country, administrative division, feature type, population and categories [1]. We only consider articles where we are able to parse semantic information [2]. Unfortunately there is a proliferation of templates, and a lot of wikipedia users have fun inventing new ones instead of reusing existing ones.

Cheers
Marc

[1] http://www.geonames.org/maps/wikipedia.html
[2] http://www.geonames.org/wikipedia.html
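
To make the idea concrete, here is a minimal sketch of title-plus-coordinates matching. It is not the geonames implementation: the `Place` record, the `best_match` function, the 10 km threshold and the sample coordinates are all assumptions made for illustration, and the other signals mentioned above (administrative division, feature type, population, categories) are left out.

```python
import math
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Place:
    """Hypothetical record, used for both gazetteer entries and parsed articles."""
    name: str
    lat: float
    lon: float
    country: str  # ISO country code; empty string if unknown


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two points."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def best_match(article: Place, candidates: Iterable[Place],
               max_km: float = 10.0) -> Optional[Place]:
    """Return the nearest same-named candidate within max_km, or None.

    The threshold is an assumed value, not one taken from geonames.
    """
    best, best_d = None, max_km
    for c in candidates:
        if c.name != article.name:
            continue  # the title must agree first
        if article.country and c.country and c.country != article.country:
            continue  # a country mismatch rules the candidate out
        d = haversine_km(article.lat, article.lon, c.lat, c.lon)
        if d <= best_d:
            best, best_d = c, d
    return best


# Two of the many places named London (approximate coordinates).
gazetteer = [
    Place("London", 51.5074, -0.1278, "GB"),
    Place("London", 42.9849, -81.2453, "CA"),
]
article = Place("London", 51.5072, -0.1276, "GB")
match = best_match(article, gazetteer)
```

An article whose coordinates are far from every same-named place yields `None`, which mirrors the choice to skip articles that cannot be matched reliably rather than guess from the name alone.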
Received on Tuesday, 5 December 2006 22:11:16 UTC