Re: Guessing geographical locations for Wikipedia article subjects

* Karl Dubost wrote:
>On 18 June 2010, at 16:58, Bjoern Hoehrmann wrote:
>>  Many Wikipedia articles have associated geographical coordinates. That
>> is however usually limited to stationary objects, even though many more
>> things have a very strong connection to a particular region, a regional
>> tradition for instance. In order to associate some location with more of
>> the articles, some method to derive them is needed.
>
>In your work you are trying to get real geolocation of articles (if I 
>understood correctly). I was wondering now about two possible interesting 
>things:

I am mainly interested in placing articles into regions where Wikipedia
does not do that already. For instance, I would want to put the article
on the Danish language somewhere in Denmark.

>* Plotting a dot for each geo-localized article on a world map.
>  That would give a kind of coverage of where are the things we talk 
>  about on Wikipedia. This could also create cartogram map where the 
>  density of article shapes the map rendering.

I've done that already. These two maps show the article density around
the northern German city of Schleswig, using Google Maps and a generic
OpenLayers layer respectively:

  * http://www.websitedev.de/temp/dewp-artikeldichte-um-schleswig.html
  * http://www.websitedev.de/temp/openlayers-heatmap-layer.html

Doing this for all articles would push the file size to around 5MB, and
browsers aren't really up to the task of drawing 200 000 rectangles.
I also haven't found a proper public-domain, OpenLayers-compatible map
service that would at least let me make a static image of it.
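
For illustration, the density rectangles described above could be
computed roughly as follows. This is only a minimal sketch, not the code
behind the linked maps: the function names, the cell size and the
(lat, lon) tuple input are all assumptions made for the example.

```python
import math
from collections import Counter


def density_grid(coords, cell_deg=0.05):
    """Count points per grid cell.

    coords is an iterable of (lat, lon) pairs in degrees; cell_deg is a
    hypothetical cell size in degrees. Returns {(row, col): count}.
    """
    counts = Counter()
    for lat, lon in coords:
        row = math.floor(lat / cell_deg)
        col = math.floor(lon / cell_deg)
        counts[(row, col)] += 1
    return counts


def cell_bounds(cell, cell_deg=0.05):
    """Return (south, west, north, east) of a cell, e.g. to draw a rectangle."""
    row, col = cell
    south = row * cell_deg
    west = col * cell_deg
    return (south, west, south + cell_deg, west + cell_deg)
```

Only non-empty cells are stored, so the number of rectangles to draw
depends on the chosen cell size rather than on the number of articles.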

>* Distance between articles. 
>  What are articles situated at 10km around from this article X? The 
>  knowledge is then expressed in terms of distances.

Well, that's the same thing really, except with added markers. I could
not find a proper Layer that would generate a usable result with the
maps above, so I am not doing that at the moment.
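
Independent of any map layer, the "articles within 10km of article X"
question can be sketched with a great-circle distance. The following is
a minimal example using the haversine formula on a spherical Earth; the
function names and the dictionary of titles to (lat, lon) pairs are
assumptions made for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def articles_within(center, articles, radius_km=10.0):
    """Return titles within radius_km of center, nearest first.

    center is a (lat, lon) pair; articles maps a title to a (lat, lon) pair.
    """
    lat0, lon0 = center
    hits = []
    for title, (lat, lon) in articles.items():
        d = haversine_km(lat0, lon0, lat, lon)
        if d <= radius_km:
            hits.append((d, title))
    return [title for d, title in sorted(hits)]
```

This is a linear scan, which is fine for a single query; for many
queries over all articles a spatial index would be the usual next step.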

A while ago I made an interactive application that lets you browse the
category system of the German Wikipedia. It shows, in a tree map, how
many articles are in a category (transitively) and how many page views
they get; you can size and color the categories using various data. It's
in German and available at

  * http://katograph.appspot.com/

Note that a fairly recent version of Adobe Flash is required. And as I am
collecting links,

  * http://search.cpan.org/dist/Geo-MedianCenter-XS/

has the code I used to make the median and distance calculations.
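
For readers unfamiliar with the idea of a median center, here is a rough
sketch of one way to approximate the point that minimizes the summed
distance to a set of coordinates. It uses a Weiszfeld-style iteration on
unit vectors with chord distances, which is an assumption chosen for
brevity and not necessarily what the module above does; all names are
hypothetical, and the input is assumed to be regionally clustered (no
antipodal points).

```python
import math


def to_unit(lat, lon):
    """Convert degrees latitude/longitude to a 3D unit vector."""
    phi, lmb = math.radians(lat), math.radians(lon)
    return (math.cos(phi) * math.cos(lmb), math.cos(phi) * math.sin(lmb), math.sin(phi))


def to_latlon(v):
    """Convert a 3D vector back to (lat, lon) in degrees."""
    x, y, z = v
    return (math.degrees(math.atan2(z, math.hypot(x, y))), math.degrees(math.atan2(y, x)))


def normalize(v):
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)


def median_center(coords, iterations=200, eps=1e-12):
    """Approximate the median center of (lat, lon) pairs.

    Each step re-weights the points by the inverse of their distance to
    the current estimate and projects the result back onto the sphere.
    """
    points = [to_unit(lat, lon) for lat, lon in coords]
    # Start from the normalized mean of the unit vectors.
    est = normalize(tuple(sum(p[i] for p in points) for i in range(3)))
    for _ in range(iterations):
        acc = [0.0, 0.0, 0.0]
        for p in points:
            w = 1.0 / max(math.dist(est, p), eps)  # clamp to avoid division by zero
            for i in range(3):
                acc[i] += w * p[i]
        est = normalize(tuple(acc))
    return to_latlon(est)
```

Unlike a plain average, the result is not pulled far by a single distant
outlier, which is the property that makes a median attractive when
guessing a location from several loosely related coordinates.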
-- 
Björn Höhrmann · mailto:bjoern@hoehrmann.de · http://bjoern.hoehrmann.de
Am Badedeich 7 · Telefon: +49(0)160/4415681 · http://www.bjoernsworld.de
25899 Dagebüll · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/ 

Received on Sunday, 20 June 2010 02:28:31 UTC