Re: [foaf-dev] FOAF, geonames, and more

Hi Jo

As Bernard already pointed out, geonames uses dozens or even hundreds 
of data sources. The goal and mission of geonames is to use the 
official and most accurate data set wherever possible, to aggregate 
the data sets, and to interlink them. Most French communes, for 
instance, look like this, with links to Wikipedia in several languages 
and a link to INSEE (the French National Institute for Statistics 
and Economic Studies):

        <wikipediaArticle rdf:resource="http://fr.wikipedia.org/wiki/Embrun_%28Hautes-Alpes%29"/>
        <wikipediaArticle rdf:resource="http://pl.wikipedia.org/wiki/Embrun"/>
        <wikipediaArticle rdf:resource="http://de.wikipedia.org/wiki/Embrun"/>
        <wikipediaArticle rdf:resource="http://en.wikipedia.org/wiki/Embrun%2C_Hautes-Alpes"/>
        <wikipediaArticle rdf:resource="http://it.wikipedia.org/wiki/Embrun"/>
        <wikipediaArticle rdf:resource="http://nl.wikipedia.org/wiki/Embrun"/>
        <owl:sameAs rdf:resource="http://rdf.insee.fr/geo/COM_05046"/>

For the US the data is from the U.S. Geological Survey, for Canada it is 
from the Canadian government (geobase.ca), for New Zealand from LINZ 
(Land Information New Zealand), for Brazil a lot is from the Instituto 
Brasileiro de Geografia e Estatística (IBGE), and so on. The more data 
sets governments release, the better geonames will become.
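
As a rough sketch of how a consumer could read the interlinks from the
Embrun fragment above, the following uses only Python's standard
library. The enclosing rdf:RDF and Feature elements, the example.org
feature URI, and the geonames ontology namespace URI are assumptions
made for illustration, since the fragment shows neither its envelope
nor its namespace declarations; extract_links is a hypothetical helper
name.

```python
import xml.etree.ElementTree as ET

# The rdf and owl namespace URIs are the standard ones. The geonames
# ontology URI is an assumption: the fragment does not declare it.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "owl": "http://www.w3.org/2002/07/owl#",
    "gn": "http://www.geonames.org/ontology#",
}

# The fragment from the message, wrapped in a hypothetical envelope so
# that it parses as a standalone XML document.
DOC = """<rdf:RDF xmlns="http://www.geonames.org/ontology#"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:owl="http://www.w3.org/2002/07/owl#">
  <Feature rdf:about="http://example.org/feature/embrun">
    <wikipediaArticle rdf:resource="http://fr.wikipedia.org/wiki/Embrun_%28Hautes-Alpes%29"/>
    <wikipediaArticle rdf:resource="http://pl.wikipedia.org/wiki/Embrun"/>
    <wikipediaArticle rdf:resource="http://de.wikipedia.org/wiki/Embrun"/>
    <wikipediaArticle rdf:resource="http://en.wikipedia.org/wiki/Embrun%2C_Hautes-Alpes"/>
    <wikipediaArticle rdf:resource="http://it.wikipedia.org/wiki/Embrun"/>
    <wikipediaArticle rdf:resource="http://nl.wikipedia.org/wiki/Embrun"/>
    <owl:sameAs rdf:resource="http://rdf.insee.fr/geo/COM_05046"/>
  </Feature>
</rdf:RDF>"""


def extract_links(rdf_xml):
    """Return (wikipedia_article_urls, same_as_uris) found in rdf_xml."""
    root = ET.fromstring(rdf_xml)
    # Namespaced attributes are keyed as "{namespace-uri}localname".
    rdf_resource = "{%s}resource" % NS["rdf"]
    articles = [
        el.get(rdf_resource)
        for el in root.iterfind(".//gn:wikipediaArticle", NS)
    ]
    same_as = [
        el.get(rdf_resource)
        for el in root.iterfind(".//owl:sameAs", NS)
    ]
    return articles, same_as


if __name__ == "__main__":
    articles, same_as = extract_links(DOC)
    for url in articles:
        print("wikipedia:", url)
    for uri in same_as:
        print("sameAs:", uri)
```

A real RDF library would be the more robust choice for anything beyond
a quick look, since RDF/XML allows several equivalent serialisations
that a plain XML parser does not normalise.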

Cheers

Marc

Bernard Vatant wrote:
> Jo Walsh wrote:
>> This sounds like an interesting and fun project. I would be wary of
>> over-relying on geonames.org - the US mil/gov/int data from which it is
>> derived can be very inaccurate in places and sometimes has names that
>> are locally meaningless. Though geonames are refining and fixing it
>> slowly...
> Marc is certainly the best one to answer that and will correct me if I 
> am wrong, but I want to stress a couple of things:
>
> 1. geonames.org aims at federating public data, of which "quality and 
> accuracy may vary". But "US mil/gov/int data", though the main original 
> source, is not the only one. Hopefully more accurate data will be 
> aggregated in the future. We have started for France, by matching 
> geonames features to INSEE RDF data.
> 2. geonames.org is a wiki-based collaborative project. If you think 
> some data is broken, just fix it. :-) Granted, there is a lot to do: 
> there are currently about 6,300,000 records in the database, which 
> seems a lot, but actually that's only 1 record per 1000 Earth 
> inhabitants ;-) . Which means that if stakeholders (that is, any user 
> of geo data) begin to come into the loop, a distributed task force can 
> achieve a lot. It's just a question of bootstrapping. Improvement will 
> come both from incorporating more accurate public data and from more 
> fine-grained tuning of that data through collaborative work, similar 
> to Wikipedia's process.
> 3. Currently, I'm more concerned about the taxonomy of features 
> (the so-called feature codes), which does indeed come from a single 
> source (also US mil/gov/int), and I am thinking about transitioning to 
> more open and flexible concept scheme(s). For example, I'm currently 
> working on matching geonames feature codes to GEMET concepts wherever 
> available.
>
> Stay tuned. That's just the beginning of it.

Received on Tuesday, 30 January 2007 17:44:10 UTC