- From: Marc <marc@geonames.org>
- Date: Tue, 30 Jan 2007 18:40:06 +0100
- To: Jo Walsh <jo@frot.org>
- CC: Bernard Vatant <bernard.vatant@mondeca.com>, Alexandre Passant <alex@passant.org>, public-xg-geo@w3.org
Hi Jo

As Bernard already pointed out, geonames is using dozens or even hundreds
of data sources. The goal and mission of geonames is to use the official
and most accurate data set wherever possible, to aggregate the data sets,
and to interlink them. Most French communes, for instance, look like this,
with links to Wikipedia in several languages and a link to INSEE (the
French National Institute for Statistics and Economic Studies):

  <wikipediaArticle rdf:resource="http://fr.wikipedia.org/wiki/Embrun_%28Hautes-Alpes%29"/>
  <wikipediaArticle rdf:resource="http://pl.wikipedia.org/wiki/Embrun"/>
  <wikipediaArticle rdf:resource="http://de.wikipedia.org/wiki/Embrun"/>
  <wikipediaArticle rdf:resource="http://en.wikipedia.org/wiki/Embrun%2C_Hautes-Alpes"/>
  <wikipediaArticle rdf:resource="http://it.wikipedia.org/wiki/Embrun"/>
  <wikipediaArticle rdf:resource="http://nl.wikipedia.org/wiki/Embrun"/>
  <owl:sameAs rdf:resource="http://rdf.insee.fr/geo/COM_05046"/>

For the US the data is from the U.S. Geological Survey, for Canada it is
from the Canadian government (geobase.ca), for New Zealand from LINZ (Land
Information New Zealand), for Brazil a lot is from the Instituto Brasileiro
de Geografia e Estatística (IBGE), and so on. The more governments release
data sets, the better geonames will become.

Cheers
Marc

Bernard Vatant wrote:
> Jo Walsh wrote:
>> This sounds like an interesting and fun project. I would be wary of
>> over-relying on geonames.org - the US mil/gov/int data from which it is
>> derived can be very inaccurate in places and have names which are
>> locally meaningless sometimes. Though geonames are refining and fixing
>> it slowly...
> Marc is certainly the best one to answer that and will correct me if I
> am wrong, but I want to stress a couple of things:
>
> 1. geonames.org aims at federating public data, of which "quality and
>    accuracy may vary". But the "US mil/gov/int data", though the main
>    original source, is not the only one. In the future hopefully more
>    accurate data will be aggregated. We have started for France, by
>    matching geonames features to INSEE RDF data.
> 2. geonames.org is a wiki-based collaborative project. If you think
>    some data is broken, just fix it. :-) Granted, there is a lot to do.
>    There are currently about 6,300,000 records in the database, which
>    seems a lot, but actually that's only 1 record per 1000 Earth
>    inhabitants ;-) . Which means that if stakeholders (that means any
>    user of geo data) begin to come into the loop, a distributed task
>    force can do a lot. It's just a question of bootstrapping.
>    Improvement will come from both incorporating more accurate public
>    data and more fine-grained tuning of those data through the
>    collaborative work. Similar to Wikipedia's process.
> 3. Currently, I'm more concerned about the taxonomy of features
>    (so-called feature codes), which is indeed a single-source one (US
>    mil/gov/int also), and I'm thinking about transitioning to more open
>    and flexible concept scheme(s). For example, I'm currently working on
>    a matching of geonames feature codes to GEMET concepts wherever
>    available.
>
> Stay tuned. That's just the beginning of it.
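
As an illustration of the interlinking Marc describes for Embrun, here is a minimal sketch of how a client might pull the Wikipedia links and the INSEE `owl:sameAs` target out of such a record. It uses only Python's standard library. The message shows just the property elements, so the wrapping `rdf:RDF` and `Feature` elements, the namespace declarations, and the `http://example.org/...` subject URI are assumptions added to make the fragment parseable; `extract_links` is a hypothetical helper, not part of any geonames API.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
OWL_NS = "http://www.w3.org/2002/07/owl#"
# Assumption: the unprefixed elements live in the geonames ontology namespace.
GN_NS = "http://www.geonames.org/ontology#"

# The property elements are copied from the message; the wrapper elements and
# the example.org subject URI are hypothetical scaffolding.
DOC = f"""<rdf:RDF xmlns="{GN_NS}" xmlns:rdf="{RDF_NS}" xmlns:owl="{OWL_NS}">
  <Feature rdf:about="http://example.org/feature/embrun">
    <wikipediaArticle rdf:resource="http://fr.wikipedia.org/wiki/Embrun_%28Hautes-Alpes%29"/>
    <wikipediaArticle rdf:resource="http://pl.wikipedia.org/wiki/Embrun"/>
    <wikipediaArticle rdf:resource="http://de.wikipedia.org/wiki/Embrun"/>
    <wikipediaArticle rdf:resource="http://en.wikipedia.org/wiki/Embrun%2C_Hautes-Alpes"/>
    <wikipediaArticle rdf:resource="http://it.wikipedia.org/wiki/Embrun"/>
    <wikipediaArticle rdf:resource="http://nl.wikipedia.org/wiki/Embrun"/>
    <owl:sameAs rdf:resource="http://rdf.insee.fr/geo/COM_05046"/>
  </Feature>
</rdf:RDF>"""


def extract_links(xml_text):
    """Return (wikipedia_urls, same_as_urls) found in an RDF/XML string."""
    root = ET.fromstring(xml_text)
    resource = f"{{{RDF_NS}}}resource"  # ElementTree's {namespace}name form
    wikipedia = [e.get(resource) for e in root.iter(f"{{{GN_NS}}}wikipediaArticle")]
    same_as = [e.get(resource) for e in root.iter(f"{{{OWL_NS}}}sameAs")]
    return wikipedia, same_as


if __name__ == "__main__":
    wikipedia, same_as = extract_links(DOC)
    for url in wikipedia:
        print("wikipedia:", url)
    for url in same_as:
        print("sameAs:   ", url)
```

A full RDF library would be the more robust choice for real data; plain XML parsing is used here only to keep the sketch dependency-free.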
Received on Tuesday, 30 January 2007 17:44:10 UTC