
Re: Semantic black holes at sameas.org Re: [GeoNames] LOD mappings

From: Hugh Glaser <hg@ecs.soton.ac.uk>
Date: Tue, 27 Apr 2010 02:56:06 +0000
To: Bernard Vatant <bernard.vatant@mondeca.com>, geonames <geonames@googlegroups.com>
CC: Linking Open Data <public-lod@w3.org>
Message-ID: <EMEW3|83f8d8591134973ce90c68a83bf99e02m3Q3uc02hg|ecs.soton.ac.uk|C7FC10D6.1212B%hg@ecs.soton.ac.uk>
Thanks Bernard.
Yes, I think the problems you raise are valid.
Just a short response.
In some sense I consider sameas.org to be a discovery service.
This is in contrast to a service that might be called something more definitive.
So I have taken quite a liberal view of what I will accept on the site.
We have other services that are much more conservative in their view; in particular the ones we use for RKBExplorer.
So what we are trying to do is capture a spectrum of views of what constitutes equivalence, which will always be a moveable feast.
Best
Hugh

On 23/04/2010 16:14, "Bernard Vatant" <bernard.vatant@mondeca.com> wrote:

Alexander :

It would be useful to have a list of currently available mappings to GeoNames. It would be useful not only for people like me who create custom RDF datasets but also for people who want to contribute additional mappings.

Seems a good idea

Daniel :

Re-publish your data with rdfs:seeAlso http://sameas.org/rdf?uri=http%3A%2F%2Fsws.geonames.org%2F2078025%2F perhaps?

This seems like a good idea. Considering that geonames.org cannot dedicate (m)any resources to LOD mappings, those can be deferred to external services such as sameas.org. The sameas.org URI is easy to generate automatically from the geonames id.
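To make "easy to generate" concrete, here is a minimal sketch in Python. It assumes only the URL pattern visible in Daniel's example above (http://sameas.org/rdf?uri= followed by the percent-encoded sws.geonames.org URI); the function name sameas_rdf_url is hypothetical, and nothing here is taken from sameas.org documentation.

```python
from urllib.parse import quote

# Assumed pattern, inferred from the example URL in this thread.
SAMEAS_RDF = "http://sameas.org/rdf?uri="
GEONAMES_SWS = "http://sws.geonames.org/{id}/"


def sameas_rdf_url(geonames_id: int) -> str:
    """Build the sameas.org lookup URL for a geonames id (hypothetical helper)."""
    feature_uri = GEONAMES_SWS.format(id=geonames_id)
    # safe="" forces ':' and '/' to be percent-encoded as well.
    return SAMEAS_RDF + quote(feature_uri, safe="")


print(sameas_rdf_url(2078025))
```

A publisher could emit this URL as the object of rdfs:seeAlso for each geonames feature, as Daniel suggests.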

So far so good. But let's look at it closely. Someone has to feed the recursive, iterative social process happening at sameas.org, but there is no provenance tracking, and the clustering of URIs will make the concepts fuzzier and fuzzier over time, turning sameas.org into a tool for creating semantic black holes.

It would definitely be better to have a clear declaration, from the Geonames viewpoint, of which of its three URIs for Berlin,
http://sws.geonames.org/2950159/, http://sws.geonames.org/6547383/ or http://sws.geonames.org/6547539/, should map to http://dbpedia.org/resource/Berlin. So far, none of them does.

From the DBpedia side, the owl:sameAs declarations at the latter URI are as follows (today):

  *   opencyc:en/Berlin_StateGermany <http://sw.opencyc.org/2008/06/10/concept/Mx4rv77EfZwpEbGdrcN5Y29ycA>
  *   fbase:Berlin <http://rdf.freebase.com/ns/guid.9202a8c04000641f80000000000094d6>
  *   http://umbel.org/umbel/ne/wikipedia/Berlin
  *   opencyc:en/CityOfBerlinGermany <http://sw.opencyc.org/2008/06/10/concept/Mx4rvVjrhpwpEbGdrcN5Y29ycA>
  *   http://www4.wiwiss.fu-berlin.de/eurostat/resource/regions/Berlin
  *   http://sws.geonames.org/2950159/
  *   http://data.nytimes.com/N50987186835223032381

So it seems DBpedia has decided to map its Berlin to the Geonames feature of type "capital of a political entity", a subtype of "populated place". Why not? OTOH it also declares two equivalents in opencyc, one being a state and the other a city. If opencyc buys the DBpedia declarations, the semantic collapse begins.

Let's go yet closer to the black hole horizon ...

http://sameas.org/html?uri=http%3A%2F%2Fdbpedia.org%2Fresource%2FBerlin

... yields 29 URIs including the previous ones ...

If geonames.org had taken the time to carefully map its administrative features to the respective "city" and "state" opencyc resources, the three different URIs carefully coined to distinguish Berlin as a populated place from the two administrative subdivisions bearing the same name would, by the grace of DBpedia fuzziness, be crushed into the same sameas.org semantic black hole.
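The collapse can be sketched as a transitive closure over owl:sameAs pairs. The Python below is an illustration only: it uses a plain union-find, which is an assumption about how any bundling service might cluster URIs, not a description of how sameas.org actually works. The "declared" pairs are three of the DBpedia links listed above; the "hypothetical" pairs are the careful geonames mappings imagined in the previous paragraph, and which administrative URI would go to "state" versus "city" is assumed arbitrarily.

```python
from collections import defaultdict

DBPEDIA_BERLIN = "http://dbpedia.org/resource/Berlin"
OPENCYC_STATE = "http://sw.opencyc.org/2008/06/10/concept/Mx4rv77EfZwpEbGdrcN5Y29ycA"
OPENCYC_CITY = "http://sw.opencyc.org/2008/06/10/concept/Mx4rvVjrhpwpEbGdrcN5Y29ycA"
GEO_PLACE = "http://sws.geonames.org/2950159/"
GEO_ADMIN_A = "http://sws.geonames.org/6547383/"
GEO_ADMIN_B = "http://sws.geonames.org/6547539/"


def same_as_clusters(pairs):
    """Group URIs into equivalence classes, treating each pair as owl:sameAs."""
    parent = {}

    def find(uri):
        parent.setdefault(uri, uri)
        while parent[uri] != uri:
            parent[uri] = parent[parent[uri]]  # path halving
            uri = parent[uri]
        return uri

    for a, b in pairs:
        parent[find(a)] = find(b)

    clusters = defaultdict(set)
    for uri in list(parent):
        clusters[find(uri)].add(uri)
    return list(clusters.values())


# Links DBpedia declares today (subset of the list above).
declared = [
    (DBPEDIA_BERLIN, OPENCYC_STATE),
    (DBPEDIA_BERLIN, OPENCYC_CITY),
    (DBPEDIA_BERLIN, GEO_PLACE),
]
# Hypothetical: mappings geonames does NOT publish; the pairing is assumed.
hypothetical = [
    (GEO_ADMIN_A, OPENCYC_STATE),
    (GEO_ADMIN_B, OPENCYC_CITY),
]

clusters = same_as_clusters(declared + hypothetical)
```

With both lists combined, all six URIs land in a single cluster, so the three geonames features can no longer be told apart; with the declared links alone, the two administrative URIs stay out of it.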

Bottom line. Given the current state of affairs for geographical entities in the linked data cloud, geonames agnosticism re. owl:sameAs is rather a good thing. There are certainly more subtle ways to link geo entities at various levels of granularity, and a lot of work remains to achieve semantic interoperability of geo entities defined everywhere. Things are moving forward, but there is a long way to go and it will need a lot of resources. Look e.g., at Yahoo! concordance http://developer.yahoo.com/geo/geoplanet/guide/api-reference.html#api-concordance, which BTW also links to geonames id.

In conclusion:

YES Marc Wick is right to currently focus on data and data quality first. A tremendous set of data is available for free, take what you can and what you wish and build on it. If you want premium services, pay for it. Fair enough.

YES it would be great to have geonames data/URIs more, and better, integrated into the linked data economy. More complete descriptions at sws.geonames URIs, SPARQL endpoint etc. Bearing in mind that Geonames.org has no dedicated resources for it, who will take care of that in a scalable way? What is the business model? Good questions. Volunteers, step forward :)

Bernard
Received on Tuesday, 27 April 2010 02:57:11 UTC
