
Semantic black holes at sameas.org Re: [GeoNames] LOD mappings

From: Bernard Vatant <bernard.vatant@mondeca.com>
Date: Fri, 23 Apr 2010 17:14:44 +0200
Message-ID: <o2h9d93ef961004230814ge8abdfb0yeee674ff809de95c@mail.gmail.com>
To: geonames <geonames@googlegroups.com>
Cc: Linking Open Data <public-lod@w3.org>
Alexander :

> It would be useful to have a list of currently available mappings to
> GeoNames. It would be useful not only for people like me who create custom
> RDF datasets but also for people who want to contribute additional mappings.

Seems a good idea.

Daniel :


> Re-publish your data with rdfs:seeAlso
> http://sameas.org/rdf?uri=http%3A%2F%2Fsws.geonames.org%2F2078025%2F perhaps?


This seems like a good idea. Considering that geonames.org cannot dedicate
(m)any resources to LOD mappings, those can be deferred to external services
such as sameas.org. The sameas.org URI is easy to generate automatically
from the geonames id.
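
To illustrate how mechanical that generation is, here is a minimal Python
sketch. It assumes the sameas.org lookup pattern seen in Daniel's suggestion
above (the /rdf?uri= form followed by the percent-encoded sws.geonames.org
URI). The function names are hypothetical and only serve the illustration.

```python
from urllib.parse import quote

# Both prefixes are taken from the URLs quoted in this thread.
SAMEAS_RDF = "http://sameas.org/rdf?uri="
GEONAMES_SWS = "http://sws.geonames.org/"
RDFS_SEE_ALSO = "http://www.w3.org/2000/01/rdf-schema#seeAlso"


def geonames_uri(geonames_id: int) -> str:
    """Build the sws.geonames.org URI for a geonames id."""
    return f"{GEONAMES_SWS}{geonames_id}/"


def sameas_lookup_uri(geonames_id: int) -> str:
    """Percent-encode the geonames URI and append it to the lookup prefix."""
    # safe="" makes quote() also encode ':' and '/'.
    return SAMEAS_RDF + quote(geonames_uri(geonames_id), safe="")


def see_also_triple(geonames_id: int) -> str:
    """Render the rdfs:seeAlso statement as one N-Triples line."""
    subject = geonames_uri(geonames_id)
    target = sameas_lookup_uri(geonames_id)
    return f"<{subject}> <{RDFS_SEE_ALSO}> <{target}> ."


print(see_also_triple(2078025))
```

Nothing here queries sameas.org. It only builds the link that a publisher
would add to each geonames description.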

So far so good. But let's look at it closely. Someone has to feed the
recursive, iterative social process happening at sameas.org, yet there is
no provenance tracking, and over time the clustering of URIs will make the
concepts fuzzier and fuzzier, turning sameas.org into a tool for creating
semantic black holes.

It would definitely be better to have a clear declaration, from the Geonames
viewpoint, of which of its three URIs for Berlin
http://sws.geonames.org/2950159/, http://sws.geonames.org/6547383/ or
http://sws.geonames.org/6547539/ should map to
http://dbpedia.org/resource/Berlin. So far, none of them does.

From the DBpedia side, the owl:sameAs declarations at the latter URI are as
follows (today):

   - opencyc:en/Berlin_StateGermany <http://sw.opencyc.org/2008/06/10/concept/Mx4rv77EfZwpEbGdrcN5Y29ycA>
   - fbase:Berlin <http://rdf.freebase.com/ns/guid.9202a8c04000641f80000000000094d6>
   - http://umbel.org/umbel/ne/wikipedia/Berlin
   - opencyc:en/CityOfBerlinGermany <http://sw.opencyc.org/2008/06/10/concept/Mx4rvVjrhpwpEbGdrcN5Y29ycA>
   - http://www4.wiwiss.fu-berlin.de/eurostat/resource/regions/Berlin
   - http://sws.geonames.org/2950159/
   - http://data.nytimes.com/N50987186835223032381

So it seems DBpedia has decided to map its Berlin to the Geonames feature of
type "capital of a political entity", a subtype of "populated place". Why not?
OTOH it also declares two equivalents in opencyc, one being a state and the
other a city. If opencyc buys the DBpedia declarations, the semantic
collapse begins: owl:sameAs being symmetric and transitive, the state and
the city become one and the same thing.

Let's go yet closer to the black hole horizon ...

http://sameas.org/html?uri=http%3A%2F%2Fdbpedia.org%2Fresource%2FBerlin

... yields 29 URIs including the previous ones ...

If geonames.org had taken the time to carefully map its administrative
features to the respective "city" and "state" opencyc resources, the three
different URIs carefully coined to distinguish Berlin as a populated place
from the two administrative subdivisions bearing the same name would, by
the grace of DBpedia fuzziness, be crushed into the same sameas.org
semantic black hole.
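
The collapse can be sketched with a small union-find over owl:sameAs links,
treating sameAs as symmetric and transitive the way a co-reference service
would. The three "declared" links are the DBpedia statements listed above.
The two "hypothetical" links are an assumption made purely for illustration:
geonames publishes no such mappings, and which administrative URI would go
to the state and which to the city is not known here (it does not change
the outcome).

```python
from collections import defaultdict


def sameas_clusters(links):
    """Group URIs into equivalence classes under symmetric, transitive sameAs."""
    parent = {}

    def find(uri):
        parent.setdefault(uri, uri)
        while parent[uri] != uri:
            parent[uri] = parent[parent[uri]]  # path halving
            uri = parent[uri]
        return uri

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b  # merge the two clusters

    clusters = defaultdict(set)
    for uri in list(parent):
        clusters[find(uri)].add(uri)
    return list(clusters.values())


DBPEDIA = "http://dbpedia.org/resource/Berlin"
CYC_STATE = "http://sw.opencyc.org/2008/06/10/concept/Mx4rv77EfZwpEbGdrcN5Y29ycA"
CYC_CITY = "http://sw.opencyc.org/2008/06/10/concept/Mx4rvVjrhpwpEbGdrcN5Y29ycA"
GN_PLACE = "http://sws.geonames.org/2950159/"
GN_ADM_A = "http://sws.geonames.org/6547383/"
GN_ADM_B = "http://sws.geonames.org/6547539/"

# owl:sameAs statements found at the DBpedia URI (see the list above).
declared = [(DBPEDIA, CYC_STATE), (DBPEDIA, CYC_CITY), (DBPEDIA, GN_PLACE)]

# HYPOTHETICAL careful mappings by geonames; the state/city assignment is assumed.
hypothetical = [(GN_ADM_A, CYC_STATE), (GN_ADM_B, CYC_CITY)]

careful_only = sameas_clusters(hypothetical)
merged = sameas_clusters(declared + hypothetical)

print(len(careful_only), "clusters with the careful mappings alone")
print(len(merged), "cluster(s) once the DBpedia links are added")
```

With the careful mappings alone the state and the city stay in separate
clusters. Adding the DBpedia links bridges them, and all six URIs end up in
a single cluster.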

Bottom line. Given the current state of affairs for geographical entities in
the linked data cloud, geonames agnosticism re. owl:sameAs is rather a good
thing. There are certainly more subtle ways to link geo entities at various
levels of granularity, and a lot of work remains to achieve semantic
interoperability of geo entities defined everywhere. Things are moving
forward, but it will be a long way and will need a lot of resources. Look
e.g. at the Yahoo! concordance
http://developer.yahoo.com/geo/geoplanet/guide/api-reference.html#api-concordance,
which BTW also links to geonames ids.

In conclusion:

YES Marc Wick is right to currently focus on data and data quality first. A
tremendous set of data is available for free, take what you can and what you
wish and build on it. If you want premium services, pay for it. Fair enough.


YES it would be great to have geonames data/URIs more and better
integrated into the linked data economy: more complete descriptions at
sws.geonames URIs, a SPARQL endpoint, etc. Bearing in mind that Geonames.org
has no dedicated resources for it, who will take care of that in a scalable
way? What is the business model? Good questions. Volunteers, step forward :)

Bernard

-- 
Bernard Vatant
Senior Consultant
Vocabulary & Data Engineering
Tel:       +33 (0) 971 488 459
Mail:     bernard.vatant@mondeca.com
----------------------------------------------------
Mondeca
3, cité Nollez 75018 Paris France
Web:    http://www.mondeca.com
Blog:    http://mondeca.wordpress.com
----------------------------------------------------
Received on Friday, 23 April 2010 15:15:18 UTC
