Re: Semantic black holes at sameas.org Re: [GeoNames] LOD mappings

From: Adrian Walker <adriandwalker@gmail.com>
Date: Wed, 12 May 2010 16:40:26 -0400
To: geonames <geonames@googlegroups.com>, Linking Open Data <public-lod@w3.org>
Hi Bernard, Hugh and All,

The example in which there are three distinct things called Berlin raises an
interesting question.

The question is, why would it be a good idea to try to make sense of this at
the data + ontology level?

I ask because, implicit in the attempt lurks the notion that there is a way
of doing this that will be 'correct' for all future applications over the
data + ontology.

That seems just plain wrong, even for this simple example.

The inconvenient truth is that applications add semantics to the data +
ontology layers.  Shifting representations to recognize this could be hugely


Just my two cents.

                        -- Adrian

Internet Business Logic
A Wiki and SOA Endpoint for Executable Open Vocabulary English over SQL and
Online at www.reengineeringllc.com
Shared use is free, and there are no advertisements

Adrian Walker

On Tue, Apr 27, 2010 at 11:38 AM, Bernard Vatant
<bernard.vatant@mondeca.com> wrote:

> Hi Hugh
> 2010/4/27 Hugh Glaser <hg@ecs.soton.ac.uk>
>> Thanks Bernard.
>> Yes, I think the problems you raise are valid.
>> Just a short response.
>> In some sense I consider sameas.org to be a discovery service.
> Indeed, so do I. The known issue is the overloading of owl:sameAs, but you
> have an excellent presentation by Pat Hayes and Harry Halpin coming up
> today ... (you are at ldow2010 I guess)
>> This is in contrast to a service that might be called something more
>> definitive.
>> So I have taken quite a liberal view of what I will accept on the site.
>> We have other services that are much more conservative in their view; in
>> particular the ones we use for RKBExplorer.
>> So what we are trying to do is capture a spectrum of views of what
>> constitutes equivalence, which will always be a moveable feast.
> Agreed with all that. Maybe you could introduce a sameas ontology for
> different flavours of equivalence, containing a single property
> sameas:sameas  of which owl:sameAs; owl:equivalent*, skos:*Match ... would
> be subproperties. In that case the "liberal" clustering would use
> sameas:sameas and the more conservative ones whatever fits.
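
A minimal sketch of the "flavours of equivalence" idea, assuming links are held as plain (subject, predicate, object) tuples. The property sets and the sample triples are illustrative assumptions, not an actual sameas ontology or real data:

```python
# Illustrative stand-ins for the sub-properties named above; the exact
# membership of a sameas ontology is an assumption of this sketch.
LIBERAL = {
    "owl:sameAs",
    "owl:equivalentClass",
    "owl:equivalentProperty",
    "skos:exactMatch",
    "skos:closeMatch",
}
CONSERVATIVE = {"owl:sameAs"}


def select_links(triples, accepted):
    """Keep only the (subject, object) pairs whose predicate is accepted."""
    return [(s, o) for s, p, o in triples if p in accepted]


# Made-up triples, for illustration only.
triples = [
    ("ex:a", "owl:sameAs", "ex:b"),
    ("ex:a", "skos:closeMatch", "ex:c"),
    ("ex:b", "rdfs:seeAlso", "ex:d"),
]

print(select_links(triples, LIBERAL))
print(select_links(triples, CONSERVATIVE))
```

The liberal view keeps both equivalence-like links, while the conservative view keeps only the owl:sameAs one; rdfs:seeAlso is dropped by both.
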
> BTW I am currently working with Gerard de Melo at
> http://lexvo.org on a semiotic approach to this issue, connecting
> vocabulary resources (concepts, classes, whatever) through the terms they
> use. You might bring that up on the ldow forum.
> Have fun
> Bernard
>> Best
>> Hugh
>> On 23/04/2010 16:14, "Bernard Vatant" <bernard.vatant@mondeca.com> wrote:
>> Alexander :
>> It would be useful to have a list of currently available mappings to
>> GeoNames. It would be useful not only for people like me who create custom
>> RDF datasets but also for people who want to contribute additional mappings.
>> Seems a good idea
>> Daniel :
>> Re-publish your data with rdfs:seeAlso
>> http://sameas.org/rdf?uri=http%3A%2F%2Fsws.geonames.org%2F2078025%2F perhaps?
>> This seems like a good idea. Considering that geonames.org cannot dedicate
>> (m)any resources to LOD mappings, those can be deferred to external
>> services such as sameas.org. The sameas.org URI is easy to generate
>> automatically from the geonames id.
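
A small sketch of that URI generation, assuming the URL pattern shown in Daniel's example above. The function name `sameas_rdf_url` is hypothetical:

```python
from urllib.parse import quote


def sameas_rdf_url(geonames_id: int) -> str:
    """Build a sameas.org RDF lookup URL for a GeoNames feature id."""
    geonames_uri = f"http://sws.geonames.org/{geonames_id}/"
    # safe="" makes quote() percent-encode ':' and '/' as well.
    return "http://sameas.org/rdf?uri=" + quote(geonames_uri, safe="")


print(sameas_rdf_url(2078025))
# http://sameas.org/rdf?uri=http%3A%2F%2Fsws.geonames.org%2F2078025%2F
```
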
>> So far so good. But let's look at it closely. Someone has to feed this
>> kind of recursive and iterative social process happening at sameas.org,
>> but there is no provenance track, and over time the clustering of URIs
>> will make the concepts more and more fuzzy, and turn sameas.org into a
>> tool for creating semantic black holes.
>> It would definitely be better to have some clear declaration, from the
>> Geonames viewpoint, of which of its three URIs for Berlin,
>> http://sws.geonames.org/2950159/, http://sws.geonames.org/6547383/ or
>> http://sws.geonames.org/6547539/, should map to
>> http://dbpedia.org/resource/Berlin. So far, none does.
>> From the DBpedia side, the owl:sameAs declarations at the latter URI are
>> as follows (today):
>>  *   opencyc:en/Berlin_StateGermany
>>      <http://sw.opencyc.org/2008/06/10/concept/Mx4rv77EfZwpEbGdrcN5Y29ycA>
>>  *   fbase:Berlin
>>      <http://rdf.freebase.com/ns/guid.9202a8c04000641f80000000000094d6>
>>  *   http://umbel.org/umbel/ne/wikipedia/Berlin
>>  *   opencyc:en/CityOfBerlinGermany
>>      <http://sw.opencyc.org/2008/06/10/concept/Mx4rvVjrhpwpEbGdrcN5Y29ycA>
>>  *   http://www4.wiwiss.fu-berlin.de/eurostat/resource/regions/Berlin
>>  *   http://sws.geonames.org/2950159/
>>  *   http://data.nytimes.com/N50987186835223032381
>> So it seems DBpedia has decided to map its Berlin to the Geonames feature
>> of type "capital of a political entity", subtype of "populated place". Why
>> not? OTOH it also declares two equivalents in opencyc, one being a state
>> and the other a city. If opencyc buys the DBpedia declarations, the
>> semantic collapse begins.
>> Let's go yet closer to the black hole horizon ...
>> http://sameas.org/html?uri=http%3A%2F%2Fdbpedia.org%2Fresource%2FBerlin
>> ... yields 29 URIs including the previous ones ...
>> If geonames.org had taken the time to carefully map its administrative
>> features to the respective "city" and "state" opencyc resources, the three
>> different URIs carefully coined to make distinct entities for Berlin as a
>> populated place and the two administrative subdivisions bearing the same
>> name would, by the grace of DBpedia fuzziness, be crushed in the same
>> sameas.org semantic black hole.
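
A rough sketch of that collapse, treating every link as symmetric and transitive and clustering with a simple union-find. The three DBpedia links are the ones listed above; the two GeoNames-to-opencyc links are hypothetical (no such mappings exist, and which administrative URI would pair with which opencyc resource is an assumption):

```python
from collections import defaultdict


def clusters(links):
    """Group identifiers into equivalence classes under symmetric,
    transitive closure of the given (a, b) links."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in links:
        parent[find(a)] = find(b)

    groups = defaultdict(set)
    for node in parent:
        groups[find(node)].add(node)
    return list(groups.values())


# Declared by DBpedia, per the list above.
dbpedia_links = [
    ("dbpedia:Berlin", "opencyc:Berlin_StateGermany"),
    ("dbpedia:Berlin", "opencyc:CityOfBerlinGermany"),
    ("dbpedia:Berlin", "geonames:2950159"),
]
# Hypothetical careful mappings; the pairing is assumed for illustration.
geonames_links = [
    ("geonames:6547383", "opencyc:Berlin_StateGermany"),
    ("geonames:6547539", "opencyc:CityOfBerlinGermany"),
]

print(len(clusters(geonames_links)))                  # 2
print(len(clusters(geonames_links + dbpedia_links)))  # 1
```

On their own, the careful mappings keep the state and the city apart; once the DBpedia links are added, all three GeoNames URIs fall into a single cluster.
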
>> Bottom line. Given the current state of affairs for geographical entities
>> in the linked data cloud, geonames agnosticism re. owl:sameAs is rather a
>> good thing. There are certainly more subtle ways to link geo entities at
>> various levels of granularity, and a lot of work is needed to achieve
>> semantic interoperability of geo entities defined everywhere. Things are
>> moving forward, but it will be a long way and will need a lot of
>> resources. Look e.g. at Yahoo! concordance
>> http://developer.yahoo.com/geo/geoplanet/guide/api-reference.html#api-concordance,
>> which BTW also links to geonames id.
>> In conclusion:
>> YES Marc Wick is right to currently focus on data and data quality first.
>> A tremendous set of data is available for free, take what you can and what
>> you wish and build on it. If you want premium services, pay for it. Fair
>> enough.
>> YES it would be great to have geonames data/URIs more and better
>> integrated into the linked data economy: more complete descriptions at
>> sws.geonames URIs, a SPARQL endpoint etc. Bearing in mind that Geonames.org
>> has no dedicated resources for it, who will take care of that in a scalable
>> way? What is the business model? Good questions. Volunteers, step forward :)
>> Bernard
--
Bernard Vatant
Senior Consultant
Vocabulary & Data Engineering
Tel:       +33 (0) 971 488 459
Mail:     bernard.vatant@mondeca.com
----------------------------------------------------
Mondeca
3, cité Nollez 75018 Paris France
Web:    http://www.mondeca.com
Blog:    http://mondeca.wordpress.com
----------------------------------------------------
Received on Wednesday, 12 May 2010 20:40:55 UTC

