
Re: Semantic black holes at sameas.org Re: [GeoNames] LOD mappings

From: Hugh Glaser <hg@ecs.soton.ac.uk>
Date: Wed, 12 May 2010 20:00:32 +0000
To: Bernard Vatant <bernard.vatant@mondeca.com>
CC: geonames <geonames@googlegroups.com>, Linking Open Data <public-lod@w3.org>
Message-ID: <EMEW3|08faed1600f6cb7ed459731e2f0763b6m4BL0v02hg|ecs.soton.ac.uk|C80F90D6.1413B%hg@ecs.soton.ac.uk>
Thanks Bernard,
Always good to be discussing things around a common basis.
Sorry for the slow reply; very intermittent connectivity.

On 27/04/2010 16:38, "Bernard Vatant" <bernard.vatant@mondeca.com> wrote:

> Hi Hugh
> 
> 2010/4/27 Hugh Glaser <hg@ecs.soton.ac.uk>
>> Thanks Bernard.
>> Yes, I think the problems you raise are valid.
>> Just a short response.
>> In some sense I consider sameas.org to be a discovery
>> service.
> 
> Indeed, so do I. The known issue is the overload of owl:sameAs, but you have
> an excellent presentation today of Pat Hayes and Harry Halpin just coming ...
> (you are at ldow2010 I guess)
I was, and enjoyed it.
>  
>> This is in contrast to a service that might be called something more
>> definitive.
>> So I have taken quite a liberal view of what I will accept on the site.
>> We have other services that are much more conservative in their view; in
>> particular the ones we use for RKBExplorer.
>> So what we are trying to do is capture a spectrum of views of what
>> constitutes equivalence, which will always be a moveable feast.
> 
> Agreed with all that. Maybe you could introduce a sameas ontology for
> different flavours of equivalence, containing a single property sameas:sameas
> of which owl:sameAs; owl:equivalent*, skos:*Match ... would be subproperties.
> In that case the "liberal" clustering would use sameas:sameas and the more
> conservative ones whatever fits.
It is interesting you should say that.
For many years we gathered equivalence information and republished against a
more complex ontology - that was the "right" way to do it.
Eventually I decided that the right way to do it was not the socio-technical
right way to do it, and added the owl:sameAs predicate and named the site
sameas.org.
(Although you can still ask sameas.org to give you a different predicate if
you want, I think, but can't check.)
I know using owl:sameAs is the "wrong" way to do it, but when we did it the
right way no-one showed any interest.
I think we needed to go through this phase of publishing as owl:sameAs, so
that people would discover that it is problematic.

In a more reflective state, I might observe the following:
We have a community who work in URI equivalence who are looking for an
ontology to capture the concepts. If I were to be called in as a consultant
to this community I might go through a whole Knowledge Acquisition
process, using a bunch of tools.
It is interesting that the URI community seems unable to capture the
knowledge into a useable ontology, when perhaps we expect everyone else to
be able to do the same thing for their areas of expertise.

What would a knowledge acquisition expert recommend for this ontology, which
we seem to think should be quite simple?

Best
Hugh

> 
> BTW currently working in connection with Gerard de Melo at http://lexvo.org
> re. semiotic approach to this issue, connecting vocabulary resources
> (concepts, classes, whatever) through the terms they use. You might bring that
> on ldow forum.
> 
> Have fun
Thanks: I am - on holiday!
> 
> Bernard
> 
>  
>> Best
>> Hugh
>> 
>> On 23/04/2010 16:14, "Bernard Vatant" <bernard.vatant@mondeca.com> wrote:
>> 
>> Alexander :
>> 
>> It would be useful to have a list of currently available mappings to
>> GeoNames. It would be useful not only for people like me who create custom
>> RDF datasets but also for people who want to contribute additional mappings.
>> 
>> Seems a good idea
>> 
>> Daniel :
>> 
>> Re-publish your data with rdfs:seeAlso
>> http://sameas.org/rdf?uri=http%3A%2F%2Fsws.geonames.org%2F2078025%2F perhaps?
>> 
>> This seems like a good idea. Considering that geonames.org cannot dedicate
>> (m)any resources to LOD mappings, those can be deferred to external services
>> such as sameas.org. The sameas.org URI is easy to generate automatically
>> from the geonames id.
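
A minimal sketch of that generation step, assuming the lookup URL pattern is exactly the one in Daniel's example above (the function name is hypothetical, and sameas.org may accept other forms):

```python
from urllib.parse import quote


def sameas_rdf_url(geonames_id: int) -> str:
    """Build a sameas.org RDF lookup URL for a GeoNames feature id.

    Assumption: the pattern http://sameas.org/rdf?uri=<encoded URI>
    is taken from the example quoted in this thread.
    """
    geonames_uri = f"http://sws.geonames.org/{geonames_id}/"
    # safe="" also percent-encodes ':' and '/', as in the quoted example
    return "http://sameas.org/rdf?uri=" + quote(geonames_uri, safe="")


print(sameas_rdf_url(2078025))
```

The result could then be published as the object of an rdfs:seeAlso statement on the GeoNames resource, as Daniel suggests.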
>> 
>> So far so good. But let's look at it closely. Someone has to feed this kind
>> of recursive and iterative social process happening at sameas.org, but there
>> is no provenance track, and the clustering of URIs will over time make the
>> concepts more and more fuzzy, and sameas.org a tool to create semantic black
>> holes.
>> 
>> It would be definitely better to have some clear declaration from Geonames
>> viewpoint which of its three URIs for Berlin
>> http://sws.geonames.org/2950159/, http://sws.geonames.org/6547383/ or
>> http://sws.geonames.org/6547539/ should map to
>> http://dbpedia.org/resource/Berlin. So far, none of them does.
>> 
>> On the DBpedia side, the owl:sameAs declarations at the latter URI are as
>> follows (today):
>> 
>>   *   opencyc:en/Berlin_StateGermany
>> <http://sw.opencyc.org/2008/06/10/concept/Mx4rv77EfZwpEbGdrcN5Y29ycA>
>>   *   fbase:Berlin
>> <http://rdf.freebase.com/ns/guid.9202a8c04000641f80000000000094d6>
>>   *   http://umbel.org/umbel/ne/wikipedia/Berlin
>>   *   opencyc:en/CityOfBerlinGermany
>> <http://sw.opencyc.org/2008/06/10/concept/Mx4rvVjrhpwpEbGdrcN5Y29ycA>
>>   *   http://www4.wiwiss.fu-berlin.de/eurostat/resource/regions/Berlin
>>   *   http://sws.geonames.org/2950159/
>>   *   http://data.nytimes.com/N50987186835223032381
>> 
>> So it seems DBpedia has decided to map its Berlin to the Geonames feature of
>> type "capital of a political entity", subtype of "populated place". Why not?
>> OTOH it also declares two equivalents in opencyc, one being a state and the
>> other a city. If opencyc buys the DBpedia declarations, the semantic collapse
>> begins.
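
The collapse described above can be illustrated with a small sketch. It assumes the links are treated as symmetric and transitive (as owl:sameAs is in OWL) and clustered with a plain union-find; this is an illustration only, not a description of how sameas.org actually builds its bundles. The URIs are the ones listed above; the variable and function names are hypothetical.

```python
def cluster(pairs):
    """Group URIs into equivalence classes under symmetric, transitive links."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b  # merge the two classes

    groups = {}
    for node in list(parent):
        groups.setdefault(find(node), set()).add(node)
    return list(groups.values())


DBPEDIA_BERLIN = "http://dbpedia.org/resource/Berlin"
OPENCYC_STATE = "http://sw.opencyc.org/2008/06/10/concept/Mx4rv77EfZwpEbGdrcN5Y29ycA"
OPENCYC_CITY = "http://sw.opencyc.org/2008/06/10/concept/Mx4rvVjrhpwpEbGdrcN5Y29ycA"
GEONAMES_CAPITAL = "http://sws.geonames.org/2950159/"

# Three of the owl:sameAs declarations found at the DBpedia URI
links = [
    (DBPEDIA_BERLIN, OPENCYC_STATE),
    (DBPEDIA_BERLIN, OPENCYC_CITY),
    (DBPEDIA_BERLIN, GEONAMES_CAPITAL),
]

clusters = cluster(links)
# The state and the city land in the same class, via DBpedia's Berlin
print(len(clusters), {OPENCYC_STATE, OPENCYC_CITY} <= clusters[0])
```

Under these assumptions a single loosely scoped hub URI is enough to merge resources that their publisher deliberately kept distinct.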
>> 
>> Let's go yet closer to the black hole horizon ...
>> 
>> http://sameas.org/html?uri=http%3A%2F%2Fdbpedia.org%2Fresource%2FBerlin
>> 
>> ... yields 29 URIs including the previous ones ...
>> 
>> If geonames.org had taken the time to carefully map its administrative
>> features to the respective "city" and "state" opencyc resources, the three
>> different URIs carefully coined to make distinct entities for Berlin as a
>> populated place and the two administrative subdivisions bearing the same
>> name would be, by the grace of DBpedia fuzziness, crushed in the same
>> sameas.org semantic black hole.
>> 
>> Bottom line. Given the current state of affairs for geographical entities in
>> the linked data cloud, geonames agnosticism re. owl:sameAs is rather a good
>> thing. There are certainly more subtle ways to link geo entities at various
>> levels of granularity, and a lot of work to achieve semantic interoperability
>> of geo entities defined everywhere. Things are moving forward, but it will be
>> a long road needing a lot of resources. Look e.g., at Yahoo! concordance
>> http://developer.yahoo.com/geo/geoplanet/guide/api-reference.html#api-concordance,
>> which BTW also links to geonames id.
>> 
>> In conclusion:
>> 
>> YES Marc Wick is right to currently focus on data and data quality first. A
>> tremendous set of data is available for free, take what you can and what you
>> wish and build on it. If you want premium services, pay for it. Fair enough.
>> 
>> YES it would be great to have geonames data/URIs more and better integrated
>> into the linked data economy. More complete descriptions at sws.geonames URIs,
>> SPARQL endpoint etc. Bearing in mind that Geonames.org has no dedicated
>> resources for it, who will take care of that in a scalable way? What is the
>> business model? Good questions. Volunteers, step forward :)
>> 
>> Bernard
> 
> 
Received on Wednesday, 12 May 2010 20:01:34 UTC
