Linking Geonames RDF to INSEE RDF - rdfs:seeAlso, rdfs:isDefinedBy, owl:sameAs ... what else?

A use case for linked data ...

I am trying to connect Geonames "features" in France with the geographical 
entities published by INSEE last summer.
For the region "Bretagne", for example, I have the respective URIs
http://sws.geonames.org/3030293/  and   http://rdf.insee.fr/geo/REG_53

What should be the recommended practice if the declarations are made 
from the Geonames side, in files published in the Geonames namespace, 
and are not necessarily endorsed by INSEE?

The least commitment is
http://sws.geonames.org/3030293/      rdfs:seeAlso      
http://rdf.insee.fr/geo/REG_53
It does not imply any actual semantic identification of Bretagne as 
defined by Geonames with Bretagne as defined by INSEE. Any interpretation 
is possible.
This commitment is too weak if I want to say that both resources somehow 
define the same thing.

A higher level of commitment is to declare from the Geonames side that 
INSEE data are authoritative, so one might think of the following
http://sws.geonames.org/3030293/      rdfs:isDefinedBy      
http://rdf.insee.fr/geo/REG_53
But actually this is wrong, since the INSEE resource does not define at 
all the Geonames resource.

After that, the only choice seems to be
http://sws.geonames.org/3030293/      owl:sameAs      
http://rdf.insee.fr/geo/REG_53
This is a very strong commitment indeed. It means that all assertions made 
about one resource hold for the other. It means I have checked, and will 
continue to check, that what is declared on the Geonames side is consistent 
with what is declared on the INSEE side. Moreover, since owl:sameAs is 
symmetric, it adds semantics to INSEE resources that INSEE has not 
endorsed and may disagree with.
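
To make that concern concrete, here is a minimal sketch in plain Python 
(no RDF library) of what a reasoner is entitled to conclude from a single 
owl:sameAs statement. The two descriptive properties, ex:name and 
ex:regionCode, are hypothetical placeholders and not actual Geonames or 
INSEE vocabulary. The closure is deliberately naive: it handles only 
symmetry and the copying of statements in subject position, and ignores 
transitivity and reflexivity.

```python
GEONAMES = "http://sws.geonames.org/3030293/"
INSEE = "http://rdf.insee.fr/geo/REG_53"
SAME_AS = "owl:sameAs"

# Triples as (subject, predicate, object) tuples of plain strings.
triples = {
    (GEONAMES, "ex:name", "Bretagne"),   # hypothetical Geonames statement
    (INSEE, "ex:regionCode", "53"),      # hypothetical INSEE statement
    (GEONAMES, SAME_AS, INSEE),          # asserted from the Geonames side only
}


def same_as_closure(triples):
    """Naive fixpoint over owl:sameAs.

    Adds the symmetric owl:sameAs statement and copies every non-sameAs
    statement whose subject is one resource onto its alias.
    """
    inferred = set(triples)
    while True:
        new = set()
        # Symmetry: a sameAs b entails b sameAs a.
        for s, p, o in inferred:
            if p == SAME_AS:
                new.add((o, SAME_AS, s))
        # Substitution: what is said about a is also said about b.
        aliases = {(s, o) for s, p, o in inferred | new if p == SAME_AS}
        for a, b in aliases:
            for s, p, o in inferred:
                if s == a and p != SAME_AS:
                    new.add((b, p, o))
        if new <= inferred:
            return inferred
        inferred |= new


closure = same_as_closure(triples)
for triple in sorted(closure - triples):
    print(triple)
```

The statements printed are exactly the ones nobody wrote down: the INSEE 
resource acquires the Geonames statement, the Geonames resource acquires 
the INSEE statement, and the owl:sameAs link now also points from INSEE 
back to Geonames.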

So it seems we have the choice between something too weak, open to any 
interpretation, and something too strong.

At some point I proposed [1] using something in between: indirectly 
linking the two resources to the same blank node through dc:subject. 
So far this proposal seems to have met a polite refusal from the 
community. It is stronger than rdfs:seeAlso and weaker than owl:sameAs.

http://sws.geonames.org/3030293/      dc:subject      _:b
http://rdf.insee.fr/geo/REG_53     dc:subject      _:b

The idea behind this is that neither the Geonames resource nor the INSEE 
resource is definitive or exhaustive. They provide two identifiers and 
two descriptions of the same subject, but this declaration says that 
the subject is not "definitely defined" by either resource. In their 
current state the two descriptions might be inconsistent or out of sync, 
so they are not technically the same in the sense of owl:sameAs.

Moreover, this mechanism makes it possible to aggregate other, non-RDF 
resources:
http://en.wikipedia.org/wiki/Bretagne     dc:subject      _:b
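
As a rough illustration of how a consumer might use these statements, 
here is a minimal sketch in plain Python (no RDF library). The helper 
names co_subject_groups and about_same_subject are made up for this 
example. It groups resources by the node they share through dc:subject 
and never copies statements from one resource to another, which is the 
difference with owl:sameAs. Treating _:b as a single node assumes that 
all three statements live in the same graph, since a blank node label 
has no meaning outside the graph where it appears.

```python
from collections import defaultdict

DC_SUBJECT = "dc:subject"
GEONAMES = "http://sws.geonames.org/3030293/"
INSEE = "http://rdf.insee.fr/geo/REG_53"
WIKIPEDIA = "http://en.wikipedia.org/wiki/Bretagne"

# The three statements from the message, assumed to be in one graph.
triples = [
    (GEONAMES, DC_SUBJECT, "_:b"),
    (INSEE, DC_SUBJECT, "_:b"),
    (WIKIPEDIA, DC_SUBJECT, "_:b"),
]


def co_subject_groups(triples):
    """Map each dc:subject node to the set of resources pointing at it."""
    groups = defaultdict(set)
    for s, p, o in triples:
        if p == DC_SUBJECT:
            groups[o].add(s)
    return dict(groups)


def about_same_subject(triples, a, b):
    """True if a and b share at least one dc:subject node."""
    return any(
        a in members and b in members
        for members in co_subject_groups(triples).values()
    )


groups = co_subject_groups(triples)
linked = about_same_subject(triples, GEONAMES, INSEE)
```

Each resource keeps its own description. The only thing the graph says 
is that the three are about one subject, so an inconsistency between the 
Geonames and INSEE descriptions does not become a logical contradiction.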

Of course, if that seems a weird hack of dc:subject, some other 
property could be proposed. But I think the SW needs something of the 
kind anyway.

[1] 
http://universimmedia.blogspot.com/2006/04/identifying-things-blank-nodes-again.html

-- 

Bernard Vatant
Knowledge Engineering
----------------------------------------------------
Mondeca
3, cité Nollez 75018 Paris France
Web:    www.mondeca.com <http://www.mondeca.com>
----------------------------------------------------
Tel:       +33 (0) 871 488 459
Mail:     bernard.vatant@mondeca.com <mailto:bernard.vatant@mondeca.com>
Blog:    Leçons de Choses <http://mondeca.wordpress.com/>

Received on Tuesday, 31 October 2006 09:32:42 UTC