Re: Geonames enters the Semantic Web from Richard Cyganiak on 2006-10-17 (semantic-web@w3.org from October 2006)

From: Richard Cyganiak <richard@cyganiak.de>
Date: Tue, 17 Oct 2006 15:17:06 +0200
To: Bernard vatant <bernard.vatant@mondeca.com>
Cc: Semantic Web <semantic-web@w3.org>, Marc <marc@geonames.org>
Message-Id: <E610C635-C827-4817-9E44-850C76ABD1A9@cyganiak.de>
Bernard,

On 17 Oct 2006, at 00:06, Bernard vatant wrote:
> We need a distinct URI to identify a geonames Feature, e.g.,
> (1)  http://ws.geonames.org/rdf#geonameId=3014258  rdf:type  geonames:Feature
> This URI we have not yet, but will have as soon as we are sure we  
> understand the issue completely right.
> It will have a 303 redirect to the current
> (2) http://ws.geonames.org/rdf?geonameId=3014258
> of which content will be modified accordingly to contain a  
> description of (1)

You have to either do "the hash thing" or "the 303 thing" (see [1]).
Combining both like this won't work, because you'd have to set up a
document http://ws.geonames.org/rdf which contains the descriptions
of *all* your concepts.

I think this is all you'd have to do to get the TAG stamp of approval.
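
To see why the hash and the 303 cannot be combined, here is a minimal
Python sketch using only the standard library. It shows that the
fragment of (1) never reaches the server, and what a 303 setup could
look like instead. The see_other function and the
http://ws.geonames.org/feature/ URI pattern are assumptions made up
for illustration, not existing Geonames URIs.

```python
from typing import Tuple
from urllib.parse import urldefrag

# Hash URI proposed as (1) in the quoted message.
hash_uri = "http://ws.geonames.org/rdf#geonameId=3014258"

# HTTP clients drop the fragment before sending the request, so the
# server is asked for the same document whatever the feature id is.
requested, fragment = urldefrag(hash_uri)
print(requested)  # http://ws.geonames.org/rdf
print(fragment)   # geonameId=3014258


def see_other(concept_uri: str) -> Tuple[int, str]:
    """Hypothetical 303 handler: map a hash-less concept URI to its RDF document."""
    prefix = "http://ws.geonames.org/feature/"  # assumed URI pattern, not a real one
    if not concept_uri.startswith(prefix):
        raise ValueError("not a concept URI")
    geoname_id = concept_uri[len(prefix):]
    # The redirect target is the existing per-feature document (2).
    return 303, "http://ws.geonames.org/rdf?geonameId=" + geoname_id
```

With a hash-less concept URI the server sees the feature id in the
request path, so it can answer with a 303 pointing at the one document
that describes that feature.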

The other thing -- RDF backlinks -- is about creating *linked data*,  
as TimBL calls it [2]. Linked data is important to make RDF data  
consumable by Semantic Web browsers and crawlers and dynamic-dataset  
query engines, but it's an area of active experimentation and there  
are few clearly defined standards or best practices.

> OK. But since (2) is actually the result of a query on a database
> through a web service, we could add several parameters, as in
> other geonames web services: whether or not to include the related
> features (parentFeature, childFeature, nearbyFeature ...) in the
> description, a limit on the number of related features,
> languages of attributes, whatever.

Adding parameters to the query interface is certainly a good thing.  
But remember that RDF data is intended for *machine* consumption, and  
a machine (such as an RDF browser or crawler) will have no simple way  
of finding out what parameters are available or what parameters would  
be sensible in a given situation. Thus I think it doesn't really make  
the data more useful on the Semantic Web.

> Why so? Because, for example, a "complete" description at
> http://ws.geonames.org/rdf?geonameId=3017382 would contain each
> description of each feature in France - which is a lot.

Hm ... why?

First, there's no need to include the *description* of the features  
in France. *Linking* to the features using a geonames:childPlace  
property is all you need; a processor that wants more information can  
follow the link and fetch the description.

Second, I believe the Geonames data is hierarchical, so you would  
only need to include links to features on the next-lower level (the  
régions?). A processor that wants all levels can just follow the links.

Third, I want to echo one of the points from Tim's comment: You might  
consider putting the lists of the childPlaces and nearbyPlaces of  
France into documents separate from the main description of France,  
and point to the lists using rdfs:seeAlso or a subproperty thereof  
(e.g. geonames:childPlaceList). Example (imagine the appropriate URIs  
between the angle brackets):

<franceDocument> would contain:

     <franceConcept> geonames:name "France" .
     <franceConcept> geonames:parentPlace <europeConcept> .
     <franceConcept> geonames:childPlaceList <franceChildplacesDocument> .
     <franceConcept> geonames:neighbourPlaceList <franceNeighboursDocument> .
     ...

<franceChildplacesDocument> would contain:

     <franceConcept> geonames:childPlace <alsaceConcept> .
     <franceConcept> geonames:childPlace <aquitaineConcept> .
     ...

Be sure to note the distinction between documents and concepts. Some
properties point to concepts, others to documents. Those pointing to
documents lead to additional information about the current concept;
they are just there to partition the data into more manageable
chunks.
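
A client could consume such a partitioned description roughly as
sketched below in Python. This is only an illustration: documents are
modelled as in-memory lists of triples instead of being fetched over
HTTP, all names are the placeholders from the example, and treating
the two list properties like rdfs:seeAlso is an assumption about how
a client would be configured.

```python
# Each "document" is a list of (subject, predicate, object) triples.
# All names are placeholders standing in for real URIs.
DOCUMENTS = {
    "franceDocument": [
        ("franceConcept", "geonames:name", "France"),
        ("franceConcept", "geonames:parentPlace", "europeConcept"),
        ("franceConcept", "geonames:childPlaceList", "franceChildplacesDocument"),
        ("franceConcept", "geonames:neighbourPlaceList", "franceNeighboursDocument"),
    ],
    "franceChildplacesDocument": [
        ("franceConcept", "geonames:childPlace", "alsaceConcept"),
        ("franceConcept", "geonames:childPlace", "aquitaineConcept"),
    ],
}

# Properties whose objects are documents to fetch, not concepts.
DOCUMENT_LINKS = {
    "rdfs:seeAlso",
    "geonames:childPlaceList",
    "geonames:neighbourPlaceList",
}


def describe(start_document):
    """Collect triples from start_document and every document it links to."""
    seen, queue, triples = set(), [start_document], []
    while queue:
        doc = queue.pop(0)
        # Skip documents already read, and ones that cannot be retrieved.
        if doc in seen or doc not in DOCUMENTS:
            continue
        seen.add(doc)
        for s, p, o in DOCUMENTS[doc]:
            triples.append((s, p, o))
            if p in DOCUMENT_LINKS:
                queue.append(o)
    return triples


children = [o for s, p, o in describe("franceDocument")
            if p == "geonames:childPlace"]
print(children)  # ['alsaceConcept', 'aquitaineConcept']
```

Note that the client never follows geonames:parentPlace or
geonames:childPlace here, because those point to concepts. Only the
document-valued properties are used to pull in more data.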

> Maybe some user would like that kind of description to feed a data  
> base, others would like only the direct children of type A.ADM1 etc.

Note that SPARQL provides a better and standardized solution to that  
kind of problem. AFAIK you already provide a database dump of the  
Geonames data; third parties could use this to set up a SPARQL server  
(e.g. using D2R Server [3]). Or, if your data is linked up properly,  
use a dynamic-dataset SPARQL engine (e.g. using the SemWebClient  
library [4]).
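
As a rough sketch of what "only the direct children of type A.ADM1"
could look like against such a SPARQL server, here is some Python
that builds the query and the request URL. The endpoint, the
namespace URI, the franceConcept URI and the property names
geonames:childPlace and geonames:featureCode are all assumptions for
illustration; only the "query" request parameter comes from the
SPARQL protocol.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical endpoint that a third party might run over the Geonames dump.
ENDPOINT = "http://example.org/geonames/sparql"


def children_query(parent_uri, feature_code):
    """Build a SPARQL query for the direct children of one feature type.

    The namespace and both property names are assumed, not confirmed.
    """
    return (
        "PREFIX geonames: <http://www.geonames.org/ontology#>\n"
        "SELECT ?child WHERE {\n"
        "  <%s> geonames:childPlace ?child .\n"
        "  ?child geonames:featureCode \"%s\" .\n"
        "}" % (parent_uri, feature_code)
    )


query = children_query("http://example.org/franceConcept", "A.ADM1")

# The query travels URL-encoded in the "query" parameter of a GET request.
request_url = ENDPOINT + "?" + urlencode({"query": query})
print(request_url)

# Decoding the parameter gives back the original query text.
decoded = parse_qs(urlsplit(request_url).query)["query"][0]
```

The filtering then lives in the query rather than in service-specific
URL parameters, so any SPARQL-aware client can ask for it.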

> So we are likely to deliver various RDF descriptions of the *same*  
> feature (1) at various URIs such as
> (3) http://ws.geonames.org/rdf?geonameId=3014258&childFeatures=true&maxChildrenFeatures=50
> (4) http://ws.geonames.org/rdf?geonameId=3014258&nearbyFeatures=true&maxNearbyFeatures=5

As I said, generic Semantic Web clients will have a hard time
automatically discovering the alternate documents or deciding which
one is appropriate. It's nice for tools that use *just* the Geonames
data, though. (But then it's not really Semantic Web ;-)

Keep it up,
Richard

[1] http://dowhatimean.net/2006/10/fixing-ambiguous-concept-uris
[2] http://www.w3.org/DesignIssues/LinkedData.html
[3] http://sites.wiwiss.fu-berlin.de/suhl/bizer/d2r-server/
[4] http://sites.wiwiss.fu-berlin.de/suhl/bizer/ng4j/semwebclient/



> Note that such hypothetical URIs actually work right now, but the  
> added parameters are not really processed and they are equivalent  
> to (2).
>
> And I guess, like in other geonames Web services, (2) would yield a  
> description with default values for parameters used in (3) and (4)  
> and the like. In that case, the description of (1) yielded by (2)  
> through 303 redirects will be neither "complete", nor "canonical",  
> nor "authoritative" ... but just a "default" description.
>
> Is this correct TAG-wise? This time I ask before ... :-)
>
>
> Bernard vatant wrote:
>>
>> Hello all
>>
>> I'm very pleased to announce that, through a quick and efficient
>> collaboration with Marc (in cc), the 6 million and growing
>> geographical features in the database of Geonames [1] are now
>> described by an OWL ontology [2], and the RDF description of each
>> instance, including names, type, and of course geolocation elements,
>> is now available through the Geonames Webservice, adding to an
>> already impressive pack of services [3].
>> The ontology is very simple, and leverages elements of the
>> wgs84_pos vocabulary [4]. The feature types are described using a
>> simple SKOS vocabulary, which has been embedded in the OWL ontology.
>>
>> If you add that, thanks to the Google Maps API, the geonames features
>> can be created and edited through a wiki-like interface [5], this
>> is as Web 2.0 as can be.
>>
>> Comments welcome, either here or in the Geonames forum [6]
>>
>> Bernard
>>
>>
>> [1] http://www.geonames.org
>> [2] http://www.geonames.org/ontology/
>> [3] http://www.geonames.org/export/
>> [4] http://www.w3.org/2003/01/geo/#vocabulary
>> [5] http://www.geonames.org/recent-changes/
>> [6] http://forum.geonames.org/gforum/posts/list/156.page
>>
>
> -- 
>
> *Bernard Vatant
> *Knowledge Engineering
> ----------------------------------------------------
> *Mondeca **
> *3, cité Nollez 75018 Paris France
> Web: www.mondeca.com <http://www.mondeca.com>
> ----------------------------------------------------
> Tel. +33 (0) 871 488 459 Mail: bernard.vatant@mondeca.com  
> <mailto:bernard.vatant@mondeca.com>
> Wikipedia:universimmedia <http://en.wikipedia.org/wiki/User:Universimmedia>
>
>
>
Received on Tuesday, 17 October 2006 13:17:20 UTC