Re: Updated GeoSpecies Data Set 1,765,790 Triples

Hi Peter,

I see that you already have a dataset dump available. Could I also
suggest using a semantic sitemap [1], so that search engines such as
Sindice can find, process and index your dump?
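
To give an idea of what this involves, here is a rough sketch in Python
that generates such a sitemap for your dump. The element names
(sc:dataset, sc:datasetLabel, sc:linkedDataPrefix, sc:sampleURI,
sc:dataDumpLocation) and the extension namespace URI are written as I
understand them from [1], so please check them against the
specification. The function name and the choice of linked data prefix
are only illustrative assumptions.

```python
import xml.etree.ElementTree as ET

# Standard sitemap namespace plus the semantic sitemap extension.
# The extension namespace URI is an assumption based on [1].
SM_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
SC_NS = "http://sw.deri.org/2007/07/sitemapextension/scschema.xsd"


def build_semantic_sitemap(label, prefix, sample_uri, dump_url):
    """Return a sitemap XML string describing one dataset and its dump."""
    ET.register_namespace("", SM_NS)
    ET.register_namespace("sc", SC_NS)
    urlset = ET.Element(f"{{{SM_NS}}}urlset")
    dataset = ET.SubElement(urlset, f"{{{SC_NS}}}dataset")
    fields = [
        ("datasetLabel", label),
        ("linkedDataPrefix", prefix),
        ("sampleURI", sample_uri),
        ("dataDumpLocation", dump_url),
    ]
    for tag, text in fields:
        ET.SubElement(dataset, f"{{{SC_NS}}}{tag}").text = text
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    print(build_semantic_sitemap(
        "GeoSpecies Knowledge Base",
        "http://lod.geospecies.org/",  # assumed prefix for entity URIs
        "http://lod.geospecies.org/ses/ICmLC",
        "http://lod.geospecies.org/geospecies.rdf.tar.gz",
    ))
```

The resulting file would typically be saved as sitemap.xml and
referenced from robots.txt so that crawlers can discover it.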

Best,

[1] http://sw.deri.org/2007/07/sitemapextension

-- 
Renaud Delbru

On 29/10/09 21:10, Peter DeVries wrote:
> I have updated the GeoSpecies data set.
>
> You can read about it here:
>
> http://about.geospecies.org/
>
> You can browse it here:
>
> http://lod.geospecies.org/
>
> The new RDF dump can be obtained here:
>
> http://lod.geospecies.org/geospecies.rdf.tar.gz   (1,765,790 Triples)
>
> The data set currently contains information and linked data for: 
> 15,862 Species, 1,291 Families, 206 Orders. We have approximately 6,500
> species observations, but are awaiting release on the majority of 
> those. The current data set includes 12 sample observation records 
> with geo and geonames links. There is also a growing number of 
> GeoSpecies annotated articles and presentations in the bibtex and 
> bibio vocabularies. The knowledge base is currently linked to DBpedia, 
> Freebase, Bio2RDF, Uniprot, uBio data sources, and uses some of the 
> umbel subject concepts. See the project page for information on proper
> attribution. Until they have been fully documented, the bulk of the 
> observation records are not currently available.
>
> I have attempted to link to dbpedia, bio2rdf, uniprot and freebase 
> when possible using skos:closeMatch. Of the 15,862 species, 5,684 are 
> linked to dbpedia and wikipedia, 8,948 are linked to bio2rdf and 
> uniprot. There are also foaf:isPrimaryTopicOf links to 8,910 
> Wikispecies pages. Similar linkages are made at the other taxonomic 
> levels of kingdom, phylum, class, order and family.
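
The link counts above could be checked against the dump with something
like the following sketch. It assumes the dump is RDF/XML and that each
skos:closeMatch appears as an element carrying an rdf:resource
attribute; other serializations would need a proper RDF parser. The
function name and the sample target URIs are hypothetical placeholders,
not actual GeoSpecies data.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from urllib.parse import urlparse

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS_NS = "http://www.w3.org/2004/02/skos/core#"


def count_close_matches(rdf_xml):
    """Count skos:closeMatch targets per host in an RDF/XML string."""
    root = ET.fromstring(rdf_xml)
    counts = Counter()
    for link in root.iter(f"{{{SKOS_NS}}}closeMatch"):
        target = link.get(f"{{{RDF_NS}}}resource")
        if target:
            counts[urlparse(target).netloc] += 1
    return counts


# Tiny hand-written sample; the link targets are placeholders.
SAMPLE = f"""<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:skos="{SKOS_NS}">
  <rdf:Description rdf:about="http://lod.geospecies.org/ses/ICmLC">
    <skos:closeMatch rdf:resource="http://dbpedia.org/resource/Example"/>
    <skos:closeMatch rdf:resource="http://bio2rdf.org/example"/>
  </rdf:Description>
</rdf:RDF>"""

if __name__ == "__main__":
    for host, n in sorted(count_close_matches(SAMPLE).items()):
        print(host, n)
```

For the full dump, streaming with ET.iterparse would probably be kinder
to memory than parsing the whole string at once.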
>
> Here is the page for the Silver-bordered Fritillary Butterfly, Boloria
> selene Denis and Schiffermuller 1775
>
> http://lod.geospecies.org/ses/ICmLC.html
>
> The "entity" is
>
> http://lod.geospecies.org/ses/ICmLC
>
> The RDF is
>
> http://lod.geospecies.org/ses/ICmLC.rdf
>
> The levels above species and family are in XHTML with RDFa, but also 
> have a straight RDF representation.
>
> Order Carnivora
>
> http://lod.geospecies.org/orders/jtSaY.xhtml
>
> RDF version
>
> http://lod.geospecies.org/orders/jtSaY.rdf
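
The URL pattern in these examples can be summarised with a small helper.
This is only a sketch of the pattern as it appears in the examples
above: the function names are hypothetical, and the order "entity" URI
without an extension is an assumption, since only the .xhtml and .rdf
forms are given for orders.

```python
from urllib.request import Request, urlopen


def representation_urls(entity_uri, page_ext="html"):
    """Derive the page and RDF URLs by appending an extension."""
    base = entity_uri.rstrip("/")
    return {"page": f"{base}.{page_ext}", "rdf": f"{base}.rdf"}


def fetch_rdf(entity_uri, timeout=30):
    """Download the RDF document for an entity (needs network access)."""
    url = representation_urls(entity_uri)["rdf"]
    req = Request(url, headers={"Accept": "application/rdf+xml"})
    with urlopen(req, timeout=timeout) as resp:
        return resp.read()


if __name__ == "__main__":
    # Species pages use .html; higher levels such as orders use .xhtml.
    print(representation_urls("http://lod.geospecies.org/ses/ICmLC"))
    print(representation_urls(
        "http://lod.geospecies.org/orders/jtSaY",  # assumed entity URI
        page_ext="xhtml",
    ))
```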
>
> This page has some example SPARQL queries. 
> http://about.geospecies.org/sparql.xhtml
>
> You can find the ontology documentation here: 
> http://rdf.geospecies.org/gs_ont_doc/index.html
>
> It is mainly a vocabulary, since I have had trouble getting all the 
> related ontologies to play well together.
>
> The SPARQL query examples will work as described on the RDF dataset 
> without the ontology.
>
> This is only a fraction of the world's species but it includes all the 
> world's Mammals, and North American Birds.
>
> I will be working to improve the data set's depth, breadth and 
> linkages over time, and would appreciate any comments or suggestions :-)
>
> My long term plan is to also add biologically relevant assertions to 
> allow useful semantic queries about species.
>
> I have started to add state and county level records from the USDA 
> Plants dataset for Wisconsin, Iowa, Michigan, Minnesota.
>
> In addition, I have started to make links between habitats and species.
>
> - Pete
> ----------------------------------------------------------------
> Pete DeVries
> Department of Entomology
> University of Wisconsin - Madison
> 445 Russell Laboratories
> 1630 Linden Drive
> Madison, WI 53706
> GeoSpecies Knowledge Base
> About the GeoSpecies Knowledge Base
> ------------------------------------------------------------

Received on Thursday, 29 October 2009 23:17:27 UTC