
Re: Updated GeoSpecies Data Set 1,765,790 Triples

From: Michael Hausenblas <michael.hausenblas@deri.org>
Date: Fri, 30 Oct 2009 08:10:06 +0000
To: Peter DeVries <pete.devries@gmail.com>
CC: Linked Data community <public-lod@w3.org>
Message-ID: <C7104FDE.9CC6%michael.hausenblas@deri.org>

Peter,

Great work! Now, as you already have a semantic sitemap, it should be pretty
straightforward to offer a voiD description [1] of your dataset as well.
You can, for example, use the lifting XSLT [2] to bootstrap the voiD file
from your existing sitemap and add further details about the interlinking to
other datasets, etc. (see also the voiD guide [3] for the interplay with
sitemaps).
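
To give a rough idea of what such a file could contain, here is a minimal
Python sketch that prints a voiD description in Turtle. It is not the output
of the XSLT [2]. The dump URL, triple count and skos:closeMatch predicate are
taken from your announcement; the dataset and linkset URIs under
http://lod.geospecies.org/void# and the DBpedia dataset URI are hypothetical
placeholders, and the property names (void:dataDump, void:triples,
void:Linkset, void:target, void:linkPredicate) should be checked against the
guide [3] before publishing.

```python
# Minimal sketch: emit a voiD description as Turtle with the standard library only.
# ASSUMPTIONS (not from the thread): the dataset/linkset URIs and the DBpedia
# dataset URI passed in below are hypothetical placeholders.

PREFIXES = """@prefix void: <http://rdfs.org/ns/void#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
"""


def void_description(dataset, title, dump_url, triples,
                     linkset, target, link_predicate):
    """Return a Turtle string describing one dataset and one linkset."""
    lines = [
        PREFIXES,
        f"<{dataset}> a void:Dataset ;",
        f'    dcterms:title "{title}" ;',
        f"    void:dataDump <{dump_url}> ;",
        f"    void:triples {triples} .",
        "",
        # The linkset records which predicate connects the two datasets.
        f"<{linkset}> a void:Linkset ;",
        f"    void:target <{dataset}>, <{target}> ;",
        f"    void:linkPredicate {link_predicate} .",
    ]
    return "\n".join(lines) + "\n"


turtle = void_description(
    dataset="http://lod.geospecies.org/void#GeoSpecies",      # hypothetical URI
    title="GeoSpecies Knowledge Base",
    dump_url="http://lod.geospecies.org/geospecies.rdf.tar.gz",
    triples=1765790,
    linkset="http://lod.geospecies.org/void#GeoSpecies-DBpedia",  # hypothetical URI
    target="http://dbpedia.org/void/Dataset",                 # hypothetical URI
    link_predicate="skos:closeMatch",
)
print(turtle)
```

Further linksets (Bio2RDF, Uniprot, Freebase) could be added by calling the
same pattern once per target dataset.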

Happy to assist you off-line in case you have further questions regarding
voiD ...


Cheers,
      Michael

[1] http://semanticweb.org/wiki/VoiD
[2] http://code.google.com/p/void-impl/source/browse/trunk/liftSSM/SSM2void.xslt
[3] http://rdfs.org/ns/void-guide#sec_5_2_Discovery_via_sitemaps

-- 
Dr. Michael Hausenblas
LiDRC - Linked Data Research Centre
DERI - Digital Enterprise Research Institute
NUIG - National University of Ireland, Galway
Ireland, Europe
Tel. +353 91 495730
http://linkeddata.deri.ie/
http://sw-app.org/about.html



> From: Peter DeVries <pete.devries@gmail.com>
> Date: Thu, 29 Oct 2009 20:40:44 -0500
> To: Renaud Delbru <renaud.delbru@deri.org>
> Cc: Linked Data community <public-lod@w3.org>
> Subject: Re: Updated GeoSpecies Data Set 1,765,790 Triples
> Resent-From: Linked Data community <public-lod@w3.org>
> Resent-Date: Fri, 30 Oct 2009 01:41:27 +0000
> 
> Hi Renaud,
> 
> Thank you, I have a semantic sitemap at:
> 
> http://lod.geospecies.org/sitemap.xml
> 
> I am open to additional comments or suggestions. :-)
> 
> - Pete
> 
> On Thu, Oct 29, 2009 at 6:16 PM, Renaud Delbru <renaud.delbru@deri.org> wrote:
> 
>> Hi Peter,
>> 
>> I see that you already have a dataset dump available. Could I also suggest
>> the use of a semantic sitemap [1], so that search engines such as Sindice
>> can find, process and index your dump?
>> 
>> Best,
>> 
>> [1] http://sw.deri.org/2007/07/sitemapextension
>> 
>> --
>> Renaud Delbru
>> 
>> 
>> On 29/10/09 21:10, Peter DeVries wrote:
>> 
>>> I have updated the GeoSpecies data set.
>>> 
>>> You can read about it here:
>>> 
>>> http://about.geospecies.org/
>>> 
>>> You can browse it here:
>>> 
>>> http://lod.geospecies.org/
>>> 
>>> The new RDF dump can be obtained here:
>>> 
>>> http://lod.geospecies.org/geospecies.rdf.tar.gz   (1,765,790 Triples)
>>> 
>>> The data set currently contains information and linked data for: 15,862
>>> Species, 1,291 Families, 206 Orders. We have approximately 6,500 species
>>> observations, but are awaiting release on the majority of those. The current
>>> data set includes 12 sample observation records with geo and geonames links.
>>> There is also a growing number of GeoSpecies annotated articles and
>>> presentations in the bibtex and bibio vocabularies. The knowledge base is
>>> currently linked to DBpedia, Freebase, Bio2RDF, Uniprot, uBio data sources,
>>> and uses some of the umbel subject concepts. See the project's page
>>> for information on proper attribution. Until they have been fully documented,
>>> the bulk of the observation records are not currently available.
>>> 
>>> I have attempted to link to DBpedia, Bio2RDF, Uniprot and Freebase when
>>> possible using skos:closeMatch. Of the 15,862 species, 5,684 are linked to
>>> DBpedia and Wikipedia, 8,948 are linked to Bio2RDF and Uniprot. There are
>>> also foaf:isPrimaryTopicOf links to 8,910 Wikispecies pages. Similar
>>> linkages are made at the other taxonomic levels of kingdom, phylum, class,
>>> order and family.
>>> 
>>> Here is the page for the Silver-bordered Fritillary Butterfly Boloria
>>> selene Denis and Schiffermuller 1775
>>> 
>>> http://lod.geospecies.org/ses/ICmLC.html
>>> 
>>> The "entity" is
>>> 
>>> http://lod.geospecies.org/ses/ICmLC
>>> 
>>> The RDF is
>>> 
>>> http://lod.geospecies.org/ses/ICmLC.rdf
>>> 
>>> The levels above species and family are in XHTML with RDFa, but also have
>>> a straight RDF representation.
>>> 
>>> Order Carnivora
>>> 
>>> http://lod.geospecies.org/orders/jtSaY.xhtml
>>> 
>>> RDF version
>>> 
>>> http://lod.geospecies.org/orders/jtSaY.rdf
>>> 
>>> This page has some example SPARQL queries.
>>> http://about.geospecies.org/sparql.xhtml
>>> 
>>> You can find the ontology documentation here:
>>> http://rdf.geospecies.org/gs_ont_doc/index.html
>>> 
>>> It is mainly a vocabulary, since I have had trouble getting all the
>>> related ontologies to play well together.
>>> 
>>> The SPARQL query examples will work as described on the RDF dataset
>>> without the ontology.
>>> 
>>> This is only a fraction of the world's species but it includes all the
>>> world's Mammals, and North American Birds.
>>> 
>>> I will be working to improve the data set's depth, breadth and linkages
>>> over time, and would appreciate any comments or suggestions :-)
>>> 
>>> My long term plan is to also add biologically relevant assertions to allow
>>> useful semantic queries about species.
>>> 
>>> I have started to add state and county level records from the USDA Plants
>>> dataset for Wisconsin, Iowa, Michigan, Minnesota.
>>> 
>>> In addition, I have started to make links between habitats and species.
>>> 
>>> - Pete
>>> ----------------------------------------------------------------
>>> Pete DeVries
>>> Department of Entomology
>>> University of Wisconsin - Madison
>>> 445 Russell Laboratories
>>> 1630 Linden Drive
>>> Madison, WI 53706
>>> GeoSpecies Knowledge Base
>>> About the GeoSpecies Knowledge Base
>>> ------------------------------------------------------------
>>> 
>> 
>> 
> 
> 
> -- 
> ----------------------------------------------------------------
> Pete DeVries
> Department of Entomology
> University of Wisconsin - Madison
> 445 Russell Laboratories
> 1630 Linden Drive
> Madison, WI 53706
> GeoSpecies Knowledge Base
> About the GeoSpecies Knowledge Base
> ------------------------------------------------------------
Received on Friday, 30 October 2009 08:10:49 UTC
