- From: Peter DeVries <pete.devries@gmail.com>
- Date: Tue, 11 Jun 2013 03:19:32 -0400
- To: "public-lod@w3.org" <public-lod@w3.org>
- Message-ID: <CAE0MQeGmA84sCO=3CrJBC-Ous1zKX80whPAbOp=QLyooVB=pTQ@mail.gmail.com>
Hi,

I thought I would announce that I have a new TaxonConcept data set that includes millions of entries from an EoL NLP project. This was mainly about annotating the text corpus, but the data set also includes a lot of photos and additional data.

It consists of 1,141,247 data objects:
http://lsd.taxonconcept.org/describe/?url=http://eol.taxonconcept.org/dos/f16694ff9d02b8288f096bf58f876c0e

and ~404,302 taxa, like the Honey Bee:
http://lsd.taxonconcept.org/describe/?url=http://eol.taxonconcept.org/txn/1045608

The URIs dereference to Turtle:
http://eol.taxonconcept.org/txn/213726.ttl

* I just realized that some of the data objects will not resolve because I ran out of inodes, but they are in the RDF dump. That will be fixed in the future.

The sitemap is here:
http://lod.taxonconcept.org/sitemap.xml

The VoID description is here:
http://lod.taxonconcept.org/ontology/void.rdf

* The interlinking is not fully documented.

The whole download includes both the TaxonConcept and EoL data, 37,700,616 triples in total.

The short description of how the NLP works: if a taxon name is detected in a text object, then the data object's Turtle is marked up with links to both the name and the taxon URI. It is assumed that if a body of text about the Cougar contains other taxon names, then there is some "relationship" between those taxa and the Cougar, and they are connected with dcterms:relation.

I have some examples showing how this works in this bit.ly bundle:
http://bitly.com/bundles/pjdlinkeddata/r

-- Pete

-------------------------------
Pete DeVries
Semantic Web / Linked Open Data
Marine Biological Laboratory
7 MBL Street
Woods Hole MA 02543
Email: pdevries@mbl.edu

University of Wisconsin - Madison
Entomology
TaxonConcept <http://www.taxonconcept.org/> / GeoSpecies Knowledge Bases
Email: pdevries@wisc.edu
-------------------------------------------------------------------------------------
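
P.S. The name-detection and dcterms:relation step described above can be illustrated with a minimal sketch. This is not the actual pipeline: the whole-word, case-insensitive matching is an assumed stand-in for the real name recognizer, the function names are hypothetical, and the example.org URIs are placeholders rather than real TaxonConcept identifiers. Only the Honey Bee URI comes from this message. The output lines are N-Triples, which are also valid Turtle.

```python
import re

# Dublin Core Terms property used to connect co-mentioned taxa.
DCTERMS_RELATION = "http://purl.org/dc/terms/relation"


def find_taxa(text, name_to_uri):
    """Return URIs of taxa whose names occur in text.

    Assumption: simple whole-word, case-insensitive matching stands in
    for whatever name recognizer the real project uses.
    """
    found = []
    for name, uri in name_to_uri.items():
        pattern = r"\b" + re.escape(name) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE) and uri not in found:
            found.append(uri)
    return found


def relation_triples(subject_uri, text, name_to_uri):
    """Link the subject taxon to every other taxon named in its text."""
    lines = []
    for uri in find_taxa(text, name_to_uri):
        if uri == subject_uri:
            continue  # a text mentioning its own taxon adds no relation
        lines.append(f"<{subject_uri}> <{DCTERMS_RELATION}> <{uri}> .")
    return lines


if __name__ == "__main__":
    # Hypothetical lookup table; only the Honey Bee URI is from the message.
    names = {
        "Cougar": "http://example.org/txn/cougar",
        "White-tailed Deer": "http://example.org/txn/deer",
        "Honey Bee": "http://eol.taxonconcept.org/txn/1045608",
    }
    text = "The cougar preys mainly on white-tailed deer."
    for line in relation_triples(names["Cougar"], text, names):
        print(line)
```

A text about the Cougar that mentions the White-tailed Deer yields one dcterms:relation triple between the two taxa; taxa that are not named in the text, such as the Honey Bee here, are left unlinked.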
Received on Tuesday, 11 June 2013 07:20:00 UTC