Modeling Taxonomic Classifications in a World where a given Species can have many Classifications

Hi,

I have been trying to figure out the best way to deal with the following
problem.

There are entities that we see as "species" (some argue about whether they
are real things or simply an artificial human construct).

I think that in general the species themselves see them as real and do a
pretty good job identifying other members of the same species.

Putting that entire debate aside, we still need some way to deal with the
idea of a species as a typological construct, so one can say things like
"This *species* was observed at this geolocation" or "There have been X
bird *species* observed in this natural area."

Names change over time, and the same name string can be used for different
animal / plant species.

So that is why I created LOD entities like these:

http://lod.taxonconcept.org/ses/iuCXz.html
(http://lod.taxonconcept.org/ses/iuCXz.rdf)

http://lod.taxonconcept.org/ses/v6n7p.html
(http://lod.taxonconcept.org/ses/v6n7p.rdf)

Since moving to this new model from my earlier GeoSpecies, I have been
trying to figure out how to deal with the following issue.

A species can have multiple classifications. You can see this when you
compare many of the species in DBpedia to those in the NCBI taxonomy
(UniProt, Bio2RDF).

UniProt and Bio2RDF model these as nested subclasses, which makes correct
linking error-prone.

I think a better way to think of this is: *there are species*, and
*different groups choose to organise them into classifications differently*.

So rather than organize these into nested subclasses, I am thinking about
the following pattern.

Puma concolor
txn:inGenus txn_mammalia_genera:Genus_Puma
txn:inFamily txn_mammalia:Family_Felidae
txn:inOrder  txn_mammalia:Order_Carnivora

You can see this pattern in this file:
http://lod.taxonconcept.org/ontology/p01/Mammalia/species.owl

    <owl:Class rdf:about="http://lod.taxonconcept.org/ses/v6n7p#Species">
        <txn:inClass rdf:resource="http://lod.taxonconcept.org/ontology/p01/Mammalia/index.owl#Class_Mammalia"/>
        <txn:inOrder rdf:resource="http://lod.taxonconcept.org/ontology/p01/Mammalia/index.owl#Order_Carnivora"/>
        <txn:inFamily rdf:resource="http://lod.taxonconcept.org/ontology/p01/Mammalia/index.owl#Family_Felidae"/>
        <txn:inGenus rdf:resource="http://lod.taxonconcept.org/ontology/p01/Mammalia/genera.owl#Genus_Puma"/>
        <rdfs:isDefinedBy rdf:resource="http://lod.taxonconcept.org/ontology/p01/Mammalia/species.owl"/>
    </owl:Class>
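
To make the pattern concrete, here is a minimal Python sketch (standard library only) that reads a species description like the one above and lists its rank links. The namespace URI bound to the txn: prefix is an assumption on my part, since the snippet above does not show its prefix declarations, and the function name rank_links is hypothetical.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Assumed namespace URI for the txn: prefix (not shown in the snippet above).
TXN = "http://lod.taxonconcept.org/ontology/txn.owl#"
BASE = "http://lod.taxonconcept.org/ontology/p01/Mammalia/"

# The species description from above, wrapped in an rdf:RDF root element.
DOC = f"""<rdf:RDF xmlns:rdf="{RDF}"
         xmlns:owl="http://www.w3.org/2002/07/owl#"
         xmlns:txn="{TXN}">
  <owl:Class rdf:about="http://lod.taxonconcept.org/ses/v6n7p#Species">
    <txn:inClass rdf:resource="{BASE}index.owl#Class_Mammalia"/>
    <txn:inOrder rdf:resource="{BASE}index.owl#Order_Carnivora"/>
    <txn:inFamily rdf:resource="{BASE}index.owl#Family_Felidae"/>
    <txn:inGenus rdf:resource="{BASE}genera.owl#Genus_Puma"/>
  </owl:Class>
</rdf:RDF>"""


def rank_links(rdf_xml):
    """Map each described resource to {txn predicate local name: taxon URI}."""
    root = ET.fromstring(rdf_xml)
    result = {}
    for node in root:
        subject = node.get(f"{{{RDF}}}about")
        links = {}
        for child in node:
            # ElementTree expands prefixed tags to "{namespace}localname".
            if child.tag.startswith(f"{{{TXN}}}"):
                local_name = child.tag[len(TXN) + 2:]
                links[local_name] = child.get(f"{{{RDF}}}resource")
        result[subject] = links
    return result


links = rank_links(DOC)
for predicate, taxon in links["http://lod.taxonconcept.org/ses/v6n7p#Species"].items():
    print(predicate, "->", taxon)
```

The point is that each rank link is an ordinary property on the species, so a consumer can pick out the ones it cares about without reasoning over a subclass hierarchy.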

And here http://lod.taxonconcept.org/ontology/p01/Mammalia/index.owl

    <owl:Class rdf:about="http://lod.taxonconcept.org/ontology/p01/Mammalia/index.owl#Family_Felidae">
        <rdfs:label>Family Felidae</rdfs:label>
        <rdf:type rdf:resource="http://lod.taxonconcept.org/ontology/p01/Mammalia/index.owl#Mammal_Family"/>
        <txn:commonName>Cats</txn:commonName>
        <skos:closeMatch rdf:resource="http://purl.uniprot.org/taxonomy/9681"/>
        <skos:closeMatch rdf:resource="http://dbpedia.org/resource/Felidae"/>
        <txn:hasWikipediaArticle rdf:resource="http://en.wikipedia.org/wiki/Felidae"/>
        <skos:broader rdf:resource="http://lod.taxonconcept.org/ontology/p01/Mammalia/index.owl#Suborder_Feliformia"/>
        <skos:narrower rdf:resource="http://lod.taxonconcept.org/ontology/p01/Mammalia/index.owl#Subfamily_Felinae"/>
        <skos:narrower rdf:resource="http://lod.taxonconcept.org/ontology/p01/Mammalia/index.owl#Subfamily_Pantherinae"/>
        <owl:sameAs rdf:resource="http://lod.geospecies.org/families/gSvIP"/>
        <rdfs:isDefinedBy rdf:resource="http://lod.taxonconcept.org/ontology/p01/Mammalia/index.owl"/>
    </owl:Class>

This allows SPARQL queries like the one here http://bit.ly/qssZOG based on
my classification without breaking queries where the link is to DBpedia via
a different predicate.

For now I have simply linked these broadly to DBpedia using the following:

<txn:inDBpediaClade rdf:resource="http://dbpedia.org/ontology/Mammal"/>

I use "clade" because the ranks don't always match one to one (Order =>
Order, etc.).

I think this pattern allows a given species to exist in several
classifications, and allows those interested to move up and down the
taxonomy, all without breaking things in the LOD cloud.
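
A small Python sketch of what I mean, using plain tuples in place of a triple store. The prefixed names stand in for the full URIs, the dbpedia-owl: prefix is my shorthand for http://dbpedia.org/ontology/, and the Feliformia-to-Carnivora link is added here purely for illustration; the helper names objects and ancestors are hypothetical.

```python
# Triples as (subject, predicate, object); prefixed names stand in for full URIs.
TRIPLES = [
    # One species, placed in my classification and, separately, in DBpedia's.
    ("ses:v6n7p", "txn:inFamily", "txn_mammalia:Family_Felidae"),
    ("ses:v6n7p", "txn:inOrder", "txn_mammalia:Order_Carnivora"),
    ("ses:v6n7p", "txn:inDBpediaClade", "dbpedia-owl:Mammal"),
    # Links between taxa inside my classification.
    ("txn_mammalia:Family_Felidae", "skos:broader", "txn_mammalia:Suborder_Feliformia"),
    # Illustrative link, not copied from the files quoted in this message.
    ("txn_mammalia:Suborder_Feliformia", "skos:broader", "txn_mammalia:Order_Carnivora"),
    ("txn_mammalia:Family_Felidae", "skos:narrower", "txn_mammalia:Subfamily_Felinae"),
    ("txn_mammalia:Family_Felidae", "skos:narrower", "txn_mammalia:Subfamily_Pantherinae"),
]


def objects(subject, predicate):
    """Return every object of the triples matching subject and predicate."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


def ancestors(taxon):
    """Follow skos:broader upward; assumes at most one broader taxon per node."""
    path = []
    seen = {taxon}
    current = taxon
    while True:
        parents = objects(current, "skos:broader")
        if not parents or parents[0] in seen:
            return path
        current = parents[0]
        seen.add(current)
        path.append(current)


# Each classification is reached through its own predicate.
print(objects("ses:v6n7p", "txn:inFamily"))
print(objects("ses:v6n7p", "txn:inDBpediaClade"))
# Moving up and down within one classification.
print(ancestors("txn_mammalia:Family_Felidae"))
print(objects("txn_mammalia:Family_Felidae", "skos:narrower"))
```

Because the DBpedia placement hangs off a different predicate, adding or removing it does not change the answer to any query over txn:inFamily or skos:broader.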

I thought I would ask the list what they think of this before I do much
more.

I was also wondering whether it would be better for me to use the
subproperties of SKOS properties that I have created in this draft ontology:

http://lod.taxonconcept.org/ontology/taxnomen/index.owl

Such as:
 txn_nomen:narrowerTaxon
 txn_nomen:broaderTaxon

These would be used this way:

    <owl:Class rdf:about="http://lod.taxonconcept.org/ontology/p01/Mammalia/index.owl#Family_Felidae">
        <rdfs:label>Family Felidae</rdfs:label>
        <rdf:type rdf:resource="http://lod.taxonconcept.org/ontology/p01/Mammalia/index.owl#Mammal_Family"/>
        <txn:commonName>Cats</txn:commonName>
        <skos:closeMatch rdf:resource="http://purl.uniprot.org/taxonomy/9681"/>
        <skos:closeMatch rdf:resource="http://dbpedia.org/resource/Felidae"/>
        <txn:hasWikipediaArticle rdf:resource="http://en.wikipedia.org/wiki/Felidae"/>
        <txn_nomen:broaderTaxon rdf:resource="http://lod.taxonconcept.org/ontology/p01/Mammalia/index.owl#Suborder_Feliformia"/>
        <txn_nomen:narrowerTaxon rdf:resource="http://lod.taxonconcept.org/ontology/p01/Mammalia/index.owl#Subfamily_Felinae"/>
        <txn_nomen:narrowerTaxon rdf:resource="http://lod.taxonconcept.org/ontology/p01/Mammalia/index.owl#Subfamily_Pantherinae"/>
        <owl:sameAs rdf:resource="http://lod.geospecies.org/families/gSvIP"/>
        <rdfs:isDefinedBy rdf:resource="http://lod.taxonconcept.org/ontology/p01/Mammalia/index.owl"/>
    </owl:Class>


And
txn_nomen:narrowerRank
txn_nomen:broaderRank

These are used this way:

    <owl:Class rdf:about="http://lod.taxonconcept.org/ontology/taxnomen/index.owl#Rank_Family">
        <rdfs:label xml:lang="en">Rank Family</rdfs:label>
        <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#Thing"/>
        <rdf:type rdf:resource="http://lod.taxonconcept.org/ontology/taxnomen/index.owl#TaxonRank"/>
        <txn_nomen:narrowerRank rdf:resource="http://lod.taxonconcept.org/ontology/taxnomen/index.owl#Subfamily"/>
        <txn_nomen:broaderRank rdf:resource="http://lod.taxonconcept.org/ontology/taxnomen/index.owl#Superfamily"/>
        <owl:equivalentProperty rdf:resource="http://purl.org/ontology/wo/Family"/>
        <rdfs:seeAlso rdf:resource="http://en.wikipedia.org/wiki/Family_%28biology%29"/>
        <rdfs:seeAlso rdf:resource="http://www.bbc.co.uk/nature/family"/>
        <vs:term_status>testing</vs:term_status>
        <rdfs:isDefinedBy rdf:resource="http://lod.taxonconcept.org/ontology/taxnomen/index.owl#index.owl"/>
    </owl:Class>
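
On the subproperty question, one consequence is worth spelling out with a rough Python sketch. It assumes that txn_nomen:broaderTaxon is declared rdfs:subPropertyOf skos:broader and txn_nomen:narrowerTaxon rdfs:subPropertyOf skos:narrower (my reading of the draft, not something quoted above), and the function name with_superproperties is hypothetical. A consumer that applies RDFS subproperty entailment would still see the generic SKOS links; one that does not would see only the txn_nomen: ones.

```python
# Assumed rdfs:subPropertyOf declarations in the draft ontology.
SUBPROPERTY_OF = {
    "txn_nomen:broaderTaxon": "skos:broader",
    "txn_nomen:narrowerTaxon": "skos:narrower",
}

# Triples as asserted in the Family_Felidae example; prefixes stand in for URIs.
ASSERTED = [
    ("txn_mammalia:Family_Felidae", "txn_nomen:broaderTaxon", "txn_mammalia:Suborder_Feliformia"),
    ("txn_mammalia:Family_Felidae", "txn_nomen:narrowerTaxon", "txn_mammalia:Subfamily_Felinae"),
    ("txn_mammalia:Family_Felidae", "txn_nomen:narrowerTaxon", "txn_mammalia:Subfamily_Pantherinae"),
]


def with_superproperties(triples):
    """Add the triples entailed by rdfs:subPropertyOf (one level is enough here)."""
    inferred = set(triples)
    for s, p, o in triples:
        if p in SUBPROPERTY_OF:
            inferred.add((s, SUBPROPERTY_OF[p], o))
    return inferred


closure = with_superproperties(ASSERTED)

# A query over plain skos:broader succeeds only against the inferred set.
target = ("txn_mammalia:Family_Felidae", "skos:broader", "txn_mammalia:Suborder_Feliformia")
print(target in set(ASSERTED), target in closure)
```

So the more specific predicates cost nothing for clients with inference turned on, but clients querying raw triples for skos:broader would need the SKOS statements materialized alongside them.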

Respectfully,

- Pete

P.S. Taxonomic classification ontologies like the ones listed above for
mammals will change over time as additional species are discovered and
their phylogeny is better understood. What would be the best practice for
handling changes like this?


-- 
------------------------------------------------------------------------------------
Pete DeVries
Department of Entomology
University of Wisconsin - Madison
445 Russell Laboratories
1630 Linden Drive
Madison, WI 53706
Email: pdevries@wisc.edu
TaxonConcept <http://www.taxonconcept.org/> &
GeoSpecies <http://about.geospecies.org/> Knowledge Bases
A Semantic Web, Linked Open Data <http://linkeddata.org/> Project
--------------------------------------------------------------------------------------

Received on Wednesday, 25 January 2012 22:27:49 UTC