Re: Modeling Taxonomic Classifications in a World where a given Species can have many Classifications

Hi Jerven,

Thank you for your response. Your reasoning makes sense to me and I like
the move to skos:broader and skos:narrower.

The problem I have with subclassing is that some groups have made
owl:sameAs links between *txn* concepts and subclassed concepts.

Under OWL semantics, those links then place the *txn* concepts inside
their rdfs:subClassOf hierarchy wherever the data is merged in the LOD.

It is in this regard that I see them as potentially error-prone.
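
To make the entailment concrete, here is a minimal Python sketch of my own.
It is only a toy forward-chainer over a simplified subset of the OWL/RDFS
rules, not a real reasoner, and the "ext:" names are hypothetical stand-ins
for a dataset that models taxa as nested subclasses.

```python
# Toy forward-chaining over (subject, predicate, object) tuples.
# The "ext:" names are hypothetical stand-ins for a subclassed dataset.
SAME_AS = "owl:sameAs"
SUB_CLASS_OF = "rdfs:subClassOf"
CLOSE_MATCH = "skos:closeMatch"


def entail(triples):
    """Apply sameAs symmetry/substitution and subClassOf transitivity
    until no new triples appear (a simplified subset of OWL/RDFS rules)."""
    closed = set(triples)
    while True:
        new = set()
        for s, p, o in closed:
            if p == SAME_AS:
                new.add((o, SAME_AS, s))  # symmetry
                for s2, p2, o2 in closed:
                    if p2 == SAME_AS:
                        continue
                    if s2 == s:  # copy statements about s onto o
                        new.add((o, p2, o2))
                    if o2 == s:  # copy statements pointing at s onto o
                        new.add((s2, p2, o))
            elif p == SUB_CLASS_OF:
                for s2, p2, o2 in closed:
                    if p2 == SUB_CLASS_OF and s2 == o:  # transitivity
                        new.add((s, SUB_CLASS_OF, o2))
        if new <= closed:
            return closed
        closed |= new


hierarchy = {
    ("ext:Felidae", SUB_CLASS_OF, "ext:Feliformia"),
    ("ext:Feliformia", SUB_CLASS_OF, "ext:Carnivora"),
}
txn = "txn:Family_Felidae"

with_same_as = entail(hierarchy | {(txn, SAME_AS, "ext:Felidae")})
with_close_match = entail(hierarchy | {(txn, CLOSE_MATCH, "ext:Felidae")})

inherited = (txn, SUB_CLASS_OF, "ext:Carnivora")
print(inherited in with_same_as)      # does the txn concept join the hierarchy?
print(inherited in with_close_match)  # or does the link stay a plain assertion?
```

With the owl:sameAs link the *txn* concept picks up every superclass of the
external concept; with skos:closeMatch the link carries no such consequence.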

Ideally this linking would be done with something similar to owl:sameAs
but without the entailment.

For now I think the best alternative is skos:closeMatch.

In some use cases these two linked entities can be interpreted as the same
thing, but for other uses it might be best to consider them "different
things".

Until there are more nuanced versions of owl:sameAs, I think that
skos:closeMatch allows end users to treat these linked entities as they
see fit.
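
As a rough illustration of what "as they see fit" could look like on the
consumer side, here is a small Python sketch. The function name and the
merge flag are hypothetical; the links are the two skos:closeMatch
statements from the Family_Felidae entry quoted below. The lookup goes one
hop only, since skos:closeMatch is symmetric but not transitive.

```python
CLOSE_MATCH = "skos:closeMatch"
FELIDAE = "http://lod.taxonconcept.org/ontology/p01/Mammalia/index.owl#Family_Felidae"

# closeMatch links taken from the Family_Felidae description.
LINKS = [
    (FELIDAE, CLOSE_MATCH, "http://purl.uniprot.org/taxonomy/9681"),
    (FELIDAE, CLOSE_MATCH, "http://dbpedia.org/resource/Felidae"),
]


def equivalents(uri, links, merge_close_matches):
    """Return the URIs this consumer treats as denoting the same thing.

    Hypothetical helper: each application decides for itself whether a
    closeMatch is close enough to merge. Only direct links are followed.
    """
    found = {uri}
    if merge_close_matches:
        for s, p, o in links:
            if p != CLOSE_MATCH:
                continue
            if s == uri:
                found.add(o)
            elif o == uri:
                found.add(s)
    return found


# A strict consumer keeps the entities apart; a lenient one merges them.
strict = equivalents(FELIDAE, LINKS, merge_close_matches=False)
lenient = equivalents(FELIDAE, LINKS, merge_close_matches=True)
print(len(strict), len(lenient))
```

The point of the sketch is that the decision lives in the application, not
in the data, which is what owl:sameAs does not allow.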

I am glad you wrote and I would like to follow up in the future.

I am currently in Woods Hole, MA, working with EoL.org and
GlobalNames.org, so there might be some opportunities that we could
run past your group.

Also note that the species concepts are still experimental and would
probably benefit from your suggestions.

Thanks again,

- Pete



On Thu, Jan 26, 2012 at 7:46 AM, Jerven Bolleman <jerven.bolleman@isb-sib.ch> wrote:

> Hi Peter,
>
> It's interesting to see this discussion. I would like to give a short
> background on why we at UniProt used rdfs:subClassOf relations between
> taxon IDs.
> When this decision was made there were no property paths yet, but there
> was RDFS inferencing. So the only way one could query for all bacterial
> proteins was to make each bacterial species a subClassOf the Bacteria
> taxon.
> e.g. select ?protein where {?protein :organism ?taxon . ?taxon
> rdfs:subClassOf taxon:2}
>
> Now that there are property paths we no longer need RDFS inferencing to
> answer these kinds of questions.
> i.e. select ?protein where {?protein :organism ?taxon . ?taxon
> skos:broader+ taxon:2}
>
> We could actually move away from using rdfs:subClassOf if we have a good
> use case for this.
> You can already see this in the current release of UniProt, where we
> introduced skos:narrower into the taxonomy relations; in the next release
> we will add the skos:broader links.
>
> Peter, I am baffled by one statement you made: why does the use of
> rdfs:subClassOf relations make correct linking error prone?
>
> Regards,
> Jerven Bolleman
>
> On Jan 25, 2012, at 11:27 PM, Peter DeVries wrote:
>
> > Hi,
> >
> > I have been trying to figure out the best way to deal with the following
> > problem.
> >
> > There are entities that we see as "species". (some argue if they are
> > real things or simply an artificial human construct.)
> >
> > I think that in general the species themselves see them as real and do a
> > pretty good job identifying other members of the same species.
> >
> > Putting that entire debate aside, we still need some way to deal with
> > the idea of a species as a typological construct so one can say things like.
> >
> > This species was observed at this geolocation or There have been X
> > number of bird species observed in this natural area.
> >
> > Names change over time, and the same name string can be used for
> > different animal / plant species.
> >
> > So that is why I created LOD entities like these
> >
> > http://lod.taxonconcept.org/ses/iuCXz.html  ( http://lod.taxonconcept.org/ses/iuCXz.rdf )
> >
> > http://lod.taxonconcept.org/ses/v6n7p.html  ( http://lod.taxonconcept.org/ses/v6n7p.rdf )
> >
> > Since moving to this new model from my earlier GeoSpecies, I have been
> > trying to figure out how to deal with the following issue.
> >
> > A species can have multiple classifications. You can see this when you
> > compare many of the species in DBpedia to those in the NCBI taxonomy
> > (uniprot, bio2rdf)
> >
> > Uniprot and Bio2RDF model these as nested subclasses which makes correct
> > linking error prone.
> >
> > I think a better way to think of this: there are species and different
> > groups choose to organise them into classifications differently.
> >
> > So rather than organize these into nested subclasses, I am thinking
> > about the following pattern.
> >
> > Puma concolor
> > txn:inGenus txn_mammalia_genera:Genus_Puma
> > txn:inFamily txn_mammalia:Family_Felidae
> > txn:inOrder  txn_mammalia:Order_Carnivora
> >
> > You can see this in this file
> > http://lod.taxonconcept.org/ontology/p01/Mammalia/species.owl
> >
> >     <owl:Class rdf:about="http://lod.taxonconcept.org/ses/v6n7p#Species">
> >         <txn:inClass rdf:resource="http://lod.taxonconcept.org/ontology/p01/Mammalia/index.owl#Class_Mammalia"/>
> >         <txn:inOrder rdf:resource="http://lod.taxonconcept.org/ontology/p01/Mammalia/index.owl#Order_Carnivora"/>
> >         <txn:inFamily rdf:resource="http://lod.taxonconcept.org/ontology/p01/Mammalia/index.owl#Family_Felidae"/>
> >         <txn:inGenus rdf:resource="http://lod.taxonconcept.org/ontology/p01/Mammalia/genera.owl#Genus_Puma"/>
> >         <rdfs:isDefinedBy rdf:resource="http://lod.taxonconcept.org/ontology/p01/Mammalia/species.owl"/>
> >     </owl:Class>
> >
> > And here http://lod.taxonconcept.org/ontology/p01/Mammalia/index.owl
> >
> >     <owl:Class rdf:about="http://lod.taxonconcept.org/ontology/p01/Mammalia/index.owl#Family_Felidae">
> >         <rdfs:label>Family Felidae</rdfs:label>
> >         <rdf:type rdf:resource="http://lod.taxonconcept.org/ontology/p01/Mammalia/index.owl#Mammal_Family"/>
> >         <txn:commonName>Cats</txn:commonName>
> >         <skos:closeMatch rdf:resource="http://purl.uniprot.org/taxonomy/9681"/>
> >         <skos:closeMatch rdf:resource="http://dbpedia.org/resource/Felidae"/>
> >         <txn:hasWikipediaArticle rdf:resource="http://en.wikipedia.org/wiki/Felidae"/>
> >         <skos:broader rdf:resource="http://lod.taxonconcept.org/ontology/p01/Mammalia/index.owl#Suborder_Feliformia"/>
> >         <skos:narrower rdf:resource="http://lod.taxonconcept.org/ontology/p01/Mammalia/index.owl#Subfamily_Felinae"/>
> >         <skos:narrower rdf:resource="http://lod.taxonconcept.org/ontology/p01/Mammalia/index.owl#Subfamily_Pantherinae"/>
> >         <owl:sameAs rdf:resource="http://lod.geospecies.org/families/gSvIP"/>
> >         <rdfs:isDefinedBy rdf:resource="http://lod.taxonconcept.org/ontology/p01/Mammalia/index.owl"/>
> >     </owl:Class>
> >
> > This allows SPARQL queries like the one here http://bit.ly/qssZOG based
> > on my classification without breaking queries where the link is to DBpedia
> > via a different predicate.
> >
> > For now I have simply linked these broadly to DBpedia using the following
> >
> > <txn:inDBpediaClade rdf:resource="http://dbpedia.org/ontology/Mammal"/>
> > * I use clade because these don't always match Order => Order etc.
> >
> > I think this pattern allows a given species to exist in several
> > classifications, and allows those interested to move up and down the
> > taxonomy - all without breaking things in the LOD.
> >
> > I thought I would ask the list what they thought of this before I do
> > much more.
> >
> > I was also wondering whether it would be better for me to use the
> > subproperties of skos that I have created in this draft ontology.
> >
> > http://lod.taxonconcept.org/ontology/taxnomen/index.owl
> >
> > Such as:
> >  txn_nomen:narrowerTaxon
> >  txn_nomen:broaderTaxon
> >
> > Which would be used this way
> >
> >     <owl:Class rdf:about="http://lod.taxonconcept.org/ontology/p01/Mammalia/index.owl#Family_Felidae">
> >         <rdfs:label>Family Felidae</rdfs:label>
> >         <rdf:type rdf:resource="http://lod.taxonconcept.org/ontology/p01/Mammalia/index.owl#Mammal_Family"/>
> >         <txn:commonName>Cats</txn:commonName>
> >         <skos:closeMatch rdf:resource="http://purl.uniprot.org/taxonomy/9681"/>
> >         <skos:closeMatch rdf:resource="http://dbpedia.org/resource/Felidae"/>
> >         <txn:hasWikipediaArticle rdf:resource="http://en.wikipedia.org/wiki/Felidae"/>
> >         <txn_nomen:broaderTaxon rdf:resource="http://lod.taxonconcept.org/ontology/p01/Mammalia/index.owl#Suborder_Feliformia"/>
> >         <txn_nomen:narrowerTaxon rdf:resource="http://lod.taxonconcept.org/ontology/p01/Mammalia/index.owl#Subfamily_Felinae"/>
> >         <txn_nomen:narrowerTaxon rdf:resource="http://lod.taxonconcept.org/ontology/p01/Mammalia/index.owl#Subfamily_Pantherinae"/>
> >         <owl:sameAs rdf:resource="http://lod.geospecies.org/families/gSvIP"/>
> >         <rdfs:isDefinedBy rdf:resource="http://lod.taxonconcept.org/ontology/p01/Mammalia/index.owl"/>
> >     </owl:Class>
> >
> >
> > And
> > txn_nomen:narrowerRank
> > txn_nomen:broaderRank
> >
> > Which is used this way
> >
> >     <owl:Class rdf:about="http://lod.taxonconcept.org/ontology/taxnomen/index.owl#Rank_Family">
> >         <rdfs:label xml:lang="en">Rank Family</rdfs:label>
> >         <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#Thing"/>
> >         <rdf:type rdf:resource="http://lod.taxonconcept.org/ontology/taxnomen/index.owl#TaxonRank"/>
> >         <txn_nomen:narrowerRank rdf:resource="http://lod.taxonconcept.org/ontology/taxnomen/index.owl#Subfamily"/>
> >         <txn_nomen:broaderRank rdf:resource="http://lod.taxonconcept.org/ontology/taxnomen/index.owl#Superfamily"/>
> >         <owl:equivalentProperty rdf:resource="http://purl.org/ontology/wo/Family"/>
> >         <rdfs:seeAlso rdf:resource="http://en.wikipedia.org/wiki/Family_%28biology%29"/>
> >         <rdfs:seeAlso rdf:resource="http://www.bbc.co.uk/nature/family"/>
> >         <vs:term_status>testing</vs:term_status>
> >         <rdfs:isDefinedBy rdf:resource="http://lod.taxonconcept.org/ontology/taxnomen/index.owl#index.owl"/>
> >     </owl:Class>
> >
> > Respectfully,
> >
> > - Pete
> >
> > P.S. Taxonomic Classification Ontologies like the ones listed above for
> > mammals will change over time as additional species are discovered and
> > their phylogeny is better understood.
> > What would be the best practices to handle things like this?
> >
> >
> > --
> > ------------------------------------------------------------------------------------
> > Pete DeVries
> > Department of Entomology
> > University of Wisconsin - Madison
> > 445 Russell Laboratories
> > 1630 Linden Drive
> > Madison, WI 53706
> > Email: pdevries@wisc.edu
> > TaxonConcept  &  GeoSpecies Knowledge Bases
> > A Semantic Web, Linked Open Data  Project
> > --------------------------------------------------------------------------------------
>
>


-- 
------------------------------------------------------------------------------------
Pete DeVries
Department of Entomology
University of Wisconsin - Madison
445 Russell Laboratories
1630 Linden Drive
Madison, WI 53706
Email: pdevries@wisc.edu
TaxonConcept <http://www.taxonconcept.org/> & GeoSpecies <http://about.geospecies.org/> Knowledge Bases
A Semantic Web, Linked Open Data <http://linkeddata.org/> Project
--------------------------------------------------------------------------------------

Received on Thursday, 26 January 2012 15:40:34 UTC