- From: Paul Wilton <paul.wilton@ontoba.com>
- Date: Thu, 26 Jan 2012 16:44:35 +0000
- To: Peter DeVries <pete.devries@gmail.com>
- Cc: Jerven Bolleman <jerven.bolleman@isb-sib.ch>, public-lod@w3.org, Anne Thessen <athessen@mbl.edu>
- Message-ID: <CALer3uYtEFPOPVvk=YRD=dWzpdHnvuE14F7vkuuGftn9cRb-zA@mail.gmail.com>
Hi Peter,

Doesn't your problem still exist when using SKOS?

- Using skos:broader to infer a hierarchy doesn't stop users from making
  sameAs relationships between two concepts at different depths of your
  taxonomy, and thus creating the same problem for you, only defined in
  SKOS rather than in a class hierarchy.

- Also, one thing to note with SKOS is that it is triple heavy, having
  both inverseOf relationships and deepish property inheritance
  (broaderTransitive <= semanticRelation). This means that in an OWL
  inferencing triple store you will materialise something like
  n * n * 6 triples (taxon width * depth * SKOS properties), which could
  turn out to be very large if your dataset/taxonomy is deep, as I
  imagine it is quite wide (a few million species?). It may not be a
  problem, but I thought it worth mentioning.

Sounds like a great project though :)

Kind regards,
Paul

Paul Wilton, Technical Architect
Ontoba Ltd <http://www.ontoba.com>
paul.wilton@ontoba.com

On Thu, Jan 26, 2012 at 3:39 PM, Peter DeVries <pete.devries@gmail.com> wrote:

> Hi Jerven,
>
> Thank you for your response. Your reasoning makes sense to me and I like
> the move to skos:broader and skos:narrower.
>
> The problem that I have with subClassing is that some groups have made
> sameAs links between *txn* concepts and subclassed concepts.
>
> This then entails the *txn* concepts within their subClass hierarchy in
> the LOD.
>
> So it is in this regard that I see them as potentially error prone.
>
> Ideally this linking should be done with something similar to sameAs but
> without entailment.
>
> For now I think the best alternative is skos:closeMatch.
>
> In some use cases these two linked entities can be interpreted as the same
> thing, but for other uses it might be best to consider them "different
> things".
>
> Until there are more nuanced versions of sameAs I think that
> skos:closeMatch allows end users to treat these linked entities as they see
> fit.
>
> I am glad you wrote and I would like to follow up in the future.
>
> I am currently in Woods Hole MA working with the EoL.org and
> GlobalNames.org and so there might be some opportunities that we could
> run past your group.
>
> Also note that the species concepts are still experimental and would
> probably benefit from your suggestions.
>
> Thanks Again,
>
> - Pete
>
>
>
> On Thu, Jan 26, 2012 at 7:46 AM, Jerven Bolleman <jerven.bolleman@isb-sib.ch> wrote:
>
>> Hi Peter,
>>
>> It's interesting to see this discussion. I would like to give a short
>> background on why we at UniProt used rdfs:subClassOf relations between
>> taxon ids.
>> When this decision was made there were no property paths yet but there
>> was RDFS inferencing. So the only way one could query for all bacterial
>> proteins was by having each bacteria species be a subClassOf the bacteria
>> kingdom thing.
>> e.g. select ?protein where {?protein :organism ?taxon . ?taxon rdfs:subClassOf taxon:2}
>>
>> Now that there are property paths we no longer need RDFS inferencing to
>> answer these kinds of questions.
>> i.e. select ?protein where {?protein :organism ?taxon . ?taxon skos:broader+ taxon:2}
>>
>> We could actually move away from using rdfs:subClassOf, if we have a good
>> use case for this.
>> You can actually see that in this release of UniProt, where we introduced
>> skos:narrower into the taxonomy relations; next release we will add the
>> skos:broader links.
>>
>> Peter, I am baffled by one statement you made: why does the use of
>> rdfs:subClassOf relations make correct linking error prone?
>>
>> Regards,
>> Jerven Bolleman
>>
>> On Jan 25, 2012, at 11:27 PM, Peter DeVries wrote:
>>
>> > Hi,
>> >
>> > I have been trying to figure out the best way to deal with the following problem.
>> >
>> > There are entities that we see as "species". (some argue if they are real things or simply an artificial human construct.)
>> >
>> > I think that in general the species themselves see them as real and do a pretty good job identifying other members of the same species.
>> >
>> > Putting that entire debate aside, we still need some way to deal with the idea of a species as a typological construct so one can say things like.
>> >
>> > This species was observed at this geolocation or There have been X number of bird species observed in this natural area.
>> >
>> > Names change over time, and the same name string can be used for different animal / plant species.
>> >
>> > So that is why I created LOD entities like these
>> >
>> > http://lod.taxonconcept.org/ses/iuCXz.html ( http://lod.taxonconcept.org/ses/iuCXz.rdf )
>> >
>> > http://lod.taxonconcept.org/ses/v6n7p.html ( http://lod.taxonconcept.org/ses/v6n7p.rdf )
>> >
>> > Since moving to this new model from my earlier GeoSpecies, I have been trying to figure out how to deal with the following issue.
>> >
>> > A species can have multiple classifications. You can see this when you compare many of the species in DBpedia to those in the NCBI taxonomy (uniprot, bio2rdf)
>> >
>> > Uniprot and Bio2RDF model these as nested subclasses which makes correct linking error prone.
>> >
>> > I think a better way to think of this: there are species and different groups choose to organise them into classifications differently.
>> >
>> > So rather than organize these into nested subclasses, I am thinking about the following pattern.
>> >
>> > Puma concolor
>> >    txn:inGenus txn_mammalia_genera:Genus_Puma
>> >    txn:inFamily txn_mammalia:Family_Felidae
>> >    txn:inOrder txn_mammalia:Order_Carnivora
>> >
>> > You can see this in this file http://lod.taxonconcept.org/ontology/p01/Mammalia/species.owl
>> >
>> > <owl:Class rdf:about="http://lod.taxonconcept.org/ses/v6n7p#Species">
>> >   <txn:inClass rdf:resource="http://lod.taxonconcept.org/ontology/p01/Mammalia/index.owl#Class_Mammalia"/>
>> >   <txn:inOrder rdf:resource="http://lod.taxonconcept.org/ontology/p01/Mammalia/index.owl#Order_Carnivora"/>
>> >   <txn:inFamily rdf:resource="http://lod.taxonconcept.org/ontology/p01/Mammalia/index.owl#Family_Felidae"/>
>> >   <txn:inGenus rdf:resource="http://lod.taxonconcept.org/ontology/p01/Mammalia/genera.owl#Genus_Puma"/>
>> >   <rdfs:isDefinedBy rdf:resource="http://lod.taxonconcept.org/ontology/p01/Mammalia/species.owl"/>
>> > </owl:Class>
>> >
>> > And here http://lod.taxonconcept.org/ontology/p01/Mammalia/index.owl
>> >
>> > <owl:Class rdf:about="http://lod.taxonconcept.org/ontology/p01/Mammalia/index.owl#Family_Felidae">
>> >   <rdfs:label>Family Felidae</rdfs:label>
>> >   <rdf:type rdf:resource="http://lod.taxonconcept.org/ontology/p01/Mammalia/index.owl#Mammal_Family"/>
>> >   <txn:commonName>Cats</txn:commonName>
>> >   <skos:closeMatch rdf:resource="http://purl.uniprot.org/taxonomy/9681"/>
>> >   <skos:closeMatch rdf:resource="http://dbpedia.org/resource/Felidae"/>
>> >   <txn:hasWikipediaArticle rdf:resource="http://en.wikipedia.org/wiki/Felidae"/>
>> >   <skos:broader rdf:resource="http://lod.taxonconcept.org/ontology/p01/Mammalia/index.owl#Suborder_Feliformia"/>
>> >   <skos:narrower rdf:resource="http://lod.taxonconcept.org/ontology/p01/Mammalia/index.owl#Subfamily_Felinae"/>
>> >   <skos:narrower rdf:resource="http://lod.taxonconcept.org/ontology/p01/Mammalia/index.owl#Subfamily_Pantherinae"/>
>> >   <owl:sameAs rdf:resource="http://lod.geospecies.org/families/gSvIP"/>
>> >   <rdfs:isDefinedBy rdf:resource="http://lod.taxonconcept.org/ontology/p01/Mammalia/index.owl"/>
>> > </owl:Class>
>> >
>> > This allows SPARQL queries like the one here http://bit.ly/qssZOG based on my classification without breaking queries where the link is to DBpedia via a different predicate.
>> >
>> > For now I have simply linked these broadly to DBpedia using the following
>> >
>> > <txn:inDBpediaClade rdf:resource="http://dbpedia.org/ontology/Mammal"/>
>> > *I use clade because these don't always match Order => Order etc.
>> >
>> > I think this pattern allows a given species to exist in several classifications, and allows those interested to move up and down the taxonomy - all without breaking things in the LOD.
>> >
>> > I thought I would ask the list what they thought of this before I do much more?
>> >
>> > I was also wondering if it would be better for me to use subproperties of skos that I have created in this draft ontology?
>> >
>> > http://lod.taxonconcept.org/ontology/taxnomen/index.owl
>> >
>> > Such as:
>> >    txn_nomen:narrowerTaxon
>> >    txn_nomen:broaderTaxon
>> >
>> > Which would be used this way
>> >
>> > <owl:Class rdf:about="http://lod.taxonconcept.org/ontology/p01/Mammalia/index.owl#Family_Felidae">
>> >   <rdfs:label>Family Felidae</rdfs:label>
>> >   <rdf:type rdf:resource="http://lod.taxonconcept.org/ontology/p01/Mammalia/index.owl#Mammal_Family"/>
>> >   <txn:commonName>Cats</txn:commonName>
>> >   <skos:closeMatch rdf:resource="http://purl.uniprot.org/taxonomy/9681"/>
>> >   <skos:closeMatch rdf:resource="http://dbpedia.org/resource/Felidae"/>
>> >   <txn:hasWikipediaArticle rdf:resource="http://en.wikipedia.org/wiki/Felidae"/>
>> >   <txn_nomen:broaderTaxon rdf:resource="http://lod.taxonconcept.org/ontology/p01/Mammalia/index.owl#Suborder_Feliformia"/>
>> >   <txn_nomen:narrowerTaxon rdf:resource="http://lod.taxonconcept.org/ontology/p01/Mammalia/index.owl#Subfamily_Felinae"/>
>> >   <txn_nomen:narrowerTaxon rdf:resource="http://lod.taxonconcept.org/ontology/p01/Mammalia/index.owl#Subfamily_Pantherinae"/>
>> >   <owl:sameAs rdf:resource="http://lod.geospecies.org/families/gSvIP"/>
>> >   <rdfs:isDefinedBy rdf:resource="http://lod.taxonconcept.org/ontology/p01/Mammalia/index.owl"/>
>> > </owl:Class>
>> >
>> >
>> > And
>> >    txn_nomen:narrowerRank
>> >    txn_nomen:broaderRank
>> >
>> > Which is used this way
>> >
>> > <owl:Class rdf:about="http://lod.taxonconcept.org/ontology/taxnomen/index.owl#Rank_Family">
>> >   <rdfs:label xml:lang="en">Rank Family</rdfs:label>
>> >   <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#Thing"/>
>> >   <rdf:type rdf:resource="http://lod.taxonconcept.org/ontology/taxnomen/index.owl#TaxonRank"/>
>> >   <txn_nomen:narrowerRank rdf:resource="http://lod.taxonconcept.org/ontology/taxnomen/index.owl#Subfamily"/>
>> >   <txn_nomen:broaderRank rdf:resource="http://lod.taxonconcept.org/ontology/taxnomen/index.owl#Superfamily"/>
>> >   <owl:equivalentProperty rdf:resource="http://purl.org/ontology/wo/Family"/>
>> >   <rdfs:seeAlso rdf:resource="http://en.wikipedia.org/wiki/Family_%28biology%29"/>
>> >   <rdfs:seeAlso rdf:resource="http://www.bbc.co.uk/nature/family"/>
>> >   <vs:term_status>testing</vs:term_status>
>> >   <rdfs:isDefinedBy rdf:resource="http://lod.taxonconcept.org/ontology/taxnomen/index.owl#index.owl"/>
>> > </owl:Class>
>> >
>> > Respectfully,
>> >
>> > - Pete
>> >
>> > P.S. Taxonomic Classification Ontologies like the ones listed above for mammals will change over time as additional species are discovered and their phylogeny is better understood.
>> > What would be the best practices to handle things like this?
>> >
>> >
>> > --
>> > ------------------------------------------------------------------------------------
>> > Pete DeVries
>> > Department of Entomology
>> > University of Wisconsin - Madison
>> > 445 Russell Laboratories
>> > 1630 Linden Drive
>> > Madison, WI 53706
>> > Email: pdevries@wisc.edu
>> > TaxonConcept & GeoSpecies Knowledge Bases
>> > A Semantic Web, Linked Open Data Project
>> > --------------------------------------------------------------------------------------
>>
>>
>
>
> --
>
> ------------------------------------------------------------------------------------
> Pete DeVries
> Department of Entomology
> University of Wisconsin - Madison
> 445 Russell Laboratories
> 1630 Linden Drive
> Madison, WI 53706
> Email: pdevries@wisc.edu
> TaxonConcept <http://www.taxonconcept.org/> & GeoSpecies <http://about.geospecies.org/> Knowledge Bases
> A Semantic Web, Linked Open Data <http://linkeddata.org/> Project
>
> --------------------------------------------------------------------------------------
>
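
The triple-count concern raised at the top of this message can be made concrete with a small sketch. The Python below is only an illustration, assuming a reasoner that applies the SKOS rules that skos:broader implies skos:broaderTransitive, which in turn implies skos:semanticRelation, together with the inverse properties skos:narrower and skos:narrowerTransitive. The function name materialise_skos and the toy chain of taxa are hypothetical and not taken from the thread. Under these assumptions each directly asserted broader link yields six triples, which is one plausible reading of the factor 6 in the estimate, and every further ancestor adds four more.

```python
def materialise_skos(broader_edges):
    """Return the set of (s, p, o) triples entailed by asserted skos:broader links.

    broader_edges is an iterable of (narrower_concept, broader_concept) pairs.
    Only the broader/narrower family of SKOS properties is modelled here.
    """
    edges = list(broader_edges)

    # Index the asserted links: concept -> its direct broader concepts.
    parents = {}
    for child, parent in edges:
        parents.setdefault(child, set()).add(parent)

    def ancestors(node):
        # Transitive closure of skos:broader, starting from one concept.
        seen, stack = set(), list(parents.get(node, ()))
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.add(current)
                stack.extend(parents.get(current, ()))
        return seen

    triples = set()
    for child, parent in edges:
        triples.add((child, "skos:broader", parent))
        triples.add((parent, "skos:narrower", child))  # inverse property

    concepts = {concept for edge in edges for concept in edge}
    for concept in concepts:
        for ancestor in ancestors(concept):
            triples.add((concept, "skos:broaderTransitive", ancestor))
            triples.add((ancestor, "skos:narrowerTransitive", concept))
            # Both transitive properties are subproperties of semanticRelation.
            triples.add((concept, "skos:semanticRelation", ancestor))
            triples.add((ancestor, "skos:semanticRelation", concept))
    return triples


# Toy chain of five taxa (hypothetical identifiers), four asserted links.
TOY_CHAIN = [
    ("Subfamily_Felinae", "Family_Felidae"),
    ("Family_Felidae", "Suborder_Feliformia"),
    ("Suborder_Feliformia", "Order_Carnivora"),
    ("Order_Carnivora", "Class_Mammalia"),
]

if __name__ == "__main__":
    # 4 asserted links become 48 materialised triples for this chain.
    print(len(TOY_CHAIN), "asserted ->", len(materialise_skos(TOY_CHAIN)), "materialised")
```

For a single chain the materialised count grows roughly with the square of the depth, because every concept is related to every one of its ancestors. A wide taxonomy multiplies that cost by the number of leaves, which is the effect the estimate points at. A store that answers skos:broader+ with SPARQL property paths at query time, as Jerven describes, avoids materialising these triples at all.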
Received on Thursday, 26 January 2012 16:45:07 UTC