Re: Next step for biodiversity terms

From: Quentin Groom <quentin.groom@plantentuinmeise.be>
Date: Sat, 15 Feb 2020 11:14:08 +0100
Message-ID: <CALr=EE2AuSHJivCYsvZZ80zzVx2f_GrxJhyGAUuffrTLbErsUg@mail.gmail.com>
To: Carl Boettiger <cboettig@gmail.com>
Cc: Franck Michel <franck.michel@cnrs.fr>, "LJ.Garcia" <lj.garcia.co@gmail.com>, "Gray, Alasdair J G" <A.J.G.Gray@hw.ac.uk>, "public-bioschemas@w3.org" <public-bioschemas@w3.org>, Robert Guralnick <robgur@gmail.com>
Hi Carl, Franck, Alasdair and all,
at least for me, the taxonName term was created to support findability for
taxonomic names registries, such as Zoobank, Mycobank and IPNI. As these
databases do not keep track of taxa they would be poorly supported by the
use of a taxon term in place of a taxonName term. Having said that, I would
avoid modelling biological taxonomy and nomenclature in bioschemas, because
it's quite a minefield. Therefore, I would keep the relationship between
taxon and taxonName as simple as possible. It should be simple enough to
support findability of resources on the internet, but it is never going to
be rich enough to support an understanding of the nuances of taxonomic
concepts and their interrelationships with taxonNames.
For me, one would use taxonName when your data relates to the publication
and typification of a name, but use taxon when your data is primarily about
the traits of the taxon and other biological features. Clearly, there are
overlaps. I particularly see either option being useful for specimens, but
again it depends on the use case.
I'm not sure if this helps the discussion, but that's my 2 cents worth.
Quentin




On Fri, 14 Feb 2020 at 18:37, Carl Boettiger <cboettig@gmail.com> wrote:

> Hi Franck,
>
> Thanks for the detailed reply and please let me know if we should move
> this discussion over to a GitHub Issue?  Apologies I wasn't up to speed on
> the more recent discussions than what is on the bioschemas website.
>
> I have reviewed the threads you link and I very much share the
> sentiments and objectives you have all voiced there and in this thread
> (avoid the debates, leverage existing schema.org vocab whenever
> possible).  Unfortunately, I'm afraid the new proposals sound quite
> confusing.  It seems the proposal to create a new `TaxonName` implicitly
> means that `Taxon` is supposed to effectively mean "TaxonConcept"?  I agree
> TaxonConcept is not an area of consensus, and its main purpose is to allow
> for discussion in a world where different authorities have
> conflicting/overlapping notions of TaxonConcept, and I'm really not sure we
> want to go that route.
>
> If Taxon is not meant as "the concept of taxon" then I don't see how it is
> different from a TaxonName.  (This is made even more confusing by the fact
> that "name" is also a Property of a taxon).   I think this new proposal is
> much more confusing than the original!  I acknowledge that the "Concept" of
> a Taxon is different than a name, but I think we would be better off not
> attempting to define a class/Type for "TaxonConcept" (since afaik the
> experts haven't done that), and we should let the proposal of "@type":
> "schema:Taxon" mean a name, which is how most people see it.  (At its
> simplest, we should think of "Taxon" as merely a name/label we apply to an
> individual specimen, and not worry about defining the 'class of all such
> specimens'.)
>
>
>
> Defining the inverse pair `hasSynonym` & `synonymOf` sounds reasonable,
> though I do worry a bit about the complexity.  That is, taxonomically,
> `hasSynonym` implies it is property of an "accepted name", while
> `synonymOf` sounds like a property of "the synonym", but in English
> "synonyms" are symmetric, there's no "accepted" one.  I wonder if
> (paralleling the darwin core terms) it would be better to use the optional
> property "acceptedName" (and not define an inverse property).
>
> {
>   "@type": "Taxon",
>   "name": "Rollandia micropterum",
>   "@id": "http://www.itis.gov/servlet/SingleRpt/SingleRpt?search_topic=TSN&search_value=1000254",
>   "acceptedName": {
>     "@type": "Taxon",
>     "name": "Rollandia microptera",
>     "@id": "http://www.itis.gov/servlet/SingleRpt/SingleRpt?search_topic=TSN&search_value=562791"
>   }
> }
>
> Does that make sense?
>
> Apologies, not trying to open a can of worms here, just aspiring to the
> same goals of avoiding debate and re-using existing terms!
>
> ---
> Carl Boettiger
> http://carlboettiger.info/
>
>
> On Fri, Feb 14, 2020 at 6:32 AM Franck Michel <franck.michel@cnrs.fr>
> wrote:
>
>> Dear Carl, Leyla (+ Quentin who shall certainly be interested in this),
>>
>> I agree that we should make an effort to better explain how the current
>> recommendation aligns with existing vocabularies, specifically Darwin Core.
>>
>> I'll try to describe how we can solve that. I'm sorry this email is
>> pretty long, but I don't know how to be clear and short at the same time ;)
>>
>> There have been quite some discussions in the beginning wrt. what the
>> Taxon term shall refer to: a taxon concept? A taxon name usage? etc. Even
>> experts do not always agree on the definition of those terms. So we agreed
>> on two principles:
>> - Bioschemas should not get into experts' debates, but instead remain at
>> a general level where there is consensus.
>> - we should create as few new terms as possible, that is: rely on
>> existing schema.org terms when relevant, and "import" existing terms
>> from other vocabularies when necessary (this is the Taxon *profile*
>> part).
>>
>> A taxon (instance of type Taxon) is associated with an accepted (or
>> valid) name (schema:name), 0 to any number of synonyms
>> (schema:alternateName), and identifiers from other DBs:
>>
>>     "@type" : "Taxon",
>>     "additionalType": [ "dwc:Taxon",
>>         "http://rs.tdwg.org/ontology/voc/TaxonConcept#TaxonConcept" ],
>>     "name": "Delphinapterus leucas (Pallas, 1776)",
>>     "alternateName": [ "Balaena albicans Muller, 1776",
>>         "Beluga catodon Gray, 1846" ],
>>     "identifier": [
>>         {   "@type": "PropertyValue",
>>             "name": "WoRMS id",
>>             "propertyID": "https://www.wikidata.org/entity/P850",
>>             "value": "137115"
>>         }
>>     ]
>>
>> In further discussions
>> <https://github.com/BioSchemas/specifications/issues/309>, we agreed
>> that modelling only taxa was not sufficient as some databases/portals
>> describe scientific names, not taxa. So we started defining the TaxonName
>> term
>> <https://docs.google.com/spreadsheets/d/1ZZxL6_9VvlDJCXMf_0JnIzyBHExxA6eFIiEDKr6gFqY/edit#gid=1261485211>
>> (which is not yet published on the web site, but I'm on it...). This term
>> allows giving more specific information about a name.
>> Hence the creation of two new properties schema:scientificName and
>> schema:alternateScientificName which are the counterparts of schema:name
>> and schema:alternateName, but with an object of type TaxonName instead of a
>> string. One would typically use either one pair of properties or the
>> other, but they might be used simultaneously:
>>
>>     "name": "Delphinapterus leucas (Pallas, 1776)",
>>     "alternateName": [ "Balaena albicans Muller, 1776" ],
>>
>>     "scientificName": {
>>         "@type" : "TaxonName",
>>         "name": "Delphinapterus leucas",
>>         "author": "(Pallas, 1776)"
>>     },
>>     "alternateScientificName": [
>>         {   "@type" : "TaxonName",
>>             "name": "Balaena albicans",
>>             "author": "Muller, 1776"
>>         }
>>     ]
>>
>> Now, how does this compare with Darwin Core? The problem is that Darwin Core
>> RDF terms describe names and name usages, not taxa. In the example you
>> provide:
>> { "taxonID": "ITIS:1000254",
>>   "scientificName": "Rollandia micropterum",
>>   "acceptedNameUsageID": "ITIS:562791",
>>   "taxonomicStatus": "synonym",
>>   "vernacularName": "Titicaca Grebe"
>> }
>>
>> "ITIS:1000254" actually represents a taxon's name which happens to be a
>> synonym of "ITIS:562791", therefore the need for acceptedNameUsageID and
>> taxonomicStatus.
>> With the Taxon and TaxonName terms, we could write the same thing by
>> first denoting a Taxon with an accepted name (scientificName) and a synonym
>> (alternateScientificName), like this:
>>
>>     "@type" : "Taxon",
>>     "scientificName": {
>>         "@type" : "TaxonName",
>>         "identifier": {
>>             "@type": "PropertyValue",
>>             "name": "ITIS id",
>>             "value": "562791"
>>         }
>>     },
>>     "alternateScientificName": [
>>         {   "@type" : "TaxonName",
>>             "name" : "Rollandia micropterum",
>>             "identifier": {
>>                 "@type": "PropertyValue",
>>                 "name": "ITIS id",
>>                 "value": "1000254"
>>             }
>>         }
>>     ]
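
The mapping described above, from Darwin Core name records to a Taxon with an
accepted name and its synonyms, could be sketched in Python roughly as follows.
This is only an illustration under stated assumptions: the function names
dwc_to_taxa and taxon_name are hypothetical, identifiers are assumed to have the
form PREFIX:value (as in "ITIS:1000254"), and scientificName,
alternateScientificName and TaxonName are the terms proposed in this thread, not
published schema.org terms.

```python
def taxon_name(curie, name=None):
    """Build a TaxonName node from a 'PREFIX:value' identifier (assumed form)."""
    prefix, _, value = curie.partition(":")
    node = {"@type": "TaxonName"}
    if name:
        node["name"] = name
    node["identifier"] = {
        "@type": "PropertyValue",
        "name": f"{prefix} id",
        "value": value,
    }
    return node


def dwc_to_taxa(records):
    """Group Darwin Core name records into one Taxon node per accepted name."""
    taxa = {}
    for rec in records:
        # A record without acceptedNameUsageID is treated as its own accepted name.
        accepted_id = rec.get("acceptedNameUsageID") or rec["taxonID"]
        taxon = taxa.setdefault(accepted_id, {
            "@type": "Taxon",
            # Placeholder carrying only the identifier until the accepted
            # record itself is seen (it may never be).
            "scientificName": taxon_name(accepted_id),
            "alternateScientificName": [],
        })
        node = taxon_name(rec["taxonID"], rec.get("scientificName"))
        if rec["taxonID"] == accepted_id:
            taxon["scientificName"] = node
        else:
            taxon["alternateScientificName"].append(node)
    return list(taxa.values())


records = [{
    "taxonID": "ITIS:1000254",
    "scientificName": "Rollandia micropterum",
    "acceptedNameUsageID": "ITIS:562791",
    "taxonomicStatus": "synonym",
    "vernacularName": "Titicaca Grebe",
}]
taxa = dwc_to_taxa(records)
```

With the single record from the Darwin Core example, this yields one Taxon whose
scientificName carries only the ITIS id 562791 and whose
alternateScientificName holds "Rollandia micropterum", which is the same shape
as the example above.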
>>
>> Still, this seems a bit cumbersome since you just want to represent names
>> but you have to denote a Taxon.
>> So, one option could be to have a new pair of properties hasSynonym/synonymOf
>> to only denote relationships between TaxonName instances:
>>
>>     "@type" : "TaxonName",
>>     "name" : "Rollandia micropterum",
>>     "identifier": {
>>         "@type": "PropertyValue",
>>         "name": "ITIS id",
>>         "value": "1000254"
>>     },
>>     "synonymOf": {
>>         "@type" : "TaxonName",
>>         "identifier": {
>>             "@type": "PropertyValue",
>>             "name": "ITIS id",
>>             "value": "562791"
>>         }
>>     }
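
If only synonymOf were published, a consumer could still recover the hasSynonym
direction by inverting the links. A minimal Python sketch of that idea, assuming
each TaxonName node carries a single identifier object as in the example above
(the function name synonyms_by_accepted is hypothetical, and synonymOf is the
property proposed here, not an existing schema.org term):

```python
from collections import defaultdict


def synonyms_by_accepted(names):
    """Map the identifier value of each accepted name to its synonym labels."""
    index = defaultdict(list)
    for node in names:
        target = node.get("synonymOf")
        if target is not None:
            # Key on the identifier value of the name pointed to by synonymOf.
            index[target["identifier"]["value"]].append(node.get("name"))
    return dict(index)


names = [{
    "@type": "TaxonName",
    "name": "Rollandia micropterum",
    "identifier": {"@type": "PropertyValue", "name": "ITIS id", "value": "1000254"},
    "synonymOf": {
        "@type": "TaxonName",
        "identifier": {"@type": "PropertyValue", "name": "ITIS id", "value": "562791"},
    },
}]
inverse = synonyms_by_accepted(names)
```

Nodes without a synonymOf link are simply skipped, so accepted names with no
recorded synonyms do not appear in the result.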
>>
>> What do you think? Would that work for you?
>>
>> Franck.
>>
>> Le 13/02/2020 à 19:49, Carl Boettiger a écrit :
>>
>> Thanks!
>>
>> Yes, identifiers are of course the solution, the point is that you need
>> two different identifiers and you need to know which is which.  Here's a
>> quick DarwinCore example:
>>
>> {
>>   "taxonID": "ITIS:1000254",
>>   "scientificName": "Rollandia micropterum",
>>   "acceptedNameUsageID": "ITIS:562791",
>>   "taxonomicStatus": "synonym",
>>   "vernacularName": "Titicaca Grebe"
>> }
>>
>>
>> We don't need `taxonomicStatus` explicitly here, since it is implied by
>> seeing that the accepted ID (acceptedNameUsageID) is not the same thing as
>> the taxonID for this name.  But we do need two identifiers, and we need to
>> know which one is which.  It's not clear to me how the above would be
>> represented in the schema.org proposal.  (Of course one could say "don't
>> use synonyms!", but we may as well then say "don't use scientific names, just
>> use accepted identifiers". We live in a world that uses scientific names,
>> so we need a mechanism that can acknowledge that some names are synonyms.)
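
The point that taxonomicStatus is implied by the two identifiers can be made
concrete with a small Python sketch. It assumes a record is a plain dict keyed
by Darwin Core term names and collapses the status to just "accepted" or
"synonym"; the function name taxonomic_status is hypothetical, and Darwin Core
data may use a richer status vocabulary that this does not attempt to infer.

```python
def taxonomic_status(record):
    """Infer accepted/synonym by comparing taxonID with acceptedNameUsageID."""
    accepted = record.get("acceptedNameUsageID")
    # No accepted ID, or one equal to the record's own ID: the name is accepted.
    if accepted is None or accepted == record["taxonID"]:
        return "accepted"
    return "synonym"


record = {
    "taxonID": "ITIS:1000254",
    "scientificName": "Rollandia micropterum",
    "acceptedNameUsageID": "ITIS:562791",
    "vernacularName": "Titicaca Grebe",
}
status = taxonomic_status(record)
```

Here the two identifiers differ, so the record is classified as a synonym even
though taxonomicStatus was left out.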
>>
>> ---
>> Carl Boettiger
>> http://carlboettiger.info/
>>
>>
>> On Thu, Feb 13, 2020 at 9:58 AM LJ.Garcia <lj.garcia.co@gmail.com> wrote:
>>
>>> Hi Carl, Franck, all,
>>>
>>> @Carl, Franck is probably the best person to point you to
>>> discussions/reasons regarding the property names. I am not much aware of
>>> how synonyms are handled in Darwin Core so my question could be naïve
>>> but... would having different identifiers not help there? Identifiers in
>>> Bioschemas should be FAIR, so, even if the label is the same, the
>>> identifier should tell you better, shouldn't it? Regarding taxonomic
>>> concepts, again, Franck is the one that can answer better.
>>> @Franck, if necessary, further properties could be included at this
>>> point as the submission to schema.org still will take a bit. Also, if
>>> not done already, I would suggest to add examples per property so people
>>> understand better how to use them.
>>>
>>> Kind regards,
>>>
>>> On Wed, Feb 12, 2020 at 5:18 PM Carl Boettiger <cboettig@gmail.com>
>>> wrote:
>>>
>>>> Hi Alasdair,
>>>>
>>>> Thanks for the update and your work on this.  In the spirit of
>>>> demonstrating adoption, I think it would be great if the recommendation
>>>> reflected greater alignment with existing namespaces that are widely used
>>>> in taxonomy, such as Darwin Core, https://dwc.tdwg.org/terms/#taxon .
>>>>
>>>> I think this would greatly facilitate adoption.  For instance, the
>>>> current specification provides no mechanism to disambiguate synonyms (
>>>> https://dwc.tdwg.org/terms/#dwc:taxonomicStatus,
>>>> https://dwc.tdwg.org/terms/#dwc:acceptedNameUsageID) or taxonomic
>>>> concepts.  I'm also unclear on the utility of `childTaxon` and
>>>> `hasDefinedTerm` in the current bioschemas spec.  Apologies if I've missed
>>>> the boat on these discussions already, but these are certainly barriers to
>>>> me in using bioschemas over an existing namespace like Darwin Core.  (Also
>>>> cc'ing Rob Guralnick on this who has far more expertise than I in this area
>>>> and could speak more broadly to the potential for adoption of
>>>> https://bioschemas.org/types/Taxon/0.3-RELEASE-2019_11_18/)
>>>>
>>>> Cheers,
>>>>
>>>> Carl
>>>>
>>>>
>>>>
>>>> ---
>>>> Carl Boettiger
>>>> http://carlboettiger.info/
>>>>
>>>>
>>>> On Wed, Feb 12, 2020 at 4:04 AM Gray, Alasdair J G <A.J.G.Gray@hw.ac.uk>
>>>> wrote:
>>>>
>>>>> Hi Franck,
>>>>>
>>>>> Sorry for the slowness of my response, I have been off work for most
>>>>> of January and am now catching up with things.
>>>>>
>>>>> The status of getting things added to Schema.org is that we need to
>>>>> demonstrate usage of the deployed markup rather than just deployments of
>>>>> it. This is the focus of the latest ELIXIR sponsored project which will be
>>>>> aiming to demonstrate benefit of the markup within specific areas: rare
>>>>> disease, plants, intrinsically disordered proteins, and toxicology. This
>>>>> work will be running over the next 23 months.
>>>>>
>>>>> As such, we should not delay work on other types. So yes, we should
>>>>> progress the work on Taxon and TaxonName.
>>>>>
>>>>> The restructuring of the website that we conducted at the tail end of
>>>>> last year was motivated by making it clearer as to which profiles and types
>>>>> are released for general use and which are still under development.
>>>>>
>>>>> Best regards
>>>>>
>>>>> Alasdair
>>>>>
>>>>> On 11 Feb 2020, at 17:04, LJ.Garcia <lj.garcia.co@gmail.com> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> I am away this week so please allow me some extra days to have a look
>>>>> at this.
>>>>>
>>>>> Kind regards,
>>>>>
>>>>> On Saturday, February 8, 2020, Franck Michel <franck.michel@cnrs.fr>
>>>>> wrote:
>>>>>
>>>>>> Dear Alasdair and Leyla,
>>>>>>
>>>>>> I was wondering if you had time to check my last reply in issue 309
>>>>>> <https://github.com/BioSchemas/specifications/issues/309#issuecomment-576247584>.
>>>>>> I was suggesting that, if endorsing of the Taxon term by schema.org
>>>>>> is still going to take some time, what about trying to move directly to the
>>>>>> new pair (Taxon, TaxonName) that we have discussed since mid-2019?
>>>>>>
>>>>>> Any thoughts on this?
>>>>>>
>>>>>> Thx,
>>>>>>     Franck.
>>>>>>
>>>>>> --
>>>>>>
>>>>>>
>>>>>> Franck MICHEL - CNRS research engineer
>>>>>> Université Côte d’Azur, CNRS, Inria
>>>>>> I3S laboratory (UMR 7271)
>>>>>> franck.michel@cnrs.fr - +33 (0)4 8915 4277
>>>>>>
>>>>>>
>>>>>
>>>>>
Received on Saturday, 15 February 2020 10:15:02 UTC
