- From: Franck Michel <franck.michel@cnrs.fr>
- Date: Tue, 12 Jun 2018 11:02:03 +0200
- To: LJ Garcia Castro <ljgarcia@ebi.ac.uk>, Ricardo Arcila <ricartomojo@gmail.com>
- Cc: public-bioschemas@w3.org, "Gray, Alasdair J G" <A.J.G.Gray@hw.ac.uk>, "Rafael C. Jimenez" <rafael.jimenez@elixir-europe.org>, "Carole Goble (carole.goble@manchester.ac.uk)" <carole.goble@manchester.ac.uk>
- Message-ID: <7847bbfe-4b8b-50bc-c5e2-6fa8a0dcc78d@cnrs.fr>
Dear Ricardo and Leyla,
I just made a pull request, and I created a Biodiversity specification
folder on Google Drive. Let me know if anything is not right. I've set
myself as the group leader, but I would feel more comfortable if someone
from the community joined me in this role. And obviously, you are most
welcome to join the group!
> Will Taxon be a BioChemEntity? I am asking because in UniProt we have
> proteins linked to what is defined as an "unknown" taxon in the NCBI
> taxonomy/UniProt taxonomy. I guess, even if we have this "unknown" case,
> we could still use BioChemEntity and suppose any "unknown" will
> eventually be resolved to an actual entity. Happy to chat about it.
I agree, the broad definition of BioChemEntity makes it appropriate as
the root of Taxon. So far, I think of Taxon as a profile more than a
type of its own. I'll read the wiki and start drafting something. I'll let
you know if (most probably when) I have any questions. ;)
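
For illustration only, here is a minimal sketch of what a taxon described
as a BioChemEntity might look like in JSON-LD, using Darwin Core as a
third-party vocabulary for properties the type does not cover. The helper
name build_taxon_markup, the choice of properties, and the use of
BioChemEntity under the schema.org vocabulary are all assumptions, not an
agreed profile; the page is the INPN common dolphin page mentioned further
down this thread.

```python
import json


def build_taxon_markup(page_url, scientific_name, vernacular_name, rank):
    """Return a JSON-LD dict describing a taxon as a BioChemEntity.

    Hypothetical helper: the property selection is an assumption made
    for illustration, not a Bioschemas specification.
    """
    return {
        "@context": {
            # Un-prefixed terms are resolved against schema.org.
            "@vocab": "http://schema.org/",
            # Darwin Core terms namespace.
            "dwc": "http://rs.tdwg.org/dwc/terms/",
        },
        "@type": "BioChemEntity",
        "@id": page_url,
        "name": scientific_name,
        "alternateName": vernacular_name,
        # Namespaced Darwin Core properties instead of bare, undefined keys.
        "dwc:scientificName": scientific_name,
        "dwc:vernacularName": vernacular_name,
        "dwc:taxonRank": rank,
    }


markup = build_taxon_markup(
    "https://inpn.mnhn.fr/espece/cd_nom/60878",
    "Delphinus delphis",
    "common dolphin",
    "species",
)
print(json.dumps(markup, indent=2))
```

The printed document could be embedded in a script element of type
application/ld+json on the corresponding web page.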
Regards,
Franck.
On 11/06/2018 at 15:46, LJ Garcia Castro wrote:
>
> Hello Franck,
>
> The taxon profile has been mentioned before as one we need, but there
> was no group for it. Wonderful that you are starting one now! Please ask
> whenever you have a doubt about the process or the different
> approaches (third-party vocabs or additionalProperty) to deal with
> properties not covered by BioChemEntity.
>
> By the way, will Taxon be a BioChemEntity? I am asking because in
> UniProt we have proteins linked to what is defined as an "unknown" taxon
> in the NCBI taxonomy/UniProt taxonomy. I guess, even if we have this
> "unknown" case, we could still use BioChemEntity and suppose any
> "unknown" will eventually be resolved to an actual entity. Happy to chat
> about it.
>
> Regards,
>
>
>
> On 11/06/2018 14:39, Ricardo Arcila wrote:
>> Hello Franck,
>>
>> It is a good idea to start by creating the group. You can do it by
>> creating a pull request on the bioschemas groups repository
>> <https://github.com/BioSchemas/bioschemas.github.io/tree/master/_groups>.
>> Then you can add yourself on the people repository
>> <https://github.com/BioSchemas/bioschemas.github.io/tree/master/_people>.
>> I will be happy to help you in this process and if you'd like I could
>> be part of the group as well.
>>
>> In order to start a draft specification for Taxon you should create a
>> folder with the profile name on the specifications drive folder
>> <https://drive.google.com/drive/folders/0Bw_p-HKWUjHoNThZOWNKbGhOODg?usp=sharing>.
>> This process is detailed on the Bioschemas GitHub wiki
>> <https://github.com/BioSchemas/specifications/wiki/Bioschemas-Specification-Process>.
>>
>> Please let me know if you have any questions or doubts about the
>> process, I will be most happy to help.
>>
>>
>> Best regards,
>> Ricardo Arcila
>>
>>
>> On Thu, Jun 7, 2018 at 9:54 AM Franck Michel <franck.michel@cnrs.fr
>> <mailto:franck.michel@cnrs.fr>> wrote:
>>
>> Hi all,
>>
>> I'm catching up with the discussions on the list, and I'm happy
>> to see that things are moving on with the submission of new types
>> to schema.org <http://schema.org>.
>>
>> At the same time, I realize that we did not really make progress on
>> the biodiversity topic. As I will present a poster about
>> Bioschemas.org at the Biodiversity Information Standards (TDWG)
>> conference in August, it would be good to initiate the work on
>> this by that date. How do we proceed? I suggested the creation of
>> a Taxon profile, but we may have to start with the creation of a
>> group?
>> Could you please guide me/us in this process?
>>
>> Thx,
>> Franck.
>>
>> On 23/01/2018 at 11:09, Leyla Garcia wrote:
>>> Hello Bioschemas governance team,
>>>
>>> What do you think about going ahead with the Biodiversity
>>> schemas? Do we have the go-ahead?
>>>
>>> @Franck, I am not really aware of those organizations but I am
>>> happy to guide you through the work we have done for Bioschemas
>>> so far. I worked a bit on a biodiversity project but that was
>>> some years ago. Still, I like the subject!
>>>
>>> Let's wait to see what Carole, Rafael and Alasdair suggest.
>>>
>>> Regards,
>>>
>>> On 23/01/2018 08:47, Franck Michel wrote:
>>>> Dear Leyla and all,
>>>>
>>>> I understand that your response stands for a GO. Right?
>>>>
>>>> I've not been involved yet in the specification of the
>>>> Bioschemas.org profiles. So indeed, I will need help and
>>>> guidance as to how things work: the tools, the process,
>>>> the expected outcomes, etc.
>>>>
>>>> As I proposed, we could start by contacting people who would
>>>> potentially be interested in taking part in this. I'm
>>>> thinking about Encyclopedia of Life, Catalogue of Life, GBIF.
>>>> If you already know contacts in these organizations, that would
>>>> certainly be helpful.
>>>>
>>>> Franck.
>>>>
>>>> On 22/01/2018 at 11:37, Leyla Garcia wrote:
>>>>> Hi Franck,
>>>>>
>>>>> Great news!
>>>>>
>>>>> Do you need any help/guides for the start-up?
>>>>>
>>>>> Cheers,
>>>>>
>>>>>
>>>>> On 17/01/2018 15:24, Franck Michel wrote:
>>>>>> Dear all,
>>>>>>
>>>>>> I'm following up on this suggestion about creating a
>>>>>> biodiversity-related group in Bioschemas.org.
>>>>>>
>>>>>> The proposition received four +1's. I'm not sure if there is
>>>>>> a "minimum score" to attest to sufficient consensus.
>>>>>>
>>>>>> As we discussed, if we go for the creation of this group, it
>>>>>> would be beneficial to involve at least EoL folks, and possibly
>>>>>> other people from the biodiversity community. I can try to
>>>>>> initiate this, but first I would like to have an official GO
>>>>>> from our community.
>>>>>>
>>>>>> Let me know how this usually works, and what you think about
>>>>>> this.
>>>>>>
>>>>>> Regards,
>>>>>> Franck.
>>>>>>
>>>>>> On 17/11/2017 at 16:40, Franck Michel wrote:
>>>>>>> Hi Mélanie, hi all,
>>>>>>>
>>>>>>> To go a bit further, I've tried to extend the
>>>>>>> example I started. Here it is:
>>>>>>> https://github.com/frmichel/taxref-ld/tree/master/bioschemas-org
>>>>>>> The README gives details as to how the example file is
>>>>>>> organized, and more importantly it lists some of the issues
>>>>>>> and questions that we will have to tackle if we officially
>>>>>>> start the group.
>>>>>>>
>>>>>>> @Alasdair, Carole, Rafael: as discussed in the thread, at
>>>>>>> some point it will be beneficial to invite people from
>>>>>>> EoL and TDWG. Is there some sort of "official" channel for
>>>>>>> the community to do that?
>>>>>>>
>>>>>>> Have a nice week-end,
>>>>>>> Franck.
>>>>>>>
>>>>>>> On 17/11/2017 at 10:19, Melanie Courtot wrote:
>>>>>>>> Hi Franck, all,
>>>>>>>>
>>>>>>>> On 16/11/2017 09:37, Franck Michel wrote:
>>>>>>>>> Hi Melanie, hi all,
>>>>>>>>>
>>>>>>>>> EoL provides an API that returns species descriptions as
>>>>>>>>> JSON-LD based on schema.org <http://schema.org>. Beluga
>>>>>>>>> example: http://eol.org/api/traits/328541
>>>>>>>>> It is unclear who consumes this data, but at least, as you
>>>>>>>>> already saw, they embed it at the end of their own web
>>>>>>>>> pages such as http://eol.org/pages/328541/data.
>>>>>>>> BioSamples does the same - an API to retrieve JSON and we
>>>>>>>> embed it in our webpages for crawlers as well.
>>>>>>>>>
>>>>>>>>> As you also noticed, the JSON-LD they provide is not
>>>>>>>>> valid. I didn't know about that EOL GitHub issue, but I
>>>>>>>>> recently discussed it with Rod Page from the Biodiversity
>>>>>>>>> Information Standards (aka TDWG), who replied on the
>>>>>>>>> GitHub issue. The Google structured data testing tool
>>>>>>>>> gives more details on that: https://frama.link/xJm0AAto
>>>>>>>>> Besides, other errors are not reported (well, I think
>>>>>>>>> these are errors): the property scientificName without any
>>>>>>>>> namespace is invalid; it should be dwc:scientificName
>>>>>>>>> since this does not exist in schema.org
>>>>>>>>> <http://schema.org>. Same issue for vernacularName,
>>>>>>>>> traits, units...
>>>>>>>>>
>>>>>>>>> Anyway, this JSON-LD has lots of issues, but it's a
>>>>>>>>> start.
>>>>>>>>
>>>>>>>> Yes. Only mentioned the tweaks in case someone wanted to
>>>>>>>> give it a try as well.
>>>>>>>>
>>>>>>>>> The assumption is that there is some sort of specific
>>>>>>>>> (one-to-one) agreement between EoL and Google, and that
>>>>>>>>> Google harvests this data despite the invalid JSON-LD. But
>>>>>>>>> I have no confirmation of that.
>>>>>>>>
>>>>>>>> It'd be interesting to clarify this. It seems a little bit
>>>>>>>> counterintuitive that EoL would mark their pages up with
>>>>>>>> JSON for Google to read, but then Google couldn't do so
>>>>>>>> without a special adapter? We're probably missing a piece
>>>>>>>> of the story.
>>>>>>>>>
>>>>>>>>> > - the measurement type points to
>>>>>>>>> > http://purl.obolibrary.org/obo/VT_0001256, which is body
>>>>>>>>> > length. The schema.org/predicate
>>>>>>>>> > <http://schema.org/predicate> value is also "body length
>>>>>>>>> > (VT)". How is this understood and displayed as Length on
>>>>>>>>> > the Google result?
>>>>>>>>> > - Similar question for the actual value and units, which
>>>>>>>>> > are "4249.83" and "mm" respectively. Is Google doing some
>>>>>>>>> > sort of unit conversion/roundup for display?
>>>>>>>>>
>>>>>>>>> Good question. Take the unit "mm", for instance:
>>>>>>>>> - "units": "mm" => there is no such thing as
>>>>>>>>> http://schema.org/units
>>>>>>>>> - "dwc:measurementUnit":
>>>>>>>>> "http://purl.obolibrary.org/obo/UO_0000016"
>>>>>>>>> <http://purl.obolibrary.org/obo/UO_0000016> => this seems
>>>>>>>>> to be the only reliable property, but that would mean Google
>>>>>>>>> knows the Darwin Core vocabulary and interprets it.
>>>>>>>>> My assumption is that Google performs some processing on
>>>>>>>>> the values. Possibly, they developed a specific connector
>>>>>>>>> to cope with EoL JSON-LD and translate this body size to
>>>>>>>>> "4.2 m".
>>>>>>>>> Besides, the snippet mentions "4.2 m *(Adult)*", so they
>>>>>>>>> also presumably consider this property:
>>>>>>>>> "eol:traitUri": "http://eol.org/resources/704/measurements/adultheadbodylen27"
>>>>>>>>> <http://eol.org/resources/704/measurements/adultheadbodylen27>
>>>>>>>>> to know that this is the size of an adult.
>>>>>>>>>
>>>>>>>>> With proper Bioschemas.org profiles, I think we could
>>>>>>>>> annotate pages from many other institutions, such as the
>>>>>>>>> Beluga page
>>>>>>>>> <https://inpn.mnhn.fr/espece/cd_nom/60932?lg%3Den> on the
>>>>>>>>> website of the French National Museum of Natural History,
>>>>>>>>> and in turn enable search engines to harvest data from
>>>>>>>>> complementary pages, produce mashups of related pages, etc.
>>>>>>>> That sounds like a great idea and entirely within the scope
>>>>>>>> of Bioschemas.
>>>>>>>>>
>>>>>>>>> At this point, I think we should involve people from EoL,
>>>>>>>>> and from the TDWG community (Rod Page would certainly be
>>>>>>>>> of great added value in this respect). What do you think?
>>>>>>>>> Is there a procedure for inviting people "officially"?
>>>>>>>> I think we could benefit from their experience indeed; it
>>>>>>>> seems they were able to deploy markup, add additional
>>>>>>>> properties and then get this to be interpreted by Google
>>>>>>>> which seems to match our use case pretty well!
>>>>>>>> I +1'd the issue at
>>>>>>>> https://github.com/BioSchemas/specifications/issues/115
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Melanie
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Franck.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 15/11/2017 at 17:57, Melanie Courtot wrote:
>>>>>>>>>> Hi Franck,
>>>>>>>>>>
>>>>>>>>>> This looks really interesting, thanks for bringing it up.
>>>>>>>>>> I was trying to find out how the interaction between EoL
>>>>>>>>>> and schema.org <http://schema.org> was working and am
>>>>>>>>>> wondering if you (or someone else!) could shed some light
>>>>>>>>>> on this?
>>>>>>>>>>
>>>>>>>>>> As you suggested below, I checked the Google
>>>>>>>>>> beluga
>>>>>>>>>> <https://www.google.fr/search?dcr=0&ei=ml74WajPMMzWUabjqvAF&q=beluga&oq=beluga&gs_l=psy-ab.3...19519.20929.0.20945.6.3.0.0.0.0.93.93.1.1.0....0...1.1.64.psy-ab..5.1.92...0j0i131k1.0.AGNziTItYzc>
>>>>>>>>>> search result and do see the line "Length: 4.2 m (Adult)
>>>>>>>>>> Encyclopedia of Life"
>>>>>>>>>>
>>>>>>>>>> If I try to find where that info comes from, and head to
>>>>>>>>>> EoL, I can reach the page
>>>>>>>>>> http://eol.org/pages/328541/overview, and follow the "see
>>>>>>>>>> all traits" link to http://eol.org/pages/328541/data
>>>>>>>>>> which contains the JSON-LD.
>>>>>>>>>>
>>>>>>>>>> I trimmed it down to extract the relevant bit, updated
>>>>>>>>>> the id to be a string as per
>>>>>>>>>> https://github.com/EOL/tramea/issues/352, and pasted it
>>>>>>>>>> in the JSON playground mostly to make sure it was working
>>>>>>>>>> as expected: http://tinyurl.com/yadam6nj
>>>>>>>>>>
>>>>>>>>>> I am missing the link of how the following happens:
>>>>>>>>>> - the measurement type points to
>>>>>>>>>> http://purl.obolibrary.org/obo/VT_0001256, which is body
>>>>>>>>>> length. The schema.org/predicate
>>>>>>>>>> <http://schema.org/predicate> value is also "body length
>>>>>>>>>> (VT)". How is this understood and displayed as Length on
>>>>>>>>>> the Google result?
>>>>>>>>>> - Similar question for the actual value and units, which
>>>>>>>>>> are "4249.83" and "mm" respectively. Is Google doing some
>>>>>>>>>> sort of unit conversion/roundup for display?
>>>>>>>>>> - Trophic level on EoL is "carnivore", but Google
>>>>>>>>>> displays "Carnivorous"
>>>>>>>>>> etc
>>>>>>>>>>
>>>>>>>>>> Or am I looking at the wrong source for the markup?
>>>>>>>>>>
>>>>>>>>>> Cheers,
>>>>>>>>>> Melanie
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 10/11/2017 15:17, Franck Michel wrote:
>>>>>>>>>>> Dear all,
>>>>>>>>>>>
>>>>>>>>>>> I've just joined the Bioschemas.org community following
>>>>>>>>>>> some discussions I had with Alasdair Gray whom I met at
>>>>>>>>>>> ISWC in Vienna, and I'd like to start a new discussion
>>>>>>>>>>> thread.
>>>>>>>>>>>
>>>>>>>>>>> So, just to start, a few words about me. I'm a CNRS
>>>>>>>>>>> research engineer, I work at the I3S laboratory in
>>>>>>>>>>> France, in particular with the Wimmics research team led
>>>>>>>>>>> by Fabien Gandon. I'm currently involved in some
>>>>>>>>>>> activities related to the publication of taxonomic
>>>>>>>>>>> information as Linked Data [1]. In this context, I've
>>>>>>>>>>> met the Biodiversity Information Standards community
>>>>>>>>>>> (TDWG) that is increasingly considering SW standards, LD
>>>>>>>>>>> publication and web page markup. This is a domain
>>>>>>>>>>> where, I think, it would be relevant for
>>>>>>>>>>> Bioschemas.org to get involved.
>>>>>>>>>>>
>>>>>>>>>>> There are lots of web portals reporting observations,
>>>>>>>>>>> traits and other data about all sorts of living
>>>>>>>>>>> organisms. Encyclopedia of Life <http://eol.org/> (EoL)
>>>>>>>>>>> and the Global Biodiversity Information Facility
>>>>>>>>>>> <https://www.gbif.org/> (GBIF) are some of the most well
>>>>>>>>>>> known. Markup questions are actively considered in this
>>>>>>>>>>> field. For instance, EoL web pages embed
>>>>>>>>>>> schema.org-based JSON-LD descriptions that Google
>>>>>>>>>>> leverages to enrich their snippets: e.g. if you google
>>>>>>>>>>> beluga
>>>>>>>>>>> <https://www.google.fr/search?dcr=0&ei=ml74WajPMMzWUabjqvAF&q=beluga&oq=beluga&gs_l=psy-ab.3...19519.20929.0.20945.6.3.0.0.0.0.93.93.1.1.0....0...1.1.64.psy-ab..5.1.92...0j0i131k1.0.AGNziTItYzc>
>>>>>>>>>>> you will see 'Encyclopedia of Life' mentions in the
>>>>>>>>>>> snippet providing average weight and size data. For now,
>>>>>>>>>>> this seems to be an "individual" initiative between EoL
>>>>>>>>>>> and Google/schema.org <http://schema.org>, but it
>>>>>>>>>>> would make sense if this was part of a broader
>>>>>>>>>>> reflection led by Bioschemas.org.
>>>>>>>>>>>
>>>>>>>>>>> My opinion is that fostering the use of common markup by
>>>>>>>>>>> these portals could be very effective in helping the
>>>>>>>>>>> biodiversity community to discover information and
>>>>>>>>>>> figure out new data integration scenarios. Within
>>>>>>>>>>> Bioschemas.org, we could define profiles to account for
>>>>>>>>>>> biodiversity-related information. Taxonomic registers are
>>>>>>>>>>> used as the backbone of many web portals, apps and
>>>>>>>>>>> databases related to biodiversity, agronomy and
>>>>>>>>>>> agriculture. For instance, EoL and GBIF both rely on the
>>>>>>>>>>> Catalogue of Life <http://www.catalogueoflife.org/>
>>>>>>>>>>> taxonomy. Therefore, we could start with the definition
>>>>>>>>>>> of a profile to describe a taxon and its related
>>>>>>>>>>> scientific and vernacular names. Then, this
>>>>>>>>>>> could be extended with the representation of traits
>>>>>>>>>>> (characteristics of biological organisms), observations,
>>>>>>>>>>> occurrence data, conservation status (e.g. endangered)
>>>>>>>>>>> etc. Vocabularies already exist for such data, such
>>>>>>>>>>> as the widely adopted Darwin Core terms.
>>>>>>>>>>>
>>>>>>>>>>> As a quick example, consider the web page describing the
>>>>>>>>>>> common dolphin on the website of the French Museum of
>>>>>>>>>>> Natural History:
>>>>>>>>>>> https://inpn.mnhn.fr/espece/cd_nom/60878?lg=en. This
>>>>>>>>>>> page could come with a JSON-LD description looking like
>>>>>>>>>>> this:
>>>>>>>>>>> https://github.com/frmichel/taxref-ld/blob/master/bioschemas-org-example.json
>>>>>>>>>>> This example is naive and very succinct, and there are
>>>>>>>>>>> lots of things to discuss and decide. Besides, I only
>>>>>>>>>>> registered on the mailing list yesterday, so it may not fit
>>>>>>>>>>> with good practices that you have already agreed
>>>>>>>>>>> upon. Sorry if this is the case. Nevertheless, my point
>>>>>>>>>>> is basically to bootstrap the discussion and see if the
>>>>>>>>>>> community is willing to endorse this initiative. If this
>>>>>>>>>>> is the case, we should probably involve people from the
>>>>>>>>>>> biodiversity community: Darwin Core experts, EoL/GBIF
>>>>>>>>>>> representatives etc. But that will come in time.
>>>>>>>>>>>
>>>>>>>>>>> I look forward to further discussions.
>>>>>>>>>>> Regards,
>>>>>>>>>>> Franck.
>>>>>>>>>>>
>>>>>>>>>>> [1] Michel F., Gargominy O., Tercerie S. & Faron-Zucker
>>>>>>>>>>> C. (2017). A Model to Represent Nomenclatural and
>>>>>>>>>>> Taxonomic Information as Linked Data. Application to the
>>>>>>>>>>> French Taxonomic Register, TAXREF. In Proceedings of the
>>>>>>>>>>> 2nd International Workshop on Semantics for Biodiversity
>>>>>>>>>>> (S4BioDiv) co-located with ISWC 2017 vol. 1933. Vienna,
>>>>>>>>>>> Austria. CEUR.
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>>
>>>>>>>>>>> Franck MICHEL
>>>>>>>>>>> CNRS research engineer
>>>>>>>>>>> +33 (0)492 96 5004
>>>>>>>>>>> franck.michel@cnrs.fr <mailto:franck.michel@cnrs.fr>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Université Côte d’Azur, CNRS, *Inria* - I3S - UMR 7271
>>>>>>>>>>> 930 route des Colles - Bât. Les Templiers
>>>>>>>>>>> BP 145 - 06903 Sophia Antipolis CEDEX - France
>>>>>>>>>>> Tel. +33 (0)4 9294 2680
>>>>>>>>>>> <tel:+33%204%2092%2094%2026%2080>, Fax : +33 (0)4 9294 2898
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>> --
>>
>> Franck MICHEL
>> CNRS research engineer
>> +33 (0)4 8915 4277
>> franck.michel@cnrs.fr <mailto:franck.michel@cnrs.fr>
>>
>>
>>
>> Université Côte d’Azur, CNRS- I3S - UMR 7271
>> 930 route des Colles - Bât. Les Templiers
>> BP 145 - 06903 Sophia Antipolis CEDEX - France
>> Tel. +33 (0)4 9294 2680 <tel:+33%204%2092%2094%2026%2080>
>>
>
Received on Tuesday, 12 June 2018 09:02:35 UTC