- From: Franck Michel <franck.michel@cnrs.fr>
- Date: Tue, 12 Jun 2018 11:02:03 +0200
- To: LJ Garcia Castro <ljgarcia@ebi.ac.uk>, Ricardo Arcila <ricartomojo@gmail.com>
- Cc: public-bioschemas@w3.org, "Gray, Alasdair J G" <A.J.G.Gray@hw.ac.uk>, "Rafael C. Jimenez" <rafael.jimenez@elixir-europe.org>, "Carole Goble (carole.goble@manchester.ac.uk)" <carole.goble@manchester.ac.uk>
- Message-ID: <7847bbfe-4b8b-50bc-c5e2-6fa8a0dcc78d@cnrs.fr>
Dear Ricardo and Leyla,

I just made a pull request, and I created a Biodiversity specification folder on Google Drive. Let me know if anything is not right.

I've set myself as the group leader, but I would feel more comfortable if someone from the community would join me in this role. And obviously, you are most welcome to join the group!

> will Taxon be a BioChemEntity? I am asking because in UniProt we have proteins linked to what is defined as an "unknown" taxon in NCBI taxonomy/UniProt taxonomy. I guess, even if we have this "unknown" case, we could still use BioChemEntity and suppose any "unknown" will eventually be resolved to an actual entity. Happy to chat about it.

I agree, the broad definition of BioChemEntity makes it appropriate as the root of Taxon. So far, I think of Taxon as a profile more than a type of its own.

I'll read the wiki and start drafting something. I'll let you know if (most probably when) I have any questions. ;)

Regards,
Franck.

On 11/06/2018 at 15:46, LJ Garcia Castro wrote:
>
> Hello Franck,
>
> The taxon profile has been mentioned as one we need before but there
> was no group for it. Wonderful you are starting one now! Please ask
> whenever you have a doubt about the process or the different
> approaches (third-party vocabs or additionalProperty) to deal with
> properties not covered by BioChemEntity.
>
> By the way, will Taxon be a BioChemEntity? I am asking because in
> UniProt we have proteins linked to what is defined as an "unknown" taxon
> in NCBI taxonomy/UniProt taxonomy. I guess, even if we have this
> "unknown" case, we could still use BioChemEntity and suppose any
> "unknown" will eventually be resolved to an actual entity. Happy to chat
> about it.
>
> Regards,
>
>
> On 11/06/2018 14:39, Ricardo Arcila wrote:
>> Hello Franck,
>>
>> It is a good idea to start by creating the group.
>> You can do it by creating a pull request on the bioschemas groups repository
>> <https://github.com/BioSchemas/bioschemas.github.io/tree/master/_groups>.
>> Then you can add yourself on the people repository
>> <https://github.com/BioSchemas/bioschemas.github.io/tree/master/_people>.
>> I will be happy to help you in this process and if you'd like I could
>> be part of the group as well.
>>
>> In order to start a draft specification for Taxon you should create a
>> folder with the profile name on the specifications drive folder
>> <https://drive.google.com/drive/folders/0Bw_p-HKWUjHoNThZOWNKbGhOODg?usp=sharing>.
>> This process is detailed on the bioschemas github wiki
>> <https://github.com/BioSchemas/specifications/wiki/Bioschemas-Specification-Process>.
>>
>> Please let me know if you have any questions or doubts about the
>> process, I will be most happy to help.
>>
>>
>> Best regards,
>> Ricardo Arcila
>>
>>
>> On Thu, Jun 7, 2018 at 9:54 AM Franck Michel <franck.michel@cnrs.fr
>> <mailto:franck.michel@cnrs.fr>> wrote:
>>
>> Hi all,
>>
>> I'm catching up with the discussions on the list, and I'm happy
>> to see that things are moving on with the submission of new types
>> to schema.org <http://schema.org>.
>>
>> At the same time, I realize that we did not really go ahead about
>> the biodiversity topic. As I will present a poster about
>> Bioschemas.org at the Biodiversity Information Standard in
>> August, that would maybe be a good thing to initiate the work on
>> this by this date. How do we go on? I suggested the creation of
>> a Taxon profile, but we may have to start with the creation of a
>> group?
>> Could you please guide me/us in this process?
>>
>> Thx,
>> Franck.
>>
>> On 23/01/2018 at 11:09, Leyla Garcia wrote:
>>> Hello Bioschemas governance team,
>>>
>>> What do you think about going ahead with the Biodiversity
>>> schemas? Do we have a heads up?
>>>
>>> @Franck, I am not really aware of those organizations but I am
>>> happy to guide you through the work we have done for Bioschemas
>>> so far. I worked a bit on a biodiversity project but that was
>>> some years ago. Still, I like the subject!
>>>
>>> Let's wait to see what Carole, Rafael and Alasdair suggest.
>>>
>>> Regards,
>>>
>>> On 23/01/2018 08:47, Franck Michel wrote:
>>>> Dear Leyla and all,
>>>>
>>>> I understand that your response stands for a GO. Right?
>>>>
>>>> I've not been involved yet in the specification of the
>>>> Bioschemas.org profiles. So indeed, I shall need help and
>>>> guidance as to how things are going on, the tools, the process,
>>>> the expected outcomes, etc.
>>>>
>>>> As I proposed, we could start with contacting people that would
>>>> potentially be interested in taking part in this. I'm
>>>> thinking about Encyclopedia of Life, Catalogue of Life, GBIF.
>>>> If you already know contacts in these organizations, that would
>>>> certainly be helpful.
>>>>
>>>> Franck.
>>>>
>>>> On 22/01/2018 at 11:37, Leyla Garcia wrote:
>>>>> Hi Franck,
>>>>>
>>>>> Great news!
>>>>>
>>>>> Do you need any help/guides for the start-up?
>>>>>
>>>>> Cheers,
>>>>>
>>>>>
>>>>> On 17/01/2018 15:24, Franck Michel wrote:
>>>>>> Dear all,
>>>>>>
>>>>>> I'm following up on this suggestion about creating a
>>>>>> biodiversity-related group in Bioschemas.org.
>>>>>>
>>>>>> The proposition received four +1's. I'm not sure if there is
>>>>>> a "minimum score" to attest to sufficient consensus.
>>>>>>
>>>>>> As we discussed, if we go for the creation of this group, it
>>>>>> would be beneficial to involve at least EoL folks, possibly
>>>>>> other people from the biodiversity community. I can try to
>>>>>> initiate this, yet before I would like to have an official GO
>>>>>> from our community.
>>>>>>
>>>>>> Let me know how this usually works, and what you think about
>>>>>> this.
>>>>>>
>>>>>> Regards,
>>>>>> Franck.
>>>>>>
>>>>>> On 17/11/2017 at 16:40, Franck Michel wrote:
>>>>>>> Hi Mélanie, hi all,
>>>>>>>
>>>>>>> To go a bit further I've tried to somewhat extend the
>>>>>>> example I've initiated. There it is:
>>>>>>> https://github.com/frmichel/taxref-ld/tree/master/bioschemas-org
>>>>>>> The README gives details as to how the example file is
>>>>>>> organized, and more importantly it lists some of the issues
>>>>>>> and questions that we shall have to tackle if we officially
>>>>>>> start the group.
>>>>>>>
>>>>>>> @Alasdair, Carole, Rafael: as discussed in the thread, at
>>>>>>> some point it shall be beneficial to invite people from
>>>>>>> EoL and TDWG. Is there some sort of "official" channel for
>>>>>>> the community to do that?
>>>>>>>
>>>>>>> Have a nice week-end,
>>>>>>> Franck.
>>>>>>>
>>>>>>> On 17/11/2017 at 10:19, Melanie Courtot wrote:
>>>>>>>> Hi Franck, all,
>>>>>>>>
>>>>>>>> On 16/11/2017 09:37, Franck Michel wrote:
>>>>>>>>> Hi Melanie, hi all,
>>>>>>>>>
>>>>>>>>> EoL provides an API that returns species descriptions as
>>>>>>>>> JSON-LD based on schemas.org <http://schemas.org>. Beluga
>>>>>>>>> example: http://eol.org/api/traits/328541
>>>>>>>>> It is unclear who consumes this data, but at least, as you
>>>>>>>>> already saw, they embed it at the end of their own web
>>>>>>>>> pages such as http://eol.org/pages/328541/data.
>>>>>>>> BioSamples does the same - an API to retrieve JSON and we
>>>>>>>> embed it in our webpages for crawlers as well.
>>>>>>>>>
>>>>>>>>> As you also noticed, the JSON-LD they provide is not
>>>>>>>>> valid. I didn't know about that EOL Github issue, but I
>>>>>>>>> recently discussed it with Rod Page from the Biodiversity
>>>>>>>>> Information Standards (aka TDWG), who replied on the
>>>>>>>>> Github issue.
>>>>>>>>> The Google structured data testing tool
>>>>>>>>> gives more details on that: https://frama.link/xJm0AAto
>>>>>>>>> Besides, other errors are not reported (well, I think
>>>>>>>>> these are errors): property scientificName without any
>>>>>>>>> namespace is invalid, that should be dwc:scientificName
>>>>>>>>> since this does not exist in schema.org
>>>>>>>>> <http://schema.org>. Same issue for vernacularName,
>>>>>>>>> traits, units...
>>>>>>>>>
>>>>>>>>> Anyway, this JSON-LD has lots of issues, but it's a
>>>>>>>>> start.
>>>>>>>>
>>>>>>>> Yes. Only mentioned the tweaks in case someone wanted to
>>>>>>>> give it a try as well.
>>>>>>>>
>>>>>>>>> The assumption is that there is some sort of specific
>>>>>>>>> (one-to-one) agreement between EoL and Google, and that
>>>>>>>>> Google harvests this data despite the invalid JSON-LD. But
>>>>>>>>> I have no confirmation of that
>>>>>>>>
>>>>>>>> It'd be interesting to clarify this. It seems a little bit
>>>>>>>> counterintuitive that EoL would mark their pages up with
>>>>>>>> JSON for Google to read it but then Google couldn't do so
>>>>>>>> without a special adapter? We're probably missing a piece
>>>>>>>> of the story.
>>>>>>>>>
>>>>>>>>> > - the measurement type points to
>>>>>>>>> http://purl.obolibrary.org/obo/VT_0001256, which is body
>>>>>>>>> length. The schema.org/predicate
>>>>>>>>> <http://schema.org/predicate> value is also "body length
>>>>>>>>> (VT)". How is this understood and displayed as Length on
>>>>>>>>> the Google result?
>>>>>>>>> - Similar question for the actual value and units, which
>>>>>>>>> are "4249.83" and "mm" respectively. Is Google doing some
>>>>>>>>> sort of unit conversion/roundup for display?
>>>>>>>>>
>>>>>>>>> Good question.
>>>>>>>>> Typically, about the unit "mm":
>>>>>>>>> - "units": "mm" => there is no such thing as
>>>>>>>>> http://schema.org/units
>>>>>>>>> - "dwc:measurementUnit":
>>>>>>>>> "http://purl.obolibrary.org/obo/UO_0000016"
>>>>>>>>> <http://purl.obolibrary.org/obo/UO_0000016> => this seems
>>>>>>>>> to be the only reliable property, but that would mean Google
>>>>>>>>> knows the Darwin Core vocabulary and interprets it.
>>>>>>>>> My assumption is that Google performs some treatment on
>>>>>>>>> the values. Possibly, they developed a specific connector
>>>>>>>>> to cope with EoL JSON-LD and translate this body size to
>>>>>>>>> "4.2 m".
>>>>>>>>> Besides, the snippet mentions "4.2 m *(Adult)*", so they
>>>>>>>>> also presumably consider this property:
>>>>>>>>> eol:traitUri "http://eol.org/resources/704/measurements/adultheadbodylen27"
>>>>>>>>> <http://eol.org/resources/704/measurements/adultheadbodylen27>
>>>>>>>>> to know that this is the size of an adult.
>>>>>>>>>
>>>>>>>>> With proper Bioschemas.org profiles, I think we could
>>>>>>>>> annotate pages from many other institutions, such as the
>>>>>>>>> Beluga page
>>>>>>>>> <https://inpn.mnhn.fr/espece/cd_nom/60932?lg%3Den> on the
>>>>>>>>> French National Museum of Natural History, and in turn,
>>>>>>>>> enable search engines to harvest data from complementary
>>>>>>>>> pages and produce mashups of related pages, etc.
>>>>>>>> That sounds like a great idea and entirely within the scope
>>>>>>>> of Bioschemas.
>>>>>>>>>
>>>>>>>>> At this point, I think we should involve people from EoL,
>>>>>>>>> and from the TDWG community (Rod Page would certainly be
>>>>>>>>> of great added value in this respect). What do you think?
>>>>>>>>> Is there a procedure for inviting people "officially"?
>>>>>>>> I think we could benefit from their experience indeed; it
>>>>>>>> seems they were able to deploy markup, add additional
>>>>>>>> properties and then get this to be interpreted by Google
>>>>>>>> which seems to match our use case pretty well!
>>>>>>>> I +1'd the issue at
>>>>>>>> https://github.com/BioSchemas/specifications/issues/115
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Melanie
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Franck.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 15/11/2017 at 17:57, Melanie Courtot wrote:
>>>>>>>>>> Hi Franck,
>>>>>>>>>>
>>>>>>>>>> This looks really interesting, thanks for bringing it up.
>>>>>>>>>> I was trying to find out how the interaction between EoL
>>>>>>>>>> and schema.org <http://schema.org> was working and am
>>>>>>>>>> wondering if you (or someone else!) could shed some light
>>>>>>>>>> on this?
>>>>>>>>>>
>>>>>>>>>> As you suggested below, I checked the google
>>>>>>>>>> beluga
>>>>>>>>>> <https://www.google.fr/search?dcr=0&ei=ml74WajPMMzWUabjqvAF&q=beluga&oq=beluga&gs_l=psy-ab.3...19519.20929.0.20945.6.3.0.0.0.0.93.93.1.1.0....0...1.1.64.psy-ab..5.1.92...0j0i131k1.0.AGNziTItYzc>
>>>>>>>>>> search result and do see the line "Length: 4.2 m (Adult)
>>>>>>>>>> Encyclopedia of Life"
>>>>>>>>>>
>>>>>>>>>> If I try to find where that info comes from, and head to
>>>>>>>>>> EoL, I can reach the page
>>>>>>>>>> http://eol.org/pages/328541/overview, and follow the "see
>>>>>>>>>> all traits" link to http://eol.org/pages/328541/data
>>>>>>>>>> which contains the JSON-LD.
>>>>>>>>>>
>>>>>>>>>> I trimmed it down to extract the relevant bit, updated
>>>>>>>>>> the id to be a string as per
>>>>>>>>>> https://github.com/EOL/tramea/issues/352, and pasted it
>>>>>>>>>> in the JSON playground mostly to make sure it was working
>>>>>>>>>> as expected: http://tinyurl.com/yadam6nj
>>>>>>>>>>
>>>>>>>>>> I am missing the link of how the following happens:
>>>>>>>>>> - the measurement type points to
>>>>>>>>>> http://purl.obolibrary.org/obo/VT_0001256, which is body
>>>>>>>>>> length. The schema.org/predicate
>>>>>>>>>> <http://schema.org/predicate> value is also "body length
>>>>>>>>>> (VT)". How is this understood and displayed as Length on
>>>>>>>>>> the Google result?
>>>>>>>>>> - Similar question for the actual value and units, which
>>>>>>>>>> are "4249.83" and "mm" respectively. Is Google doing some
>>>>>>>>>> sort of unit conversion/roundup for display?
>>>>>>>>>> - Trophic level on EoL is "carnivore", but Google
>>>>>>>>>> displays "Carnivorous"
>>>>>>>>>> etc
>>>>>>>>>>
>>>>>>>>>> Or am I looking at the wrong source for the markup?
>>>>>>>>>>
>>>>>>>>>> Cheers,
>>>>>>>>>> Melanie
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 10/11/2017 15:17, Franck Michel wrote:
>>>>>>>>>>> Dear all,
>>>>>>>>>>>
>>>>>>>>>>> I've just joined the Bioschemas.org community following
>>>>>>>>>>> some discussions I had with Alasdair Gray whom I met at
>>>>>>>>>>> ISWC in Vienna, and I'd like to start a new discussion
>>>>>>>>>>> thread.
>>>>>>>>>>>
>>>>>>>>>>> So, just to start, a few words about me. I'm a CNRS
>>>>>>>>>>> research engineer, I work at the I3S laboratory in
>>>>>>>>>>> France, in particular with the Wimmics research team led
>>>>>>>>>>> by Fabien Gandon. I'm currently involved in some
>>>>>>>>>>> activities related to the publication of taxonomic
>>>>>>>>>>> information as Linked Data [1]. In this context, I've
>>>>>>>>>>> met the Biodiversity Information Standards community
>>>>>>>>>>> (TDWG) that is increasingly considering SW standards, LD
>>>>>>>>>>> publication and web pages markup. This is a domain
>>>>>>>>>>> where, I think, it would be relevant for
>>>>>>>>>>> Bioschemas.org to get involved.
>>>>>>>>>>>
>>>>>>>>>>> There exist lots of web portals reporting observations,
>>>>>>>>>>> traits and other data about all sorts of living
>>>>>>>>>>> organisms. Encyclopedia of Life <http://eol.org/> (EoL)
>>>>>>>>>>> and the Global Biodiversity Information Facility
>>>>>>>>>>> <https://www.gbif.org/> (GBIF) are some of the most well
>>>>>>>>>>> known.
>>>>>>>>>>> Markup questions are actively considered in this
>>>>>>>>>>> field, for instance EoL web pages embed
>>>>>>>>>>> schemas.org-based JSON-LD descriptions that Google
>>>>>>>>>>> leverages to enrich their snippets: e.g. if you google
>>>>>>>>>>> beluga
>>>>>>>>>>> <https://www.google.fr/search?dcr=0&ei=ml74WajPMMzWUabjqvAF&q=beluga&oq=beluga&gs_l=psy-ab.3...19519.20929.0.20945.6.3.0.0.0.0.93.93.1.1.0....0...1.1.64.psy-ab..5.1.92...0j0i131k1.0.AGNziTItYzc>
>>>>>>>>>>> you will see 'Encyclopedia of Life' mentions in the
>>>>>>>>>>> snippet providing average weight and size data. For now,
>>>>>>>>>>> this seems to be an "individual" initiative between EoL
>>>>>>>>>>> and Google/schemas.org <http://schemas.org>, but it
>>>>>>>>>>> would make sense if this was part of a broader
>>>>>>>>>>> reflection led by Bioschemas.org.
>>>>>>>>>>>
>>>>>>>>>>> My opinion is that fostering the use of common markup by
>>>>>>>>>>> these portals could be very effective in helping the
>>>>>>>>>>> biodiversity community to discover information and
>>>>>>>>>>> figure out new data integration scenarios. Within
>>>>>>>>>>> Bioschemas.org, we could define profiles to account for
>>>>>>>>>>> biodiversity-related information. Taxonomic registers are
>>>>>>>>>>> used as the backbone of many web portals, apps and
>>>>>>>>>>> databases related to biodiversity, agronomy and
>>>>>>>>>>> agriculture. For instance, EoL and GBIF both rely on the
>>>>>>>>>>> Catalogue of Life <http://www.catalogueoflife.org/>
>>>>>>>>>>> taxonomy. Therefore, we could start with the definition
>>>>>>>>>>> of a profile to describe a taxon and the related
>>>>>>>>>>> scientific and vernacular names thereof. Then, this
>>>>>>>>>>> could be extended with the representation of traits
>>>>>>>>>>> (characteristics of biological organisms), observations,
>>>>>>>>>>> occurrence data, conservation status (e.g. endangered)
>>>>>>>>>>> etc.
>>>>>>>>>>> There already exist vocabularies for such data, such
>>>>>>>>>>> as the well-adopted Darwin Core terms.
>>>>>>>>>>>
>>>>>>>>>>> As a quick example, consider the web page describing the
>>>>>>>>>>> common dolphin on the web site of the French Museum of
>>>>>>>>>>> Natural History:
>>>>>>>>>>> https://inpn.mnhn.fr/espece/cd_nom/60878?lg=en. This
>>>>>>>>>>> page could come with a JSON-LD description looking like
>>>>>>>>>>> this:
>>>>>>>>>>> https://github.com/frmichel/taxref-ld/blob/master/bioschemas-org-example.json
>>>>>>>>>>> This example is naive and very succinct, and there are
>>>>>>>>>>> lots of things to discuss and decide. Besides, I've just
>>>>>>>>>>> registered on the mailing list yesterday, so it may not fit
>>>>>>>>>>> with good practices that you guys have already agreed
>>>>>>>>>>> upon. Sorry if this is the case. Nevertheless, my point
>>>>>>>>>>> is basically to bootstrap the discussion and see if the
>>>>>>>>>>> community is willing to endorse this initiative. If this
>>>>>>>>>>> is the case, we should probably involve people from the
>>>>>>>>>>> biodiversity community: Darwin Core experts, EoL/GBIF
>>>>>>>>>>> representatives etc. But that will come in time.
>>>>>>>>>>>
>>>>>>>>>>> I look forward to further discussions.
>>>>>>>>>>> Regards,
>>>>>>>>>>> Franck.
>>>>>>>>>>>
>>>>>>>>>>> [1] Michel F., Gargominy O., Tercerie S. & Faron-Zucker
>>>>>>>>>>> C. (2017). A Model to Represent Nomenclatural and
>>>>>>>>>>> Taxonomic Information as Linked Data. Application to the
>>>>>>>>>>> French Taxonomic Register, TAXREF. In Proceedings of the
>>>>>>>>>>> 2nd International Workshop on Semantics for Biodiversity
>>>>>>>>>>> (S4BioDiv) co-located with ISWC 2017 vol. 1933. Vienna,
>>>>>>>>>>> Austria. CEUR.
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>>
>>>>>>>>>>> Franck MICHEL
>>>>>>>>>>> CNRS research engineer
>>>>>>>>>>> +33 (0)492 96 5004
>>>>>>>>>>> franck.michel@cnrs.fr <mailto:franck.michel@cnrs.fr>
>>>>>>>>>>>
>>>>>>>>>>> Université Côte d’Azur, CNRS, *Inria* - I3S - UMR 7271
>>>>>>>>>>> 930 route des Colles - Bât. Les Templiers
>>>>>>>>>>> BP 145 - 06903 Sophia Antipolis CEDEX - France
>>>>>>>>>>> Tel. +33 (0)4 9294 2680
>>>>>>>>>>> <tel:+33%204%2092%2094%2026%2080>, Fax : +33 (0)4 9294 2898
>>>>>>>>>>>
>>
>> --
>>
>> Franck MICHEL
>> CNRS research engineer
>> +33 (0)4 8915 4277
>> franck.michel@cnrs.fr <mailto:franck.michel@cnrs.fr>
>>
>> Université Côte d’Azur, CNRS- I3S - UMR 7271
>> 930 route des Colles - Bât. Les Templiers
>> BP 145 - 06903 Sophia Antipolis CEDEX - France
>> Tel. +33 (0)4 9294 2680 <tel:+33%204%2092%2094%2026%2080>
>>
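
The namespace remark made in the thread (a bare scientificName or units key is not a schema.org term, whereas dwc:scientificName and dwc:measurementUnit are Darwin Core terms) can be illustrated with a small Python sketch. Everything below is an assumption made for illustration: the two records, the tiny SCHEMA_ORG_TERMS subset and the helpers unqualified_keys and mm_to_m are hypothetical, and they reproduce neither EoL's actual markup nor any agreed Bioschemas profile. Only the Beluga figures (4249.83 mm, displayed as 4.2 m) and the unit URI come from the messages above.

```python
# Hypothetical subset of schema.org property names, just enough for this sketch.
SCHEMA_ORG_TERMS = {"name", "url", "identifier", "additionalProperty"}

# Assumed JSON-LD context: schema.org as default vocabulary, Darwin Core as "dwc".
CONTEXT = {
    "@vocab": "http://schema.org/",
    "dwc": "http://rs.tdwg.org/dwc/terms/",
}


def unqualified_keys(doc):
    """Return keys that are neither JSON-LD keywords, nor prefixed with a
    prefix declared in the context, nor in the schema.org subset above."""
    prefixes = {k for k in doc.get("@context", {}) if not k.startswith("@")}
    suspicious = []
    for key in doc:
        if key.startswith("@"):
            continue  # JSON-LD keyword such as @context or @id
        prefix, sep, _ = key.partition(":")
        if sep and prefix in prefixes:
            continue  # compact IRI such as dwc:scientificName
        if key in SCHEMA_ORG_TERMS:
            continue  # resolves against the default schema.org vocabulary
        suspicious.append(key)
    return sorted(suspicious)


def mm_to_m(value_mm):
    """Convert millimetres to metres, rounded to one decimal for display."""
    return round(value_mm / 1000, 1)


# Record using bare keys, in the style criticised in the thread.
bare = {
    "@context": CONTEXT,
    "name": "Beluga",
    "scientificName": "Delphinapterus leucas",
    "units": "mm",
}

# Same information with Darwin Core terms spelled as compact IRIs.
prefixed = {
    "@context": CONTEXT,
    "name": "Beluga",
    "dwc:scientificName": "Delphinapterus leucas",
    "dwc:measurementUnit": "http://purl.obolibrary.org/obo/UO_0000016",
}

print(unqualified_keys(bare))      # ['scientificName', 'units']
print(unqualified_keys(prefixed))  # []
print(mm_to_m(4249.83))            # 4.2
```

A real check would expand the document with a JSON-LD processor and compare the resulting IRIs against the published vocabularies; the key-level test here only makes the prefix issue visible.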
Received on Tuesday, 12 June 2018 09:02:35 UTC