- From: Franck Michel <franck.michel@cnrs.fr>
- Date: Thu, 7 Jun 2018 10:52:43 +0200
- To: Leyla Garcia <ljgarcia@ebi.ac.uk>, public-bioschemas@w3.org, "Gray, Alasdair J G" <A.J.G.Gray@hw.ac.uk>, "Rafael C. Jimenez" <rafael.jimenez@elixir-europe.org>, "Carole Goble (carole.goble@manchester.ac.uk)" <carole.goble@manchester.ac.uk>
- Message-ID: <1e94f033-bac1-3052-56e2-8803174558fa@cnrs.fr>
Hi all,

I'm catching up with the discussions on the list, and I'm happy to see that things are moving forward with the submission of new types to schema.org.

At the same time, I realize that we have not really made progress on the biodiversity topic. As I will present a poster about Bioschemas.org at the Biodiversity Information Standards (TDWG) conference in August, it would be good to start the work on this by then.

How do we proceed? I suggested the creation of a Taxon profile, but perhaps we have to start with the creation of a group? Could you please guide me/us through this process?

Thanks,
Franck.

On 23/01/2018 11:09, Leyla Garcia wrote:
> Hello Bioschemas governance team,
>
> What do you think about going ahead with the Biodiversity schemas? Do we have a go-ahead?
>
> @Franck, I am not really familiar with those organizations but I am happy to guide you through the work we have done for Bioschemas so far. I worked a bit on a biodiversity project but that was some years ago. Still, I like the subject!
>
> Let's wait to see what Carole, Rafael and Alasdair suggest.
>
> Regards,
>
> On 23/01/2018 08:47, Franck Michel wrote:
>> Dear Leyla and all,
>>
>> I understand that your response stands for a GO. Right?
>>
>> I have not yet been involved in the specification of the Bioschemas.org profiles. So indeed, I shall need help and guidance as to how things work: the tools, the process, the expected outcomes, etc.
>>
>> As I proposed, we could start by contacting people who would potentially be interested in taking part in this. I'm thinking about Encyclopedia of Life, Catalogue of Life, GBIF. If you already know contacts in these organizations, that would certainly be helpful.
>>
>> Franck.
>>
>> On 22/01/2018 11:37, Leyla Garcia wrote:
>>> Hi Franck,
>>>
>>> Great news!
>>>
>>> Do you need any help/guides for the start-up?
>>>
>>> Cheers,
>>>
>>> On 17/01/2018 15:24, Franck Michel wrote:
>>>> Dear all,
>>>>
>>>> I'm following up on this suggestion about creating a biodiversity-related group in Bioschemas.org.
>>>>
>>>> The proposal received four +1's. I'm not sure if there is a "minimum score" to attest to sufficient consensus.
>>>>
>>>> As we discussed, if we go for the creation of this group, it would be beneficial to involve at least EoL folks, and possibly other people from the biodiversity community. I can try to initiate this, but first I would like to have an official GO from our community.
>>>>
>>>> Let me know how this usually works, and what you think about this.
>>>>
>>>> Regards,
>>>> Franck.
>>>>
>>>> On 17/11/2017 16:40, Franck Michel wrote:
>>>>> Hi Mélanie, hi all,
>>>>>
>>>>> To go a bit further, I've tried to extend the example I started. Here it is:
>>>>> https://github.com/frmichel/taxref-ld/tree/master/bioschemas-org
>>>>> The README gives details as to how the example file is organized, and more importantly it lists some of the issues and questions that we shall have to tackle if we officially start the group.
>>>>>
>>>>> @Alasdair, Carole, Rafael: as discussed in the thread, at some point it would be beneficial to invite people from EoL and TDWG. Is there some sort of "official" channel for the community to do that?
>>>>>
>>>>> Have a nice weekend,
>>>>> Franck.
>>>>>
>>>>> On 17/11/2017 10:19, Melanie Courtot wrote:
>>>>>> Hi Franck, all,
>>>>>>
>>>>>> On 16/11/2017 09:37, Franck Michel wrote:
>>>>>>> Hi Melanie, hi all,
>>>>>>>
>>>>>>> EoL provides an API that returns species descriptions as JSON-LD based on schema.org. Beluga example:
>>>>>>> http://eol.org/api/traits/328541
>>>>>>> It is unclear who consumes this data, but at least, as you already saw, they embed it at the end of their own web pages such as http://eol.org/pages/328541/data.
>>>>>> BioSamples does the same: an API to retrieve JSON, and we embed it in our web pages for crawlers as well.
>>>>>>>
>>>>>>> As you also noticed, the JSON-LD they provide is not valid. I didn't know about that EOL GitHub issue, but I recently discussed it with Rod Page from the Biodiversity Information Standards (aka TDWG), who replied on the GitHub issue. The Google structured data testing tool gives more details on that:
>>>>>>> https://frama.link/xJm0AAto
>>>>>>> Besides, other errors are not reported (well, I think these are errors): the property scientificName without any namespace is invalid; it should be dwc:scientificName since it does not exist in schema.org. Same issue for vernacularName, traits, units...
>>>>>>>
>>>>>>> Anyway, this JSON-LD has lots of issues, but it's a start.
>>>>>>
>>>>>> Yes. I only mentioned the tweaks in case someone wanted to give it a try as well.
>>>>>>
>>>>>>> The assumption is that there is some sort of specific (one-to-one) agreement between EoL and Google, and that Google harvests this data despite the invalid JSON-LD. But I have no confirmation of that.
>>>>>>
>>>>>> It'd be interesting to clarify this. It seems a little counterintuitive that EoL would mark their pages up with JSON for Google to read it but then Google couldn't do so without a special adapter? We're probably missing a piece of the story.
>>>>>>>
>>>>>>> > - the measurement type points to http://purl.obolibrary.org/obo/VT_0001256, which is body length. The schema.org/predicate value is also "body length (VT)". How is this understood and displayed as Length on the Google result?
>>>>>>> > - Similar question for the actual value and units, which are "4249.83" and "mm" respectively. Is Google doing some sort of unit conversion/rounding for display?
>>>>>>>
>>>>>>> Good question.
>>>>>>> Take the unit "mm", for instance:
>>>>>>> - "units": "mm" => there is no such thing as http://schema.org/units
>>>>>>> - "dwc:measurementUnit": "http://purl.obolibrary.org/obo/UO_0000016" => this seems to be the only reliable property, but then Google would have to know the Darwin Core vocabulary and interpret it.
>>>>>>> My assumption is that Google performs some processing on the values. Possibly, they developed a specific connector to cope with the EoL JSON-LD and translate this body size to "4.2 m". Besides, the snippet mentions "4.2 m *(Adult)*", so they presumably also consider this property:
>>>>>>> eol:traitUri "http://eol.org/resources/704/measurements/adultheadbodylen27"
>>>>>>> to know that this is the size of an adult.
>>>>>>>
>>>>>>> With proper Bioschemas.org profiles, I think we could annotate pages from many other institutions, such as the Beluga page <https://inpn.mnhn.fr/espece/cd_nom/60932?lg%3Den> of the French National Museum of Natural History, and in turn enable search engines to harvest data from complementary pages and produce mashups of related pages, etc.
>>>>>> That sounds like a great idea and entirely within the scope of Bioschemas.
>>>>>>>
>>>>>>> At this point, I think we should involve people from EoL, and from the TDWG community (Rod Page would certainly add great value in this respect). What do you think? Is there a procedure for inviting people "officially"?
>>>>>> I think we could benefit from their experience indeed; it seems they were able to deploy markup, add additional properties and then get this to be interpreted by Google, which seems to match our use case pretty well!
>>>>>> I +1'd the issue at https://github.com/BioSchemas/specifications/issues/115
>>>>>>
>>>>>> Cheers,
>>>>>> Melanie
>>>>>>
>>>>>>>
>>>>>>> Franck.
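To make the namespace and unit points above concrete, here is a minimal Python sketch. It assumes a trimmed-down record carrying only the values quoted in this thread (4249.83, UO_0000016, VT_0001256), declares the Darwin Core prefix explicitly in the `@context`, and converts millimetres to metres for display. The function names and the conversion table are hypothetical, and nothing here is claimed to reflect how EoL builds its JSON-LD or how Google processes it.

```python
import json

# Darwin Core terms namespace; schema.org is kept as the default vocabulary.
DWC = "http://rs.tdwg.org/dwc/terms/"

# Hypothetical lookup table: unit IRI -> conversion factor to metres.
# UO_0000016 is the Units Ontology term for millimetre.
UNIT_TO_METRES = {"http://purl.obolibrary.org/obo/UO_0000016": 0.001}


def build_measurement(scientific_name, type_iri, value, unit_iri):
    """Return a JSON-LD dict in which every Darwin Core term is prefixed."""
    return {
        "@context": {"@vocab": "http://schema.org/", "dwc": DWC},
        "dwc:scientificName": scientific_name,
        "dwc:measurementType": {"@id": type_iri},
        "dwc:measurementValue": value,
        "dwc:measurementUnit": {"@id": unit_iri},
    }


def display_length(record):
    """Convert the measurement to metres, rounded to one decimal place."""
    factor = UNIT_TO_METRES[record["dwc:measurementUnit"]["@id"]]
    metres = float(record["dwc:measurementValue"]) * factor
    return f"{metres:.1f} m"


beluga = build_measurement(
    "Delphinapterus leucas",  # beluga; name added here, not quoted in the thread
    "http://purl.obolibrary.org/obo/VT_0001256",  # body length
    "4249.83",
    "http://purl.obolibrary.org/obo/UO_0000016",  # millimetre
)

print(json.dumps(beluga, indent=2))
print(display_length(beluga))
```

With the values from the thread, the conversion yields "4.2 m", which is consistent with the snippet text, although that alone does not confirm what Google actually does.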
>>>>>>>
>>>>>>> On 15/11/2017 17:57, Melanie Courtot wrote:
>>>>>>>> Hi Franck,
>>>>>>>>
>>>>>>>> This looks really interesting, thanks for bringing it up. I was trying to find out how the interaction between EoL and schema.org works, and I am wondering if you (or someone else!) could shed some light on this?
>>>>>>>>
>>>>>>>> As you suggested below, I checked the Google beluga <https://www.google.fr/search?dcr=0&ei=ml74WajPMMzWUabjqvAF&q=beluga&oq=beluga&gs_l=psy-ab.3...19519.20929.0.20945.6.3.0.0.0.0.93.93.1.1.0....0...1.1.64.psy-ab..5.1.92...0j0i131k1.0.AGNziTItYzc> search result and do see the line "Length: 4.2 m (Adult) Encyclopedia of Life".
>>>>>>>>
>>>>>>>> If I try to find where that info comes from and head to EoL, I can reach the page http://eol.org/pages/328541/overview, and follow the "see all traits" link to http://eol.org/pages/328541/data, which contains the JSON-LD.
>>>>>>>>
>>>>>>>> I trimmed it down to extract the relevant bit, updated the id to be a string as per https://github.com/EOL/tramea/issues/352, and pasted it in the JSON playground, mostly to make sure it was working as expected: http://tinyurl.com/yadam6nj
>>>>>>>>
>>>>>>>> I am missing the link explaining how the following happens:
>>>>>>>> - the measurement type points to http://purl.obolibrary.org/obo/VT_0001256, which is body length. The schema.org/predicate value is also "body length (VT)". How is this understood and displayed as Length on the Google result?
>>>>>>>> - Similar question for the actual value and units, which are "4249.83" and "mm" respectively. Is Google doing some sort of unit conversion/rounding for display?
>>>>>>>> - Trophic level on EoL is "carnivore", but Google displays "Carnivorous"
>>>>>>>> etc.
>>>>>>>>
>>>>>>>> Or am I looking at the wrong source for the markup?
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Melanie
>>>>>>>>
>>>>>>>> On 10/11/2017 15:17, Franck Michel wrote:
>>>>>>>>> Dear all,
>>>>>>>>>
>>>>>>>>> I've just joined the Bioschemas.org community following some discussions I had with Alasdair Gray, whom I met at ISWC in Vienna, and I'd like to start a new discussion thread.
>>>>>>>>>
>>>>>>>>> So, just to start, a few words about me. I'm a CNRS research engineer; I work at the I3S laboratory in France, in particular with the Wimmics research team led by Fabien Gandon. I'm currently involved in some activities related to the publication of taxonomic information as Linked Data [1]. In this context, I've met the Biodiversity Information Standards community (TDWG), which is increasingly considering Semantic Web standards, Linked Data publication and web page markup. This is a domain where, I think, it would be relevant for Bioschemas.org to get involved.
>>>>>>>>>
>>>>>>>>> There exist lots of web portals reporting observations, traits and other data about all sorts of living organisms. Encyclopedia of Life <http://eol.org/> (EoL) and the Global Biodiversity Information Facility <https://www.gbif.org/> (GBIF) are some of the best known. Markup questions are actively considered in this field. For instance, EoL web pages embed schema.org-based JSON-LD descriptions that Google leverages to enrich its snippets: e.g. if you google beluga <https://www.google.fr/search?dcr=0&ei=ml74WajPMMzWUabjqvAF&q=beluga&oq=beluga&gs_l=psy-ab.3...19519.20929.0.20945.6.3.0.0.0.0.93.93.1.1.0....0...1.1.64.psy-ab..5.1.92...0j0i131k1.0.AGNziTItYzc> you shall see 'Encyclopedia of Life' mentions in the snippet providing average weight and size data.
>>>>>>>>> For now, this seems to be an "individual" initiative between EoL and Google/schema.org, but it would make sense if this were part of a broader reflection led by Bioschemas.org.
>>>>>>>>>
>>>>>>>>> My opinion is that fostering the use of common markup by these portals could be very effective in helping the biodiversity community to discover information and figure out new data integration scenarios. Within Bioschemas.org, we could define profiles to account for biodiversity-related information. Taxonomic registers are used as the backbone of many web portals, apps and databases related to biodiversity, agronomy and agriculture. For instance, EoL and GBIF both rely on the Catalogue of Life <http://www.catalogueoflife.org/> taxonomy. Therefore, we could start with the definition of a profile to describe a taxon and its related scientific and vernacular names. Then, this could be extended with the representation of traits (characteristics of biological organisms), observations, occurrence data, conservation status (e.g. endangered), etc. Vocabularies already exist for such data, such as the widely adopted Darwin Core terms.
>>>>>>>>>
>>>>>>>>> As a quick example, consider the web page describing the common dolphin on the web site of the French Museum of Natural History: https://inpn.mnhn.fr/espece/cd_nom/60878?lg=en. This page could come with a JSON-LD description looking like this:
>>>>>>>>> https://github.com/frmichel/taxref-ld/blob/master/bioschemas-org-example.json
>>>>>>>>> This example is naive and very succinct, and there are lots of things to discuss and decide. Besides, I only registered on the mailing list yesterday, so it may not fit with good practices that you have already agreed upon.
>>>>>>>>> Sorry if this is the case. Nevertheless, my point is basically to bootstrap the discussion and see if the community is willing to endorse this initiative. If so, we should probably involve people from the biodiversity community: Darwin Core experts, EoL/GBIF representatives, etc. But that will come in time.
>>>>>>>>>
>>>>>>>>> I look forward to further discussions.
>>>>>>>>> Regards,
>>>>>>>>> Franck.
>>>>>>>>>
>>>>>>>>> [1] Michel F., Gargominy O., Tercerie S. & Faron-Zucker C. (2017). A Model to Represent Nomenclatural and Taxonomic Information as Linked Data. Application to the French Taxonomic Register, TAXREF. In Proceedings of the 2nd International Workshop on Semantics for Biodiversity (S4BioDiv) co-located with ISWC 2017, vol. 1933. Vienna, Austria. CEUR.
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Franck MICHEL
>>>>>>>>> CNRS research engineer
>>>>>>>>> +33 (0)492 96 5004
>>>>>>>>> franck.michel@cnrs.fr <mailto:franck.michel@cnrs.fr>
>>>>>>>>>
>>>>>>>>> Université Côte d’Azur, CNRS, *Inria* - I3S - UMR 7271
>>>>>>>>> 930 route des Colles - Bât. Les Templiers
>>>>>>>>> BP 145 - 06903 Sophia Antipolis CEDEX - France
>>>>>>>>> Tel. +33 (0)4 9294 2680, Fax : +33 (0)4 9294 2898

--
Franck MICHEL
CNRS research engineer
+33 (0)4 8915 4277
franck.michel@cnrs.fr <mailto:franck.michel@cnrs.fr>
Université Côte d’Azur, CNRS - I3S - UMR 7271
930 route des Colles - Bât. Les Templiers
BP 145 - 06903 Sophia Antipolis CEDEX - France
Tel. +33 (0)4 9294 2680
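
The taxon profile proposed in the original message of this thread could look something like the following Python sketch. It is purely illustrative: no Bioschemas profile is defined in this thread, the `taxon_markup` and `as_script_tag` helpers and the choice of `dwc:Taxon` as the type are assumptions, and the content of the example file linked in that message is not reproduced here. Only terms that exist in Darwin Core are used (`scientificName`, `vernacularName`, `taxonRank`).

```python
import json

# Darwin Core terms namespace; schema.org is kept as the default vocabulary.
DWC = "http://rs.tdwg.org/dwc/terms/"


def taxon_markup(page_url, scientific_name, vernacular_names, rank):
    """Build a hypothetical JSON-LD description of a taxon page."""
    return {
        "@context": {"@vocab": "http://schema.org/", "dwc": DWC},
        "@type": "dwc:Taxon",  # assumed type; a real profile may differ
        "url": page_url,
        "dwc:scientificName": scientific_name,
        "dwc:taxonRank": rank,
        "dwc:vernacularName": list(vernacular_names),
    }


def as_script_tag(markup):
    """Wrap the JSON-LD in the script element used to embed it in a page."""
    body = json.dumps(markup, indent=2, ensure_ascii=False)
    return f'<script type="application/ld+json">\n{body}\n</script>'


dolphin = taxon_markup(
    "https://inpn.mnhn.fr/espece/cd_nom/60878?lg=en",
    "Delphinus delphis",  # common dolphin; name added here, not quoted in the thread
    ["common dolphin"],
    "species",
)

print(as_script_tag(dolphin))
```

Traits, observations and conservation status, which the message lists as later extensions, are deliberately left out of this sketch.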
Received on Thursday, 7 June 2018 08:53:16 UTC