Re: Bioschemas.org to define biodiversity-related markup

From: Ricardo Arcila <ricartomojo@gmail.com>
Date: Mon, 11 Jun 2018 14:39:53 +0100
Message-ID: <CACp2UZjGKzc5jkFWdQ87UdG1XTKewnvMk3wa1NY3QPKhWW6_rA@mail.gmail.com>
To: Franck Michel <franck.michel@cnrs.fr>
Cc: Leyla Garcia <ljgarcia@ebi.ac.uk>, public-bioschemas@w3.org, "Gray, Alasdair J G" <A.J.G.Gray@hw.ac.uk>, "Rafael C. Jimenez" <rafael.jimenez@elixir-europe.org>, "Carole Goble (carole.goble@manchester.ac.uk)" <carole.goble@manchester.ac.uk>
Hello Franck,

It is a good idea to start by creating the group. You can do so by opening
a pull request on the Bioschemas groups repository
<https://github.com/BioSchemas/bioschemas.github.io/tree/master/_groups>.
Then you can add yourself to the people repository
<https://github.com/BioSchemas/bioschemas.github.io/tree/master/_people>. I
will be happy to help you with this process and, if you'd like, I could be
part of the group as well.

To start a draft specification for Taxon, you should create a
folder with the profile name in the specifications Drive folder
<https://drive.google.com/drive/folders/0Bw_p-HKWUjHoNThZOWNKbGhOODg?usp=sharing>.
This process is detailed on the Bioschemas GitHub wiki
<https://github.com/BioSchemas/specifications/wiki/Bioschemas-Specification-Process>.

Please let me know if you have any questions or doubts about the process; I
will be most happy to help.


Best regards,
Ricardo Arcila


On Thu, Jun 7, 2018 at 9:54 AM Franck Michel <franck.michel@cnrs.fr> wrote:

> Hi all,
>
> I'm catching up with the discussions on the list, and I'm happy to see
> that things are moving on with the submission of new types to schema.org.
>
> At the same time, I realize that we have not really moved forward on the
> biodiversity topic. As I will present a poster about Bioschemas.org at the
> Biodiversity Information Standards (TDWG) conference in August, it would be
> good to initiate this work by then. How do we proceed? I
> suggested the creation of a Taxon profile, but we may have to start with
> the creation of a group?
> Could you please guide me/us in this process?
>
> Thx,
>     Franck.
>
> On 23/01/2018 at 11:09, Leyla Garcia wrote:
>
> Hello Bioschemas governance team,
>
> What do you think about going ahead with the Biodiversity schemas? Do we
> have a heads up?
>
> @Franck, I am not really aware of those organizations but I am happy to
> guide you through the work we have done for Bioschemas so far. I worked a
> bit on a biodiversity project but that was some years ago. Still, I like
> the subject!
>
> Let's wait to see what Carole, Rafael and Alasdair suggest.
>
> Regards,
>
> On 23/01/2018 08:47, Franck Michel wrote:
>
> Dear Leyla and all,
>
> I understand that your response stands for a GO. Right?
>
> I've not been involved yet in the specification of the Bioschemas.org
> profiles. So indeed, I shall need help and guidance as to how things are
> going on, the tools, the process, the expected outcomes, etc.
>
> As I proposed, we could start by contacting people who would
> potentially be interested in taking part in this. I'm thinking about
> Encyclopedia of Life, Catalogue of Life, GBIF. If you already know contacts
> in these organizations, that would certainly be helpful.
>
> Franck.
>
> On 22/01/2018 at 11:37, Leyla Garcia wrote:
>
> Hi Franck,
>
> Great news!
>
> Do you need any help/guides for the start-up?
>
> Cheers,
>
>
> On 17/01/2018 15:24, Franck Michel wrote:
>
> Dear all,
>
> I'm following up on this suggestion about creating a biodiversity-related
> group in Bioschemas.org.
>
> The proposal received four +1's. I'm not sure if there is a "minimum
> score" to attest to sufficient consensus.
>
> As we discussed, if we go for the creation of this group, it would be
> beneficial to involve at least EoL folks, and possibly other people from the
> biodiversity community. I can try to initiate this, but first I would like
> to have an official GO from our community.
>
> Let me know how this usually works, and what you think about this.
>
> Regards,
>     Franck.
>
> On 17/11/2017 at 16:40, Franck Michel wrote:
>
> Hi Mélanie, hi all,
>
> To go a bit further, I've tried to extend the example I
> initiated. Here it is:
> https://github.com/frmichel/taxref-ld/tree/master/bioschemas-org
> The README gives details as to how the example file is organized, and more
> importantly it lists some of the issues and questions that we shall have to
> tackle if we officially start the group.
>
> @Alasdair, Carole, Rafael: as discussed in the thread, at some point it
> will be beneficial to invite people from EoL and TDWG. Is there some
> sort of "official" channel for the community to do that?
>
> Have a nice week-end,
>     Franck.
>
> On 17/11/2017 at 10:19, Melanie Courtot wrote:
>
> Hi Franck, all,
>
> On 16/11/2017 09:37, Franck Michel wrote:
>
> Hi Mélanie, hi all,
>
> EoL provides an API that returns species descriptions as JSON-LD based on
> schema.org. Beluga example: http://eol.org/api/traits/328541
> It is unclear who consumes this data, but at least, as you already saw,
> they embed it at the end of their own web pages such as
> http://eol.org/pages/328541/data.
>
> BioSamples does the same: we have an API to retrieve JSON, and we embed it
> in our webpages for crawlers as well.
>
>
> As you also noticed, the JSON-LD they provide is not valid. I didn't know
> about that EOL Github issue, but I recently discussed it with Rod Page from
> the Biodiversity Information Standards (aka TDWG), who replied on the
> Github issue. The Google structured data testing tool gives more details on
> that: https://frama.link/xJm0AAto
> Besides, other errors are not reported (well, I think these are errors):
> the property scientificName without any namespace is invalid; it should be
> dwc:scientificName, since this property does not exist in schema.org. The
> same issue applies to vernacularName, traits, units...
>
> In any case, this JSON-LD has lots of issues, but it's a start.
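
The namespace problem described above can be checked mechanically. What follows is only a minimal sketch, not a JSON-LD validator: it assumes that bare keys are meant to be schema.org terms, the helper name unresolved_keys and the tiny SCHEMA_ORG_TERMS subset are hypothetical, and the sample record is made up around the property names mentioned in this thread.

```python
# Tiny illustrative subset of schema.org property names (assumption, not the full vocabulary).
SCHEMA_ORG_TERMS = {"name", "url", "description"}


def unresolved_keys(doc, known_terms):
    """Return keys that are not JSON-LD keywords, not known terms,
    not defined in @context, and not prefixed with a declared prefix."""
    context = doc.get("@context", {})
    declared = set(context) if isinstance(context, dict) else set()
    bad = []
    for key in doc:
        if key.startswith("@") or key in known_terms or key in declared:
            continue
        prefix, sep, _ = key.partition(":")
        if sep and prefix in declared:
            continue  # compact IRI such as dwc:scientificName
        bad.append(key)
    return sorted(bad)


# Hypothetical record; values are for illustration only.
record = {
    "@context": {"dwc": "http://rs.tdwg.org/dwc/terms/"},
    "@type": "Thing",
    "name": "Delphinapterus leucas",
    "scientificName": "Delphinapterus leucas",
    "vernacularName": "beluga",
    "dwc:measurementUnit": "http://purl.obolibrary.org/obo/UO_0000016",
}

print(unresolved_keys(record, SCHEMA_ORG_TERMS))
# → ['scientificName', 'vernacularName']
```

Renaming the two flagged keys to dwc:scientificName and dwc:vernacularName makes the list empty under these assumptions.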
>
>
> Yes. Only mentioned the tweaks in case someone wanted to give it a try as
> well.
>
> The assumption is that there is some sort of specific (one-to-one)
> agreement between EoL and Google, and that Google harvests this data
> despite the invalid JSON-LD. But I have no confirmation of that.
>
>
> It'd be interesting to clarify this. It seems a little counterintuitive
> that EoL would mark their pages up with JSON for Google to read
> but then Google couldn't do so without a special adapter. We're probably
> missing a piece of the story.
>
>
> > - the measurement type points to http://purl.obolibrary.org/obo/VT_0001256,
> > which is body length. The schema.org/predicate value is also "body length
> > (VT)". How is this understood and displayed as Length on the Google result?
> > - Similar question for the actual value and units, which are "4249.83" and
> > "mm" respectively. Is Google doing some sort of unit conversion/roundup for
> > display?
>
> Good question. Take the unit "mm", for instance:
> - "units": "mm" => there is no such thing as http://schema.org/units
> - "dwc:measurementUnit": "http://purl.obolibrary.org/obo/UO_0000016" =>
> this seems to be the only reliable property, but using it would require
> Google to know the Darwin Core vocabulary and interpret it.
> My assumption is that Google performs some processing on the values.
> Possibly, they developed a specific connector to cope with EoL JSON-LD and
> translate this body size to "4.2 m".
> Besides, the snippet mentions "4.2 m (Adult)", so they also presumably
> consider this property:
>     "eol:traitUri": "http://eol.org/resources/704/measurements/adultheadbodylen27"
> to know that this is the size of an adult.
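
For what it's worth, the conversion from "4249.83" and "mm" to "4.2 m" is easy to reproduce. The sketch below is only a guess at the kind of processing a consumer could apply, not a description of what Google actually does; the function name display_length and the one-entry unit table are hypothetical, and the only unit IRI included is the one quoted in this thread.

```python
# Conversion factors to metres, keyed by unit IRI.
# Only the millimetre IRI quoted in the thread is listed (hand-written table, not an official mapping).
UNIT_TO_METRES = {
    "http://purl.obolibrary.org/obo/UO_0000016": 0.001,  # millimetre
}


def display_length(value, unit_iri):
    """Convert a measurement to metres and format it with one decimal place."""
    factor = UNIT_TO_METRES[unit_iri]  # raises KeyError for units not in the table
    metres = float(value) * factor
    return f"{metres:.1f} m"


print(display_length("4249.83", "http://purl.obolibrary.org/obo/UO_0000016"))
# → 4.2 m
```

Note that the lookup relies on the dwc:measurementUnit IRI rather than on the free-text "units" value, which is the more reliable of the two properties discussed above.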
>
> With proper Bioschemas.org profiles, I think we could annotate pages from
> many other institutions, such as the Beluga page
> <https://inpn.mnhn.fr/espece/cd_nom/60932?lg%3Den> of the French National
> Museum of Natural History, and in turn, enable search engines to harvest
> data from complementary pages and produce mashups of related pages, etc.
>
> That sounds like a great idea and entirely within the scope of Bioschemas.
>
>
> At this point, I think we should involve people from EoL, and from the
> TDWG community (Rod Page would certainly be of great added value in this
> respect). What do you think? Is there a procedure for inviting people
> "officially"?
>
> I think we could benefit from their experience indeed; it seems they were
> able to deploy markup, add additional properties and then get this
> interpreted by Google, which seems to match our use case pretty well!
> I +1'd the issue at
> https://github.com/BioSchemas/specifications/issues/115
>
> Cheers,
> Melanie
>
>
>
>
>
> Franck.
>
>
> On 15/11/2017 at 17:57, Melanie Courtot wrote:
>
> Hi Franck,
>
> This looks really interesting, thanks for bringing it up. I was trying to
> find out how the interaction between EoL and schema.org was working and
> am wondering if you (or someone else!) could shed some light on this?
>
> As you suggested below, I checked the google beluga
> <https://www.google.fr/search?dcr=0&ei=ml74WajPMMzWUabjqvAF&q=beluga&oq=beluga&gs_l=psy-ab.3...19519.20929.0.20945.6.3.0.0.0.0.93.93.1.1.0....0...1.1.64.psy-ab..5.1.92...0j0i131k1.0.AGNziTItYzc>
> search result and do see the line "Length: 4.2 m (Adult) Encyclopedia of
> Life"
>
> If I try to find where that info comes from, and head to EoL, I can reach
> the page http://eol.org/pages/328541/overview, and follow the "see all
> traits" link to http://eol.org/pages/328541/data which contains the
> JSON-LD.
>
> I trimmed it down to extract the relevant bit, updated the id to be a
> string as per https://github.com/EOL/tramea/issues/352, and pasted it in
> the JSON playground mostly to make sure it was working as expected:
> http://tinyurl.com/yadam6nj
>
> I don't see how the following happens:
> - the measurement type points to http://purl.obolibrary.org/obo/VT_0001256,
> which is body length. The schema.org/predicate value is also "body length
> (VT)". How is this understood and displayed as Length on the Google result?
> - Similar question for the actual value and units, which are "4249.83" and
> "mm" respectively. Is Google doing some sort of unit conversion/roundup for
> display?
> - Trophic level on EoL is "carnivore", but Google displays "Carnivorous"
> etc
>
> Or am I looking at the wrong source for the markup?
>
> Cheers,
> Melanie
>
>
>
>
>
>
> On 10/11/2017 15:17, Franck Michel wrote:
>
> Dear all,
>
> I've just joined the Bioschemas.org community following some discussions I
> had with Alasdair Gray whom I met at ISWC in Vienna, and I'd like to
> start a new discussion thread.
>
> So, just to start, a few words about me. I'm a CNRS research engineer, I
> work at the I3S laboratory in France, in particular with the Wimmics
> research team led by Fabien Gandon. I'm currently involved in some
> activities related to the publication of taxonomic information as Linked
> Data [1]. In this context, I've met the Biodiversity Information Standards
> community (TDWG), which is increasingly considering SW standards, LD
> publication and web page markup. This is a domain where, I think, it would
> be relevant for Bioschemas.org to get involved.
>
> There are lots of web portals reporting observations, traits and other
> data about all sorts of living organisms. Encyclopedia of Life
> <http://eol.org/> (EoL) and the Global Biodiversity Information Facility
> <https://www.gbif.org/> (GBIF) are some of the best known. Markup
> questions are actively considered in this field. For instance, EoL web pages
> embed schema.org-based JSON-LD descriptions that Google leverages to
> enrich their snippets: e.g. if you google beluga
> <https://www.google.fr/search?dcr=0&ei=ml74WajPMMzWUabjqvAF&q=beluga&oq=beluga&gs_l=psy-ab.3...19519.20929.0.20945.6.3.0.0.0.0.93.93.1.1.0....0...1.1.64.psy-ab..5.1.92...0j0i131k1.0.AGNziTItYzc>
> you will see 'Encyclopedia of Life' mentions in the snippet providing
> average weight and size data. For now, this seems to be an "individual"
> initiative between EoL and Google/schema.org, but it would make sense if
> this was part of a broader reflection led by Bioschemas.org.
>
> My opinion is that fostering the use of common markup by these portals could
> be very effective in helping the biodiversity community to discover
> information and figure out new data integration scenarios. Within Bioschemas.org,
> we could define profiles to account for biodiversity-related information.
> Taxonomic registers are used as the backbone of many web portals, apps and
> databases related to biodiversity, agronomy and agriculture. For
> instance, EoL and GBIF both rely on the Catalogue of Life
> <http://www.catalogueoflife.org/> taxonomy. Therefore, we could start
> with the definition of a profile to describe a taxon and its
> scientific and vernacular names. Then, this could be extended with
> the representation of traits (characteristics of biological organisms),
> observations, occurrence data, conservation status (e.g. endangered) etc.
> Vocabularies for such data already exist, such as the widely adopted
> Darwin Core terms.
>
> As a quick example, consider the web page describing the common dolphin
> on the web site of the French Museum of Natural History:
> https://inpn.mnhn.fr/espece/cd_nom/60878?lg=en. This page could come with
> a JSON-LD description looking like this:
> https://github.com/frmichel/taxref-ld/blob/master/bioschemas-org-example.json
> This example is naive and very succinct, and there are lots of things to
> discuss and decide. Besides, I only registered on the mailing list
> yesterday, so it may not fit with good practices that you guys have
> already agreed upon. Sorry if this is the case. Nevertheless, my point is
> basically to bootstrap the discussion and see if the community is willing
> to endorse this initiative. If this is the case, we should probably involve
> people from the biodiversity community: Darwin Core experts, EoL/GBIF
> representatives etc. But that will come in time.
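
To make the idea of a taxon description concrete, here is a rough sketch of how such markup could be generated. It does not reproduce the linked example file. The function name taxon_markup, the placeholder "@type" value (no Taxon profile exists yet at this point in the discussion) and the choice of properties are all assumptions; only the page URL, the Darwin Core namespace and the dwc: property names come from the discussion or from Darwin Core itself.

```python
import json


def taxon_markup(page_url, scientific_name, vernacular_names):
    """Build a JSON-LD description of a taxon page as a plain dict."""
    return {
        "@context": {
            "@vocab": "http://schema.org/",
            "dwc": "http://rs.tdwg.org/dwc/terms/",
        },
        "@type": "Thing",  # placeholder until a Taxon profile is defined
        "@id": page_url,
        "url": page_url,
        "name": scientific_name,
        "dwc:scientificName": scientific_name,
        "dwc:vernacularName": vernacular_names,
    }


doc = taxon_markup(
    "https://inpn.mnhn.fr/espece/cd_nom/60878?lg=en",
    "Delphinus delphis",
    ["common dolphin"],
)

# The serialized string is what a page would embed in a
# <script type="application/ld+json"> element.
print(json.dumps(doc, indent=2))
```

Prefixing the Darwin Core properties keeps them distinct from schema.org terms, which avoids the un-namespaced property problem raised elsewhere in this thread.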
>
> I look forward to further discussions.
> Regards,
>    Franck.
>
> [1] Michel F., Gargominy O., Tercerie S. & Faron-Zucker C. (2017). A Model
> to Represent Nomenclatural and Taxonomic Information as Linked Data.
> Application to the French Taxonomic Register, TAXREF. In Proceedings of the
> 2nd International Workshop on Semantics for Biodiversity (S4BioDiv),
> co-located with ISWC 2017, Vienna, Austria. CEUR, vol. 1933.
>
> --
>
> Franck MICHEL
> CNRS research engineer
> +33 (0)492 96 5004
> franck.michel@cnrs.fr
>
> Université Côte d’Azur, CNRS, Inria - I3S - UMR 7271
> 930 route des Colles - Bât. Les Templiers
> BP 145 - 06903 Sophia Antipolis CEDEX - France
> Tel. +33 (0)4 9294 2680, Fax: +33 (0)4 9294 2898
>
> --
>
> Franck MICHEL
> CNRS research engineer
> +33 (0)4 8915 4277
> franck.michel@cnrs.fr
>
> Université Côte d’Azur, CNRS - I3S - UMR 7271
> 930 route des Colles - Bât. Les Templiers
> BP 145 - 06903 Sophia Antipolis CEDEX - France
> Tel. +33 (0)4 9294 2680
>
Received on Monday, 11 June 2018 13:40:34 UTC