- From: Carl Boettiger <cboettig@gmail.com>
- Date: Wed, 20 Jun 2018 15:06:13 -0700
- To: Franck Michel <fmichel@i3s.unice.fr>
- Cc: LJ Garcia Castro <ljgarcia@ebi.ac.uk>, "Gray, Alasdair J G" <A.J.G.Gray@hw.ac.uk>, Ricardo Arcila <arcila@ebi.ac.uk>, "public-bioschemas@w3.org" <public-bioschemas@w3.org>
- Message-ID: <CAN_1p9xQv3E8OABckO+VJ=qx4tbZ_Rcz-peKJQjEe34BaM1xjA@mail.gmail.com>
Hi Franck,

I'm also very interested in support for biodiversity-related markup and happy to help out. The doc in the Google Drive looks like a nice start; I've added a few comments on the side.

If it's helpful, I'm happy to try to solicit input on this from others in the biodiversity / taxonomic informatics space. For example, I believe EOL is approaching a new release of their taxonomic data in JSON-LD markup, so it might be natural to check in with them as well.

Cheers,
Carl

On Wed, Jun 20, 2018 at 2:48 PM Franck Michel <franck.michel@cnrs.fr> wrote:

> Hi all,
>
> It seems like I've had email issues lately. I just discovered Ricardo's and Alasdair's answers in the thread below.
>
> Also, I thought I had submitted a pull request for the creation of a _groups/Biodiversity.md file that I had carefully written, but it never reached Ricardo (and I can't find any trace of it on GitHub ;)). Anyway, my idea was to create a Biodiversity group (instead of a Taxon group), whose first task would be to define the Taxon profile. There may be other profiles defined by this group later on. Are you OK with that?
>
> @Leyla: as a starting point, maybe we can interact through the discussion document I associated with the mapping (in the Taxon folder <https://drive.google.com/drive/u/0/folders/1Fp2AKbb07So7rVvUhnQIjpl8HLPSwpbP>)?
>
> Franck.
>
> On 20/06/2018 at 19:49, LJ Garcia Castro wrote:
>
> Hi Franck,
>
> We associate proteins to taxa so I am happy to help. Please add me to the loop and let us know what would be the best approach to contribute, i.e., email, comments via gdrive, issues via github, etc.
>
> Regards,
>
> On 15/06/2018 13:10, Gray, Alasdair J G wrote:
>
> Hi All
>
> I’m happy for the taxon group to be created with Franck as the initial group lead. Is there someone willing to support Franck in this role?
>
> Alasdair
>
> On 15 Jun 2018, at 12:58, Ricardo Arcila <arcila@ebi.ac.uk> wrote:
>
> Hello Franck,
>
> I have taken the liberty of creating a branch <https://github.com/BioSchemas/bioschemas.github.io/tree/ric/feat/taxons-group> with the draft of the group Taxons; please feel free to adjust it as you see fit.
>
> Kind regards,
> Ricardo
>
> On 12 Jun 2018, at 10:02, Franck Michel <franck.michel@cnrs.fr> wrote:
>
> Dear Ricardo and Leyla,
>
> I just made a pull request, and I created a Biodiversity specification folder on Google Drive. Let me know if anything is not right. I've set myself as the group leader, but I would feel more comfortable if someone from the community would join me in this role. And obviously, you are most welcome to join the group!
>
> > Will Taxon be a BioChemEntity? I am asking because in UniProt we have proteins linked to what is defined as an "unknown" taxon in NCBI taxonomy/UniProt taxonomy. I guess, even if we have this "unknown" case, we could still use BioChemEntity and suppose that any "unknown" will eventually be resolved to an actual entity. Happy to chat about it.
>
> I agree, the broad definition of BioChemEntity makes it appropriate as the root of Taxon. So far, I think of Taxon as a profile more than a type of its own. I'll read the wiki and start drafting something. I'll let you know if (most probably when) I have any questions. ;)
>
> Regards,
> Franck.
>
> On 11/06/2018 at 15:46, LJ Garcia Castro wrote:
>
> Hello Franck,
>
> The taxon profile has been mentioned before as one we need, but there was no group for it. Wonderful that you are starting one now! Please ask whenever you have a doubt about the process or the different approaches (third-party vocabs or additionalProperty) to deal with properties not covered by BioChemEntity.
>
> By the way, will Taxon be a BioChemEntity? I am asking because in UniProt we have proteins linked to what is defined as an "unknown" taxon in NCBI taxonomy/UniProt taxonomy.
> I guess, even if we have this "unknown" case, we could still use BioChemEntity and suppose that any "unknown" will eventually be resolved to an actual entity. Happy to chat about it.
>
> Regards,
>
> On 11/06/2018 14:39, Ricardo Arcila wrote:
>
> Hello Franck,
>
> It is a good idea to start by creating the group. You can do it by creating a pull request on the bioschemas groups repository <https://github.com/BioSchemas/bioschemas.github.io/tree/master/_groups>. Then you can add yourself on the people repository <https://github.com/BioSchemas/bioschemas.github.io/tree/master/_people>. I will be happy to help you in this process and if you'd like I could be part of the group as well.
>
> In order to start a draft specification for Taxon you should create a folder with the profile name on the specifications drive folder <https://drive.google.com/drive/folders/0Bw_p-HKWUjHoNThZOWNKbGhOODg?usp=sharing>. This process is detailed on the bioschemas github wiki <https://github.com/BioSchemas/specifications/wiki/Bioschemas-Specification-Process>.
>
> Please let me know if you have any questions or doubts about the process; I will be most happy to help.
>
> Best regards,
> Ricardo Arcila
>
> On Thu, Jun 7, 2018 at 9:54 AM Franck Michel <franck.michel@cnrs.fr> wrote:
>
>> Hi all,
>>
>> I'm catching up with the discussions on the list, and I'm happy to see that things are moving on with the submission of new types to schema.org.
>>
>> At the same time, I realize that we did not really move forward on the biodiversity topic. As I will present a poster about Bioschemas.org <http://bioschemas.org/> at the Biodiversity Information Standards conference in August, it would be good to start work on this by then. How do we proceed? I suggested the creation of a Taxon profile, but we may have to start with the creation of a group?
>> Could you please guide me/us in this process?
>>
>> Thx,
>> Franck.
>>
>> On 23/01/2018 at 11:09, Leyla Garcia wrote:
>>
>> Hello Bioschemas governance team,
>>
>> What do you think about going ahead with the Biodiversity schemas? Do we have a heads up?
>>
>> @Franck, I am not really aware of those organizations but I am happy to guide you through the work we have done for Bioschemas so far. I worked a bit on a biodiversity project but that was some years ago. Still, I like the subject!
>>
>> Let's wait to see what Carole, Rafael and Alasdair suggest.
>>
>> Regards,
>>
>> On 23/01/2018 08:47, Franck Michel wrote:
>>
>> Dear Leyla and all,
>>
>> I understand that your response stands for a GO. Right?
>>
>> I've not been involved yet in the specification of the Bioschemas.org <http://bioschemas.org/> profiles. So indeed, I shall need help and guidance as to how things are done: the tools, the process, the expected outcomes, etc.
>>
>> As I proposed, we could start by contacting people who would potentially be interested in taking part in this. I'm thinking about Encyclopedia of Life, Catalogue of Life, GBIF. If you already know contacts in these organizations, that would certainly be helpful.
>>
>> Franck.
>>
>> On 22/01/2018 at 11:37, Leyla Garcia wrote:
>>
>> Hi Franck,
>>
>> Great news!
>>
>> Do you need any help/guides for the start-up?
>>
>> Cheers,
>>
>> On 17/01/2018 15:24, Franck Michel wrote:
>>
>> Dear all,
>>
>> I'm following up on this suggestion about creating a biodiversity-related group in Bioschemas.org <http://bioschemas.org/>.
>>
>> The proposal received four +1's. I'm not sure if there is a "minimum score" to attest to sufficient consensus.
>>
>> As we discussed, if we go for the creation of this group, it would be beneficial to involve at least EoL folks, and possibly other people from the biodiversity community. I can try to initiate this, but first I would like to have an official GO from our community.
>>
>> Let me know how this usually works, and what you think about this.
>>
>> Regards,
>> Franck.
>>
>> On 17/11/2017 at 16:40, Franck Michel wrote:
>>
>> Hi Mélanie, hi all,
>>
>> To go a bit further I've tried to extend the example I started. Here it is: https://github.com/frmichel/taxref-ld/tree/master/bioschemas-org
>> The README gives details as to how the example file is organized, and more importantly it lists some of the issues and questions that we shall have to tackle if we officially start the group.
>>
>> @Alasdair, Carole, Rafael: as discussed in the thread, at some point it will be beneficial to invite people from EoL and TDWG. Is there some sort of "official" channel for the community to do that?
>>
>> Have a nice week-end,
>> Franck.
>>
>> On 17/11/2017 at 10:19, Melanie Courtot wrote:
>>
>> Hi Franck, all,
>>
>> On 16/11/2017 09:37, Franck Michel wrote:
>>
>> Hi Melanie, hi all,
>>
>> EoL provides an API that returns species descriptions as JSON-LD based on schema.org. Beluga example: http://eol.org/api/traits/328541
>> It is unclear who consumes this data, but at least, as you already saw, they embed it at the end of their own web pages such as http://eol.org/pages/328541/data.
>>
>> BioSamples does the same - an API to retrieve JSON and we embed it in our webpages for crawlers as well.
>>
>> As you also noticed, the JSON-LD they provide is not valid. I didn't know about that EOL GitHub issue, but I recently discussed it with Rod Page from the Biodiversity Information Standards (aka TDWG), who replied on the GitHub issue. The Google structured data testing tool gives more details on that: https://frama.link/xJm0AAto
>> Besides, other errors are not reported (well, I think these are errors): the property scientificName without any namespace is invalid; it should be dwc:scientificName since this property does not exist in schema.org. Same issue for vernacularName, traits, units...
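
The namespace problem described above can be checked mechanically. The following is a minimal Python sketch, not EoL's or Google's actual validation. It walks a JSON-LD-like document and reports bare keys that are neither JSON-LD keywords, prefixed terms such as dwc:scientificName, nor members of a small hand-picked set of schema.org property names. The function name, the term set and the sample document are assumptions made for illustration; the sample is not EoL's real output.

```python
# Illustrative subset only; the real schema.org vocabulary is far larger.
KNOWN_SCHEMA_ORG_TERMS = {"name", "url", "identifier", "description", "additionalProperty"}


def find_suspect_keys(node, known_terms=KNOWN_SCHEMA_ORG_TERMS):
    """Return the sorted bare keys that would not resolve to a known term."""
    suspects = set()
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "@context":
                continue  # prefix definitions are not properties
            is_keyword = key.startswith("@")  # e.g. @id, @type
            is_prefixed = ":" in key          # e.g. dwc:scientificName or a full IRI
            if not is_keyword and not is_prefixed and key not in known_terms:
                suspects.add(key)
            suspects.update(find_suspect_keys(value, known_terms))
    elif isinstance(node, list):
        for item in node:
            suspects.update(find_suspect_keys(item, known_terms))
    return sorted(suspects)


# Hypothetical document shaped like the problems listed in the message above.
sample = {
    "@context": {"@vocab": "http://schema.org/", "dwc": "http://rs.tdwg.org/dwc/terms/"},
    "name": "Beluga",
    "scientificName": "Delphinapterus leucas",
    "dwc:vernacularName": "beluga",
    "traits": [
        {"units": "mm", "dwc:measurementUnit": "http://purl.obolibrary.org/obo/UO_0000016"}
    ],
}

print(find_suspect_keys(sample))
```

A real check would expand the document with a JSON-LD processor and compare the resulting IRIs against the published vocabularies; the key-name heuristic here only shows the idea.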
>>
>> Anyway, this JSON-LD has lots of issues, but it's a start.
>>
>> Yes. I only mentioned the tweaks in case someone wanted to give it a try as well.
>>
>> The assumption is that there is some sort of specific (one-to-one) agreement between EoL and Google, and that Google harvests this data despite the invalid JSON-LD. But I have no confirmation of that.
>>
>> It'd be interesting to clarify this. It seems a little bit counterintuitive that EoL would mark their pages up with JSON for Google to read it but then Google couldn't do so without a special adapter? We're probably missing a piece of the story.
>>
>> > - the measurement type points to http://purl.obolibrary.org/obo/VT_0001256, which is body length. The schema.org/predicate value is also "body length (VT)". How is this understood and displayed as Length on the Google result?
>> > - Similar question for the actual value and units, which are "4249.83" and "mm" respectively. Is Google doing some sort of unit conversion/roundup for display?
>>
>> Good question. Typically about the unit "mm":
>> - "units": "mm" => there is no such thing as http://schema.org/units
>> - "dwc:measurementUnit": "http://purl.obolibrary.org/obo/UO_0000016" => this seems to be the only reliable property, but that would require Google to know the Darwin Core vocabulary and interpret it.
>> My assumption is that Google performs some processing on the values. Possibly, they developed a specific connector to cope with EoL JSON-LD and translate this body size to "4.2 m".
>> Besides, the snippet mentions "4.2 m *(Adult)*", so they also presumably consider this property:
>> eol:traitUri "http://eol.org/resources/704/measurements/adultheadbodylen27"
>> to know that this is the size of an adult.
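
The arithmetic behind the suspected conversion is simple to reproduce. The sketch below only shows that "4249.83" in "mm" rounds to the "4.2 m (Adult)" seen in the snippet; whether Google actually works this way is not confirmed anywhere in the thread. The function name, the unit table and the `life_stage` parameter are assumptions for illustration.

```python
# Conversion factors to metres for a few unit labels (illustrative subset).
TO_METRES = {"mm": 0.001, "cm": 0.01, "m": 1.0}


def format_length(value, unit, life_stage=None):
    """Convert a length to metres, round to one decimal, and append an optional label."""
    metres = float(value) * TO_METRES[unit]
    text = f"{metres:.1f} m"
    if life_stage:
        text += f" ({life_stage})"
    return text


# Values quoted in the thread: "4249.83" with unit "mm", for an adult.
print(format_length("4249.83", "mm", "Adult"))  # 4.2 m (Adult)
```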
>>
>> With proper Bioschemas.org <http://bioschemas.org/> profiles, I think we could annotate pages from many other institutions, such as the Beluga page <https://inpn.mnhn.fr/espece/cd_nom/60932?lg%3Den> of the French National Museum of Natural History, and in turn, enable search engines to harvest data from complementary pages and produce mashups of related pages, etc.
>>
>> That sounds like a great idea and entirely within the scope of Bioschemas.
>>
>> At this point, I think we should involve people from EoL, and from the TDWG community (Rod Page would certainly be of great added value in this respect). What do you think? Is there a procedure for inviting people "officially"?
>>
>> I think we could benefit from their experience indeed; it seems they were able to deploy markup, add additional properties and then get this to be interpreted by Google, which seems to match our use case pretty well!
>> I +1'd the issue at https://github.com/BioSchemas/specifications/issues/115
>>
>> Cheers,
>> Melanie
>>
>> Franck.
>>
>> On 15/11/2017 at 17:57, Melanie Courtot wrote:
>>
>> Hi Franck,
>>
>> This looks really interesting, thanks for bringing it up. I was trying to find out how the interaction between EoL and schema.org was working and am wondering if you (or someone else!) could shed some light on this?
>>
>> As you suggested below, I checked the google beluga <https://www.google.fr/search?dcr=0&ei=ml74WajPMMzWUabjqvAF&q=beluga&oq=beluga&gs_l=psy-ab.3...19519.20929.0.20945.6.3.0.0.0.0.93.93.1.1.0....0...1.1.64.psy-ab..5.1.92...0j0i131k1.0.AGNziTItYzc> search result and do see the line "Length: 4.2 m (Adult) Encyclopedia of Life"
>>
>> If I try to find where that info comes from, and head to EoL, I can reach the page http://eol.org/pages/328541/overview, and follow the "see all traits" link to http://eol.org/pages/328541/data which contains the JSON-LD.
>>
>> I trimmed it down to extract the relevant bit, updated the id to be a string as per https://github.com/EOL/tramea/issues/352, and pasted it in the JSON playground mostly to make sure it was working as expected: http://tinyurl.com/yadam6nj
>>
>> I am missing the link of how the following happens:
>> - the measurement type points to http://purl.obolibrary.org/obo/VT_0001256, which is body length. The schema.org/predicate value is also "body length (VT)". How is this understood and displayed as Length on the Google result?
>> - Similar question for the actual value and units, which are "4249.83" and "mm" respectively. Is Google doing some sort of unit conversion/roundup for display?
>> - Trophic level on EoL is "carnivore", but Google displays "Carnivorous"
>> etc
>>
>> Or am I looking at the wrong source for the markup?
>>
>> Cheers,
>> Melanie
>>
>> On 10/11/2017 15:17, Franck Michel wrote:
>>
>> Dear all,
>>
>> I've just joined the Bioschemas.org <http://bioschemas.org/> community following some discussions I had with Alasdair Gray whom I met at ISWC in Vienna, and I'd like to start a new discussion thread.
>>
>> So, just to start, a few words about me. I'm a CNRS research engineer; I work at the I3S laboratory in France, in particular with the Wimmics research team led by Fabien Gandon. I'm currently involved in some activities related to the publication of taxonomic information as Linked Data [1]. In this context, I've met the Biodiversity Information Standards community (TDWG) that is increasingly considering SW standards, LD publication and web page markup. This is a domain where, I think, it would be relevant for Bioschemas.org <http://bioschemas.org/> to get involved.
>>
>> There exist lots of web portals reporting observations, traits and other data about all sorts of living organisms.
>> Encyclopedia of Life <http://eol.org/> (EoL) and the Global Biodiversity Information Facility <https://www.gbif.org/> (GBIF) are some of the best known. Markup questions are actively considered in this field; for instance, EoL web pages embed schema.org-based JSON-LD descriptions that Google leverages to enrich their snippets: e.g. if you google beluga <https://www.google.fr/search?dcr=0&ei=ml74WajPMMzWUabjqvAF&q=beluga&oq=beluga&gs_l=psy-ab.3...19519.20929.0.20945.6.3.0.0.0.0.93.93.1.1.0....0...1.1.64.psy-ab..5.1.92...0j0i131k1.0.AGNziTItYzc> you will see 'Encyclopedia of Life' mentions in the snippet providing average weight and size data. For now, this seems to be an "individual" initiative between EoL and Google/schema.org, but it would make sense if this was part of a broader reflection led by Bioschemas.org <http://bioschemas.org/>.
>>
>> My opinion is that fostering the use of common markup by these portals could be very effective in helping the biodiversity community to discover information and figure out new data integration scenarios. Within Bioschemas.org <http://bioschemas.org/>, we could define profiles to account for biodiversity-related information. Taxonomic registers are used as the backbone of many web portals, apps and databases related to biodiversity, agronomy and agriculture. For instance, EoL and GBIF both rely on the Catalogue of Life <http://www.catalogueoflife.org/> taxonomy. Therefore, we could start with the definition of a profile to describe a taxon and its related scientific and vernacular names. Then, this could be extended with the representation of traits (characteristics of biological organisms), observations, occurrence data, conservation status (e.g. endangered) etc. There already exist vocabularies for such data, such as the well-adopted Darwin Core terms.
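
To make the proposed starting point concrete, here is a minimal Python sketch of what a taxon description with scientific and vernacular names might look like as JSON-LD. It is not the Bioschemas Taxon profile, which had not been defined at the time of this thread. The `@type` value is a placeholder borrowed from the BioChemEntity discussion elsewhere in the thread, the page URL is a made-up example.org address, and the helper name is hypothetical. The three `dwc:` properties are Darwin Core terms.

```python
import json


def taxon_jsonld(page_url, scientific_name, vernacular_names, rank):
    """Build a JSON-LD description of a taxon using Darwin Core terms for the names."""
    return {
        "@context": {
            "@vocab": "http://schema.org/",
            "dwc": "http://rs.tdwg.org/dwc/terms/",
        },
        "@type": "BioChemEntity",  # assumption: placeholder type, not a settled choice
        "@id": page_url,
        "name": scientific_name,
        "dwc:scientificName": scientific_name,
        "dwc:taxonRank": rank,
        "dwc:vernacularName": list(vernacular_names),
    }


# Hypothetical page URL; the species is the beluga used as an example in the thread.
doc = taxon_jsonld(
    "http://example.org/taxon/beluga",
    "Delphinapterus leucas",
    ["beluga", "white whale"],
    "species",
)
print(json.dumps(doc, indent=2))
```

Traits, observations and conservation status would need further properties on top of this, which is the kind of question a profile definition would have to settle.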
>>
>> As a quick example, consider the web page describing the common dolphin on the web site of the French Museum of Natural History: https://inpn.mnhn.fr/espece/cd_nom/60878?lg=en. This page could come with a JSON-LD description looking like this: https://github.com/frmichel/taxref-ld/blob/master/bioschemas-org-example.json
>> This example is naive and very succinct, and there are lots of things to discuss and decide. Besides, I only registered on the mailing list yesterday, so it may not fit with good practices that you have already agreed upon. Sorry if this is the case. Nevertheless, my point is basically to bootstrap the discussion and see if the community is willing to endorse this initiative. If this is the case, we should probably involve people from the biodiversity community: Darwin Core experts, EoL/GBIF representatives etc. But that will come in time.
>>
>> I look forward to further discussions.
>> Regards,
>> Franck.
>>
>> [1] Michel F., Gargominy O., Tercerie S. &
>>
--
http://carlboettiger.info
Received on Wednesday, 20 June 2018 22:06:53 UTC