Re: Abstract for oral presentation accepted at TDWG 2020

From: Gray, Alasdair J G <A.J.G.Gray@hw.ac.uk>
Date: Wed, 30 Sep 2020 08:37:20 +0000
To: Franck Michel <franck.michel@cnrs.fr>, Dan Brickley <danbri@danbri.org>
CC: "public-bioschemas@w3.org" <public-bioschemas@w3.org>
Message-ID: <D7694EFD-FD62-4B7A-8B99-DE715B23BF3A@hw.ac.uk>

BMUSE is a scraper that we are actively developing to extract Bioschemas markup from websites. It can crawl both static sites and single-page applications. As input it takes either a list of URLs or a sitemap (release just being made).

We are about to start a few directed crawls using BMUSE. The first will target data pertinent to COVID-19 and make it available at a single point for further processing.

With regard to buzzbang, that was an earlier effort at crawling which has since been abandoned. BMUSE is taking the ideas of buzzbang and expanding on them. Hopefully we can offer some sort of graphical explorer over the crawled data at some point in the future.

Best regards


Alasdair J G Gray
Associate Professor in Computer Science,
School of Mathematical and Computer Sciences
Heriot-Watt University, Edinburgh, UK.

Email: A.J.G.Gray@hw.ac.uk
Web: http://www.macs.hw.ac.uk/~ajg33

ORCID: http://orcid.org/0000-0002-5711-4872

Office: Earl Mountbatten Building 1.39
Twitter: @gray_alasdair

Heriot-Watt is a global University, as a result my working hours may not be your working hours. Do not feel pressure to reply to this email outside your working hours.

To arrange a meeting: https://doodle.com/mm/alasdairgray/book-a-time

From: "franck.michel@cnrs.fr" <franck.michel@cnrs.fr>
Date: Tuesday, 29 September 2020 at 17:11
To: "danbri@danbri.org" <danbri@danbri.org>
Cc: "public-bioschemas@w3.org" <public-bioschemas@w3.org>
Subject: Re: Abstract for oral presentation accepted at TDWG 2020
Resent from: "public-bioschemas@w3.org" <public-bioschemas@w3.org>
Resent date: Tuesday, 29 September 2020 at 17:10

Hi Dan,

I don't know about the XML sitemaps; I'll ask the MNHN webmasters.

As for a tool to crawl the markup, last summer I tried BMUSE<https://github.com/HW-SWeL/BMUSE>, which scrapes pages given by URL. It may also have an option to handle sitemaps; that remains to be checked.
In any case, it works fine on one MNHN page and returns an N-Triples file.

On 29/09/2020 at 16:59, Dan Brickley wrote:

This is great - congratulations! Does anyone from the Bioschemas community have a crawler that could be applied to https://inpn.mnhn.fr/accueil/index?lg=en ? Do you publish XML sitemaps that could make it easier for people to find and crawl this data?



On Tue, 29 Sep 2020 at 15:43, Franck Michel <franck.michel@cnrs.fr> wrote:
Dear all,

As you may know, TDWG 2020<https://www.tdwg.org/conferences/2020/> was rescheduled as an online conference taking place over two weeks: working sessions (Sep 21-25) and dissemination and sharing sessions (Oct 19-23).

We have an abstract accepted for oral presentation in the second week. It is called: "Unleash the Potential of your Website! 180,000 webpages from the French Natural History Museum marked up with Bioschemas/Schema.org biodiversity types" => https://doi.org/10.3897/biss.4.59046

Thanks to all for your contributions to this work.


-------- Forwarded Message --------
Subject: [Biodiversity Information Science and Standards] Submission #59046: Manuscript Published
Resent-Date: Tue, 29 Sep 2020 13:33:09 +0200 (CEST)
Resent-From:
Date: Tue, 29 Sep 2020 14:33:04 +0300
From: Biodiversity Information Science and Standards <biss@pensoft.net>
To:

Dear Franck Michel:

We are pleased to inform you that your paper #59046 "Unleash the Potential of your Website! 180,000 webpages from the French Natural History Museum marked up with Bioschemas/Schema.org biodiversity types<https://biss.pensoft.net/article/59046/>" was published in Biodiversity Information Science and Standards, doi: 10.3897/biss.4.59046. Thank you for choosing Biodiversity Information Science and Standards as a publication venue for your work!

We suggest that you help us increase the visibility of your study and thereby boost its citations and impact by sharing it on social media (e.g. Twitter, Facebook, Mendeley), ideally using both your own and your institution’s channels. Information and suggestions on how to promote your work to the international scientific audience and wider public can be found on our website<https://biss.pensoft.net/about#ScienceCommunication>.

You may also order high-quality full-color reprints of your article through our order form<https://goto.arphahub.com/JoZoQ63meQLy>.

To keep yourself updated about research published in your scientific field, you can set up email alerts for Biodiversity Information Science and Standards via this link<https://goto.arphahub.com/1jKwlrASAWvL> or through your user profile. You can change the research topics, journals or frequency of these email alerts anytime.

Biodiversity Information Science and Standards Editorial office
Pensoft Publishers<https://pensoft.net>
ARPHA Platform<https://arphahub.com/>
Biodiversity Information Science and Standards on Twitter<https://twitter.com/BISS_Journal> and Facebook<https://www.facebook.com/BISSJournal>



Received on Wednesday, 30 September 2020 08:37:38 UTC
