- From: Justin Clark-Casey <justinccdev@gmail.com>
- Date: Wed, 30 Sep 2020 11:17:24 +0100
- To: "Gray, Alasdair J G" <A.J.G.Gray@hw.ac.uk>
- Cc: Franck Michel <franck.michel@cnrs.fr>, Dan Brickley <danbri@danbri.org>, "public-bioschemas@w3.org" <public-bioschemas@w3.org>
- Message-ID: <CAME9NR8PqL4=f6QOcLtS7rkNrSR_4hysJ6oHi3OWVmQavD8Cmw@mail.gmail.com>
No apologies necessary! I'm always a bit of a diva when it comes to previous projects. I'm hoping to take a look at BMUSE myself when I can.

All the best,
Justin

On Wed, 30 Sep 2020 at 11:05, Gray, Alasdair J G <A.J.G.Gray@hw.ac.uk> wrote:

> Justin,
>
> Apologies for my clumsy language.
>
> Alasdair
>
> --
> Alasdair J G Gray
> Associate Professor in Computer Science,
> School of Mathematical and Computer Sciences
> Heriot-Watt University, Edinburgh, UK.
>
> Email: A.J.G.Gray@hw.ac.uk
> Web: http://www.macs.hw.ac.uk/~ajg33
> ORCID: http://orcid.org/0000-0002-5711-4872
> Office: Earl Mountbatten Building 1.39
> Twitter: @gray_alasdair
>
> Heriot-Watt is a global University, as a result my working hours may not
> be your working hours. Do not feel pressure to reply to this email outside
> your working hours.
>
> To arrange a meeting: https://doodle.com/mm/alasdairgray/book-a-time
>
> *From:* "justinccdev@gmail.com" <justinccdev@gmail.com>
> *Date:* Wednesday, 30 September 2020 at 10:27
> *To:* Alasdair Gray <A.J.G.Gray@hw.ac.uk>
> *Cc:* "franck.michel@cnrs.fr" <franck.michel@cnrs.fr>, "danbri@danbri.org" <danbri@danbri.org>, "public-bioschemas@w3.org" <public-bioschemas@w3.org>
> *Subject:* Re: Abstract for oral presentation accepted at TDWG 2020
>
> Caution: This email originated from a sender outside Heriot-Watt
> University. Do not follow links or open attachments if you doubt the
> authenticity of the sender or the content.
>
> "Abandoned" is a bit harsh, Alasdair :). I'm going to say "wound down" as
> both myself, Ankit and Ricardo had to go on to other things.
> And there was a nasty second-system effect: I was way too ambitious with its
> second iteration, which unfortunately didn't leave it in an operational state
> (unless you know differently, Ankit).
>
> But yeah, BMUSE is definitely the thing to look at for checking if the
> markup could be crawled.
>
> Best,
>
> --
> Justin Clark-Casey
> EOSC Programme Manager
> EMBL-EBI
>
> On Wed, 30 Sep 2020 at 09:38, Gray, Alasdair J G <A.J.G.Gray@hw.ac.uk> wrote:
>
> Hi
>
> BMUSE is a scraper that we are actively developing to scrape
> Bioschemas markup from sites. It is capable of crawling both static and
> single-page application sites. As input it takes either a list of URLs or a
> sitemap (release just being made).
>
> We are about to start a few directed crawls using BMUSE. The first
> will be targeted at gathering data pertinent to COVID-19 and making
> that available at a single point for further processing.
>
> With regard to buzzbang, that was an earlier effort at crawling which has
> since been abandoned. BMUSE is taking the ideas of buzzbang and expanding
> on them. Hopefully we can offer some sort of graphical explorer over the
> crawled data at some point in the future.
>
> Best regards
>
> Alasdair
>
> *From:* "franck.michel@cnrs.fr" <franck.michel@cnrs.fr>
> *Date:* Tuesday, 29 September 2020 at 17:11
> *To:* "danbri@danbri.org" <danbri@danbri.org>
> *Cc:* "public-bioschemas@w3.org" <public-bioschemas@w3.org>
> *Subject:* Re: Abstract for oral presentation accepted at TDWG 2020
> *Resent from:* "public-bioschemas@w3.org" <public-bioschemas@w3.org>
> *Resent date:* Tuesday, 29 September 2020 at 17:10
>
> Hi Dan,
>
> About the XML sitemaps, I don't know; I'll ask the MNHN webmasters.
>
> About a tool to crawl the markup: last summer I tried BMUSE
> <https://github.com/HW-SWeL/BMUSE>, which scrapes pages given by URL, but
> it may have an option to cope with sitemaps. To be checked.
> Anyway, for one page of the MNHN it works fine and returns an N-Triples file.
>
> Rgds,
> Franck.
>
> On 29/09/2020 at 16:59, Dan Brickley wrote:
>
> This is great - congratulations! Does anyone from the Bioschemas community
> have a crawler that could be applied to
> https://inpn.mnhn.fr/accueil/index?lg=en ? Do you publish XML sitemaps
> that could make it easier for people to find and crawl this data?
>
> cheers,
>
> Dan
>
> On Tue, 29 Sep 2020 at 15:43, Franck Michel <franck.michel@cnrs.fr> wrote:
>
> Dear all,
>
> As you may know, TDWG 2020 <https://www.tdwg.org/conferences/2020/> was
> rescheduled as an online conference taking place over 2 weeks: working
> sessions (Sep 21-25) and dissemination and sharing sessions (Oct 19-23).
>
> We have an abstract accepted for oral presentation in the second week. It
> is called: "Unleash the Potential of your Website! 180,000 webpages from
> the French Natural History Museum marked up with Bioschemas/Schema.org
> biodiversity types" => https://doi.org/10.3897/biss.4.59046
>
> Thanks to all for your contributions to this work.
>
> Franck.
>
> -------- Forwarded message --------
> *Subject:* [Biodiversity Information Science and Standards] Submission #59046: Manuscript Published
> *Resent date:* Tue, 29 Sep 2020 13:33:09 +0200 (CEST)
> *Resent from:* franck.michel@cnrs.fr
> *Date:* Tue, 29 Sep 2020 14:33:04 +0300
> *From:* Biodiversity Information Science and Standards <biss@pensoft.net>
> *To:* franck.michel@cnrs.fr
>
> Dear Franck Michel:
>
> We are pleased to inform you that your paper #59046 "Unleash the
> Potential of your Website! 180,000 webpages from the French Natural History
> Museum marked up with Bioschemas/Schema.org biodiversity types
> <https://biss.pensoft.net/article/59046/>" was published in Biodiversity
> Information Science and Standards, doi: 10.3897/biss.4.59046. Thank you for
> choosing Biodiversity Information Science and Standards as a publication
> venue for your work!
>
> We suggest that you help us increase the visibility of your study and
> thereby boost its citations and impact by sharing it on social media (e.g.
> Twitter, Facebook, Mendeley), ideally using both your own and your
> institution's channels. Information and suggestions on how to promote your
> work to the international scientific audience and wider public can be found
> on our website <https://biss.pensoft.net/about#ScienceCommunication>.
>
> You may also order high-quality full-color reprints of your article
> through our order form <https://goto.arphahub.com/JoZoQ63meQLy>.
>
> To keep yourself updated about research published in your scientific
> field, you can set up email alerts for Biodiversity Information Science and
> Standards via this link <https://goto.arphahub.com/1jKwlrASAWvL> or
> through your user profile. You can change the research topics, journals or
> frequency of these email alerts anytime.
>
> Biodiversity Information Science and Standards Editorial office
> ___________________
> Pensoft Publishers <https://pensoft.net>
> ARPHA Platform <https://arphahub.com/>
> Biodiversity Information Science and Standards on Twitter
> <https://twitter.com/BISS_Journal> and Facebook
> <https://www.facebook.com/BISSJournal>
>
> PLEASE DO NOT FORWARD THIS EMAIL, IT CONTAINS YOUR PERSONAL AUTO LOGIN
> LINK.
>
> ------------------------------
>
> Founded in 1821, Heriot-Watt is a leader in ideas and solutions. With
> campuses and students across the entire globe we span the world, delivering
> innovation and educational excellence in business, engineering, design and
> the physical, social and life sciences. This email is generated from the
> Heriot-Watt University Group, which includes:
>
> 1. Heriot-Watt University, a Scottish charity registered under number
>    SC000278
>
> 2. Heriot-Watt Services Limited (Oriam), Scotland's national
>    performance centre for sport. Heriot-Watt Services Limited is a private
>    limited company registered in Scotland with registered number SC271030 and
>    registered office at Research & Enterprise Services Heriot-Watt University,
>    Riccarton, Edinburgh, EH14 4AS.
>
> The contents (including any attachments) are confidential. If you are not
> the intended recipient of this e-mail, any disclosure, copying,
> distribution or use of its contents is strictly prohibited, and you should
> please notify the sender immediately and then delete it (including any
> attachments) from your system.
Received on Wednesday, 30 September 2020 10:18:15 UTC