Buzzbang crawler and search release 0.0.2 now available from Justin Clark-Casey on 2017-10-19 (public-bioschemas@w3.org from October 2017)

From: Justin Clark-Casey <justinccdev@gmail.com>
Date: Thu, 19 Oct 2017 12:31:02 +0100
To: public-bioschemas@w3.org
Message-ID: <CAME9NR_wMUojpvUp7hoR8O19=V+1CprRoo6f6KR7rMYGvSj29Q@mail.gmail.com>

Hi all,

Following on from the Bioschemas adoption meeting, I'm continuing to work
on the extremely alpha Buzzbang Bioschemas crawler and frontend when I can
(renamed from BsBang, after Alistair pointed out the connotations of 'bs'
:)).

You can play with the current search engine by going to
http://buzzbang.science

In this release, I decided to concentrate on indexing DataCatalog (this is
still extremely primitive, recording only the name, url, description and
keywords properties).  If you go to buzzbang.science and search for
terms such as 'data' or 'registry' you'll get some results.
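
To illustrate the kind of extraction described above, here is a minimal
sketch in Python. It is not the Buzzbang crawler's actual code: the class
and function names (JsonLdExtractor, extract_data_catalogs) are hypothetical,
and it assumes the markup sits in <script type="application/ld+json">
blocks with DataCatalog at the top level of each block.

```python
import json
from html.parser import HTMLParser

# The four DataCatalog properties the release notes say are recorded.
FIELDS = ("name", "url", "description", "keywords")


class JsonLdExtractor(HTMLParser):
    """Collects the raw text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buf = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buf = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buf))
            self._in_jsonld = False


def extract_data_catalogs(html):
    """Return one dict per top-level DataCatalog found, limited to FIELDS."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()

    records = []
    for block in parser.blocks:
        try:
            doc = json.loads(block)
        except json.JSONDecodeError:
            continue  # skip malformed markup rather than abort the page
        items = doc if isinstance(doc, list) else [doc]
        for item in items:
            if not isinstance(item, dict):
                continue
            types = item.get("@type")
            types = types if isinstance(types, list) else [types]
            if "DataCatalog" in types:
                # Keep only the properties that are present on this item.
                records.append({f: item[f] for f in FIELDS if f in item})
    return records
```

Calling extract_data_catalogs(page_html) on a fetched page would give a
list of small dicts ready to hand to whatever index sits behind the search
box. Properties missing from a page are simply left out of the record.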

Currently, I'm manually adding URLs - you can see the small list at [1]. I
added those that have DataCatalog JSON-LD embedded that I had in my notes,
such as identifiers.org and fairsharing.org. Down the road, users will be
able to submit URLs for crawling directly on the website, but for now,
please contact me, raise a GitHub issue [2] or submit a pull request if
there's a URL I can add.

Next, I plan to crawl the rest of DataCatalog, esp. embedded Datasets, and
think about how that information can help improve simple search.
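
As a rough idea of what picking up embedded Datasets might involve, here
is a small Python sketch. It is an assumption about one possible approach,
not a description of planned Buzzbang code: the function names are
hypothetical, and it relies only on schema.org's `dataset` property of
DataCatalog, with each Dataset embedded inline as a JSON object.

```python
def _as_list(value):
    """Normalise a JSON-LD value that may be missing, single, or a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def embedded_datasets(catalog):
    """Return the Dataset objects embedded under a DataCatalog's `dataset` property.

    `catalog` is a dict parsed from JSON-LD. Entries that are only
    references (plain URL strings) or that have another type are skipped.
    """
    return [
        item
        for item in _as_list(catalog.get("dataset"))
        if isinstance(item, dict) and "Dataset" in _as_list(item.get("@type"))
    ]


# Hypothetical example markup, for illustration only.
example_catalog = {
    "@type": "DataCatalog",
    "name": "Example Registry",
    "dataset": [
        {"@type": "Dataset", "name": "Example proteins"},
        "http://example.org/datasets/2",
    ],
}
dataset_names = [d.get("name") for d in embedded_datasets(example_catalog)]
```

The names and descriptions of the embedded Datasets could then be indexed
alongside the parent DataCatalog, so that a search term matching a Dataset
also surfaces the catalog that contains it.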

All feature suggestions or pull requests welcome on the GitHub crawler [2]
and search frontend [3] projects.

[1]
https://github.com/justinccdev/bsbang-crawler/blob/master/conf/default-targets.txt
[2] https://github.com/justinccdev/bsbang-crawler
[3] https://github.com/justinccdev/bsbang-frontend

Cheers,

-- 
Justin Clark-Casey (@justincc)
Research Software Architect
Micklem Lab, University of Cambridge

Received on Thursday, 19 October 2017 11:36:50 UTC