Re: Buzzbang crawler and search release 0.0.2 now available

Hi Justin,

This is great. I would like to see whether I can integrate your crawler within identifiers.org for DataCatalog and DataSets. Keep me posted...

Regards,

Sarala M Wimalaratne, B.Eng. PhD
Project Lead
European Bioinformatics Institute (EMBL-EBI)
European Molecular Biology Laboratory
Wellcome Trust Genome Campus
Hinxton
Cambridge CB10 1SD UK

> On 19 Oct 2017, at 12:31, Justin Clark-Casey <justinccdev@gmail.com> wrote:
> 
> Hi all,
> 
> Following on from the Bioschemas adoption meeting, I'm continuing to work on the extremely alpha Buzzbang Bioschemas crawler and frontend when I can (renamed from BsBang, after Alistair pointed out the connotations of 'bs' :)).
> 
> You can play with the current search engine by going to http://buzzbang.science
> 
> In this release, I decided to concentrate on indexing DataCatalog (this is extremely primitive as yet, only recording the name, url, description and keywords properties). If you go to buzzbang.science and search for terms such as 'data' or 'registry' you'll get some results.
> 
> Currently, I'm manually adding URLs - you can see the small list at [1]. I added those that have DataCatalog JSON-LD embedded that I had in my notes, such as identifiers.org and fairsharing.org. Down the road, users will be able to submit URLs for crawling directly on the website, but for now, please contact me, raise a GitHub issue [2] or submit a pull request if there's a URL I can add.
> 
> Next, I plan to crawl the rest of DataCatalog, especially embedded DataSets, and think about how that information can help improve simple search.
> 
> All feature suggestions or pull requests welcome on the GitHub crawler [2] and search frontend [3] projects.
> 
> [1] https://github.com/justinccdev/bsbang-crawler/blob/master/conf/default-targets.txt
> [2] https://github.com/justinccdev/bsbang-crawler
> [3] https://github.com/justinccdev/bsbang-frontend
> 
> Cheers,
> 
> -- 
> Justin Clark-Casey (@justincc)
> Research Software Architect
> Micklem Lab, University of Cambridge
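
For readers unfamiliar with the indexing step described in the quoted message, here is a rough sketch of pulling the name, url, description and keywords properties out of DataCatalog JSON-LD embedded in a page. It is not taken from the Buzzbang code: the names JsonLdExtractor, extract_datacatalogs and SAMPLE_HTML are hypothetical, the sample page is made up, and it assumes the markup sits in a script element of type application/ld+json.

```python
import json
from html.parser import HTMLParser

# Properties mentioned in the announcement; everything else is ignored.
FIELDS = ("name", "url", "description", "keywords")


class JsonLdExtractor(HTMLParser):
    """Collects the raw text of every <script type="application/ld+json">."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            self.blocks.append("".join(self._buffer))


def extract_datacatalogs(html):
    """Return one dict per top-level DataCatalog item found in the page."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    records = []
    for block in parser.blocks:
        try:
            doc = json.loads(block)
        except json.JSONDecodeError:
            continue  # skip malformed markup rather than abort the crawl
        items = doc if isinstance(doc, list) else [doc]
        for item in items:
            if not isinstance(item, dict):
                continue
            types = item.get("@type")
            types = types if isinstance(types, list) else [types]
            if "DataCatalog" in types:
                records.append({field: item.get(field) for field in FIELDS})
    return records


# Made-up page for illustration only.
SAMPLE_HTML = """<html><head>
<script type="application/ld+json">
{"@context": "http://schema.org", "@type": "DataCatalog",
 "name": "Example Registry", "url": "http://example.org/",
 "description": "A made-up data registry.", "keywords": "data, registry"}
</script></head><body></body></html>"""

if __name__ == "__main__":
    print(extract_datacatalogs(SAMPLE_HTML))
```

This only looks at top-level items; nested items such as embedded DataSets would need a recursive walk.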

Received on Friday, 20 October 2017 09:33:55 UTC