- From: Justin Clark-Casey <justinccdev@gmail.com>
- Date: Fri, 20 Oct 2017 19:55:31 +0100
- To: Sarala Wimalaratne <sarala@ebi.ac.uk>
- Cc: public-bioschemas@w3.org
- Message-ID: <CAME9NR9QOuOLhQDhZCMxzTMatpatK7HgykrRzY_csoybK2Mcwg@mail.gmail.com>
Hi Sarala,

Yes, I'm modularizing the code as I go along, with a view to making it reusable by other projects. I'll keep the group posted as I make it a more complete crawler of DataCatalog and then DataSet.

Regards,
Justin

On Fri, Oct 20, 2017 at 10:33 AM, Sarala Wimalaratne <sarala@ebi.ac.uk> wrote:

> Hi Justin,
>
> This is great. I would like to see whether I can integrate your crawler
> within identifiers.org for DataCatalog and DataSets. Keep me posted...
>
> Regards,
>
> Sarala M Wimalaratne, B.Eng. PhD
> Project Lead
> European Bioinformatics Institute (EMBL-EBI)
> European Molecular Biology Laboratory
> Wellcome Trust Genome Campus
> Hinxton
> Cambridge CB10 1SD UK
>
> On 19 Oct 2017, at 12:31, Justin Clark-Casey <justinccdev@gmail.com>
> wrote:
>
> Hi all,
>
> Following on from the Bioschemas adoption meeting, I'm continuing to work
> on the extremely alpha Buzzbang Bioschemas crawler and frontend when I can
> (renamed from BsBang, after Alistair pointed out the connotations of 'bs'
> :)).
>
> You can play with the current search engine by going to
> http://buzzbang.science
>
> In this release, I decided to concentrate on indexing DataCatalog (this is
> extremely primitive as of yet, only recording the name, url, description
> and keywords properties). If you go to buzzbang.science and search for
> terms such as 'data' or 'registry' you'll get some results.
>
> Currently, I'm manually adding URLs - you can see the small list at [1]. I
> added those that have DataCatalog JSON-LD embedded that I had in my notes,
> such as identifiers.org and fairsharing.org. Down the road, users will be
> able to submit URLs for crawling directly on the website, but for now,
> please contact me, raise a GitHub issue [2] or submit a pull request if
> there's a URL I can add.
>
> Next, I plan to crawl the rest of DataCatalog, esp. embedded DataSets, and
> think about how that information can help improve simple search.
>
> All feature suggestions or pull requests welcome on the GitHub crawler [2]
> and search frontend [3] projects.
>
> [1] https://github.com/justinccdev/bsbang-crawler/blob/master/conf/default-targets.txt
> [2] https://github.com/justinccdev/bsbang-crawler
> [3] https://github.com/justinccdev/bsbang-frontend
>
> Cheers,
>
> --
> Justin Clark-Casey (@justincc)
> Research Software Architect
> Micklem Lab, University of Cambridge
Received on Friday, 20 October 2017 18:55:57 UTC