Re: Buzzbang crawler and search release 0.0.2 now available

I have a slight preference for ShEx personally, but the most official in W3C
terms is SHACL.  Anyone else have a view?

On 20 Oct 2017 20:08, "Justin Clark-Casey" <justinccdev@gmail.com> wrote:

> Thanks Dan.  Yes, I need to look into SHACL/ShEx - I only have a passing
> acquaintance with them at the moment.  Would you recommend either one over
> the other?
>
> Regards,
>
> -- Justin
>
> On Thu, Oct 19, 2017 at 3:20 PM, Dan Brickley <danbri@danbri.org> wrote:
>
>> This sounds great! It would be interesting to try to write down the
>> specific data patterns you're extracting, by using W3C SHACL or ShEx shape
>> markup. I will be attempting the same for Google...
>>
>> Dan
>>
>> On 19 Oct 2017 12:37, "Justin Clark-Casey" <justinccdev@gmail.com> wrote:
>>
>>> Hi all,
>>>
>>> Following on from the Bioschemas adoption meeting, I'm continuing to
>>> work on the extremely alpha Buzzbang Bioschemas crawler and frontend when I
>>> can (renamed from BsBang, after Alistair pointed out the connotations of
>>> 'bs' :)).
>>>
>>> You can play with the current search engine by going to
>>> http://buzzbang.science
>>>
>>> In this release, I decided to concentrate on indexing DataCatalog (this
>>> is extremely primitive as of yet, only recording the name, url, description
>>> and keywords properties).  If you go to buzzbang.science and search for
>>> terms such as 'data' or 'registry' you'll get some results.
>>>
>>> Currently, I'm manually adding URLs - you can see the small list at [1].
>>> I added those that have DataCatalog JSON-LD embedded that I had in my
>>> notes, such as identifiers.org and fairsharing.org. Down the road,
>>> users will be able to submit URLs for crawling directly on the website, but
>>> for now, please contact me, raise a GitHub issue [2] or submit a pull
>>> request if there's a URL I can add.
>>>
>>> Next, I plan to crawl the rest of DataCatalog, esp. embedded Datasets,
>>> and think about how that information can help improve simple search.
>>>
>>> All feature suggestions or pull requests welcome on the GitHub crawler
>>> [2] and search frontend [3] projects.
>>>
>>> [1] https://github.com/justinccdev/bsbang-crawler/blob/master/conf/default-targets.txt
>>> [2] https://github.com/justinccdev/bsbang-crawler
>>> [3] https://github.com/justinccdev/bsbang-frontend
>>>
>>> Cheers,
>>>
>>> --
>>> Justin Clark-Casey (@justincc)
>>> Research Software Architect
>>> Micklem Lab, University of Cambridge
>>>
>>
>
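
For readers of this thread, the DataCatalog indexing described in the quoted
announcement (recording only name, url, description and keywords from embedded
JSON-LD) could be sketched roughly as below. This is an illustrative guess, not
Buzzbang's actual code: the names JsonLdScriptParser, extract_data_catalogs and
INDEXED_PROPERTIES are hypothetical, and it assumes the markup sits in
<script type="application/ld+json"> elements with a plain "DataCatalog" @type.

```python
import json
from html.parser import HTMLParser

# The four properties the announcement says are recorded per DataCatalog.
INDEXED_PROPERTIES = ("name", "url", "description", "keywords")


class JsonLdScriptParser(HTMLParser):
    """Collects the text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def extract_data_catalogs(html):
    """Return one dict per top-level DataCatalog item found in the page."""
    parser = JsonLdScriptParser()
    parser.feed(html)
    parser.close()
    records = []
    for block in parser.blocks:
        try:
            doc = json.loads(block)
        except json.JSONDecodeError:
            continue  # skip malformed markup rather than abort the crawl
        items = doc if isinstance(doc, list) else [doc]
        for item in items:
            if isinstance(item, dict) and item.get("@type") == "DataCatalog":
                # Missing properties are stored as None.
                records.append({p: item.get(p) for p in INDEXED_PROPERTIES})
    return records
```

Calling extract_data_catalogs(page_html) on a fetched page returns a list of
small dicts ready for indexing. A SHACL or ShEx shape, as suggested above, would
state the same four-property pattern declaratively instead of in code.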

Received on Friday, 20 October 2017 21:10:31 UTC