Re: Buzzbang crawler and search release 0.0.2 now available

I used ShEx in the past but that was before SHACL. I do like the intuitive concise notation.

However my postdoc has been looking at both and finding the SHACL examples more accessible. Also the SHACL framework seemed more up to date.

For most purposes there probably isn't much in it. It would be good to know what can be done in one but not the other.

Alasdair

Alasdair J G Gray
Fellow of the Higher Education Academy
Assistant Professor in Computer Science
Heriot-Watt University, Edinburgh

http://www.macs.hw.ac.uk/~ajg33

________________________________
From: Andra Waagmeester <andra@micelio.be>
Sent: Saturday, October 21, 2017 7:18:14 AM
To: Dan Brickley
Cc: Justin Clark-Casey; public-bioschemas@w3.org
Subject: Re: Buzzbang crawler and search release 0.0.2 now available

I also have a preference for ShEx. The syntax feels more intuitive. However, just recently a book describing and comparing both ShEx and SHACL was released: http://www.morganclaypoolpublishers.com/catalog_Orig/product_info.php?products_id=1091

Personally, I like the regex style of expressing cardinalities and the possibility to combine different shapes for similar concepts in ShEx.

On Fri, Oct 20, 2017 at 11:09 PM, Dan Brickley <danbri@danbri.org> wrote:
I have a slight preference for Shex personally but the most official in W3C terms is Shacl.  Anyone else have a view?

On 20 Oct 2017 20:08, "Justin Clark-Casey" <justinccdev@gmail.com> wrote:
Thanks Dan.  Yes, I need to look into SHACL/ShEx - I only have a passing acquaintance with them at the moment.  Would you recommend either one over the other?

Regards,

-- Justin

On Thu, Oct 19, 2017 at 3:20 PM, Dan Brickley <danbri@danbri.org> wrote:
This sounds great! It would be interesting to try to write down the specific data patterns you're extracting, by using W3C SHACL or SHEX shape markup. I will be attempting the same for Google...

Dan

On 19 Oct 2017 12:37, "Justin Clark-Casey" <justinccdev@gmail.com> wrote:
Hi all,

Following on from the Bioschemas adoption meeting, I'm continuing to work on the extremely alpha Buzzbang Bioschemas crawler and frontend when I can (renamed from BsBang, after Alistair pointed out the connotations of 'bs' :)).

You can play with the current search engine by going to http://buzzbang.science

In this release, I decided to concentrate on indexing DataCatalog (this is extremely primitive as of yet, only recording the name, url, description and keywords properties).  If you go to buzzbang.science and search for terms such as 'data' or 'registry' you'll get some results.
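
As a rough illustration of recording just those four properties, here is a minimal Python sketch. It is not the actual Buzzbang code: the function name, the constant and the sample markup are all assumptions made for the example, and it presumes the JSON-LD has already been pulled out of the page.

```python
import json

# Properties named in the message above; everything else is ignored.
INDEXED_PROPERTIES = ("name", "url", "description", "keywords")


def extract_datacatalog_fields(jsonld_text):
    """Return the indexed properties of each top-level DataCatalog item.

    Hypothetical helper: assumes the JSON-LD text was already extracted
    from the page's <script type="application/ld+json"> element.
    """
    data = json.loads(jsonld_text)
    # A JSON-LD document may be a single object or a list of objects.
    items = data if isinstance(data, list) else [data]
    records = []
    for item in items:
        if not isinstance(item, dict):
            continue
        # "@type" may be a single string or a list of strings.
        types = item.get("@type", [])
        if isinstance(types, str):
            types = [types]
        if "DataCatalog" not in types:
            continue
        # Keep only the indexed properties that are actually present.
        records.append({p: item[p] for p in INDEXED_PROPERTIES if p in item})
    return records


# Made-up sample markup, for illustration only.
sample = """
{
  "@context": "http://schema.org",
  "@type": "DataCatalog",
  "name": "Example Registry",
  "url": "http://example.org",
  "description": "A made-up registry of data resources.",
  "keywords": "registry, data",
  "provider": {"@type": "Organization", "name": "Example Org"}
}
"""

print(extract_datacatalog_fields(sample))
```

The sketch only looks at top-level items, so nested or embedded entries (such as datasets inside a catalog) would need a recursive walk.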

Currently, I'm manually adding URLs - you can see the small list at [1]. I added those that have DataCatalog JSON-LD embedded that I had in my notes, such as identifiers.org and fairsharing.org. Down the road, users will be able to submit URLs for crawling directly on the website, but for now, please contact me, raise a GitHub issue [2] or submit a pull request if there's a URL I can add.

Next, I plan to crawl the rest of DataCatalog, esp. embedded DataSets and think about how that information can help improve simple search.

All feature suggestions or pull requests are welcome on the GitHub crawler [2] and search frontend [3] projects.

[1] https://github.com/justinccdev/bsbang-crawler/blob/master/conf/default-targets.txt
[2] https://github.com/justinccdev/bsbang-crawler
[3] https://github.com/justinccdev/bsbang-frontend

Cheers,

--
Justin Clark-Casey (@justincc)
Research Software Architect
Micklem Lab, University of Cambridge



Received on Saturday, 21 October 2017 19:42:39 UTC