Re: [Urgent] Proposed changes to Bioschemas

Thanks to those who have already reviewed and commented on the document.

I have added an addendum to the document which includes the proposal for the DataRecord type. Sorry for the omission of this in the original version of the document.

Best regards

Alasdair

On 25 Oct 2018, at 16:27, Gray, Alasdair J G <A.J.G.Gray@hw.ac.uk> wrote:


Dear Bioschemas Community

It has been a little over a year since we had the last community face-to-face, during which time we have achieved a lot, with 44 resources publishing markup on over 6 million web pages [1]. Just over a month ago we also saw the launch of Google's Dataset Search [2] through which we should see some of the promised benefit from the Bioschemas markup.

In the last few weeks, there have been several active discussions on GitHub issues (210, 215, 217, 218, 220, 221, 222, 223) [3] relating to the extension of schema.org types and properties for life sciences. Bioschemas is intended as a simple markup mechanism that should be easy for providers to implement and for tools to consume. In practice we have made things hard.

Based upon our experiences, including those from running several tutorials with groups outside of the Bioschemas “family”, we have identified two main problems with our current community approach of using existing life sciences ontology classes.

  *   The first problem is that generic consumers of the markup, e.g. search engines such as Google, will not understand the life sciences ontology classes; these services only understand types and properties in the schema.org vocabulary and this will not change. Consequently, under the current approach, these generic services will not be able to distinguish between a BioChemEntity that is a Protein or a Gene; they will just understand them all as BioChemEntity. Thus the resources (e.g. individual databases) will gain no benefit from these services (e.g. Google) consuming the markup.

  *   The second problem is that the choice of classes and terms from specific life sciences ontologies is not compatible with the nature of the schema.org vocabulary. This leads to logical inconsistencies for services that consume the markup.
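
As a rough illustration of the first problem, the following Python sketch models a generic consumer that recognises only a fixed set of schema.org types. It is not taken from any real search engine: the type set, the `recognised_type` helper, the use of `additionalType` to carry the ontology class, and the example ontology IRIs are all assumptions made for illustration.

```python
# Hypothetical subset of the types a generic consumer understands.
# Assumed for illustration; not the vocabulary of any real service.
KNOWN_SCHEMA_ORG_TYPES = {"Thing", "Dataset", "BioChemEntity"}


def recognised_type(markup: dict) -> str:
    """Return the type a generic consumer would see for one piece of markup."""
    declared = markup.get("@type", "Thing")
    # An additionalType pointing at an external ontology is not interpreted,
    # so only the declared schema.org type survives.
    return declared if declared in KNOWN_SCHEMA_ORG_TYPES else "Thing"


# Two records typed as BioChemEntity, distinguished only by an ontology class
# (illustrative IRIs; the markup shape is an assumption).
protein = {
    "@context": "http://schema.org",
    "@type": "BioChemEntity",
    "additionalType": "http://purl.obolibrary.org/obo/PR_000000001",
    "name": "example protein",
}
gene = {
    "@context": "http://schema.org",
    "@type": "BioChemEntity",
    "additionalType": "http://purl.obolibrary.org/obo/SO_0000704",
    "name": "example gene",
}

print(recognised_type(protein), recognised_type(gene))
```

Under these assumptions both records come out as the same type, which is the loss of distinction described in the first point.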

To overcome these challenges, we propose that a limited number of new types and properties should be added to schema.org as hosted extensions. These have been developed in discussion with Dan Brickley (chair of the schema.org community group) and will serve as bridging terms between the generic schema.org vocabulary and the more specific life sciences ontologies. We anticipate that there will be further types proposed in the future, e.g. chemical.

The proposal is available in the following Google document. Only comment permissions have been granted so that the original proposal is unchanged.

https://docs.google.com/document/d/1Cw9K25N1l-Lbet1cahJuFtYgNKiF76apGcCqJPSeuZg/edit?usp=sharing


In order that these changes can be in place by the BioHackathon, we request that any comments on these proposals be made by 1 November.

Best regards

Alasdair, Leyla, Sarala, Nick, Carole, and Rafa

[1] http://bioschemas.org/liveDeploys/


[2] https://toolbox.google.com/datasetsearch


[3] https://github.com/BioSchemas/specifications/issues/


--
Alasdair J G Gray
Associate Professor in Computer Science,
School of Mathematical and Computer Sciences
Heriot-Watt University, Edinburgh, UK.

Email: A.J.G.Gray@hw.ac.uk
Web: http://www.macs.hw.ac.uk/~ajg33

ORCID: http://orcid.org/0000-0002-5711-4872

Office: Earl Mountbatten Building 1.39
Twitter: @gray_alasdair

Received on Friday, 26 October 2018 14:52:23 UTC