Revising the Bioschemas Profile Development Process

Hi
(Apologies for the long email, but this is an important decision point for the community.)

For a long time, we have been looking for an alternative approach to profile development. A key aspect that we have tried to keep is the ability to edit profiles with minimal technical skills, comparable to the usability of the current GSheets. At the same time, we want to simplify the process of publishing a profile to the website and to support a machine-processable representation such as JSON-Schema, SHACL, or ShEx as the normative form of the profile.

We believe that the Data Discovery Engine<https://discovery.biothings.io/view/bioschemas/> (DDE) supports most of what we want to achieve (the whole process is outlined at the end of this email). During the recent BioHackathon we explored and tested the functionality offered by the DDE, and identified the processes needed to convert the DDE representation for display on the website, which we believe can be automated with GitHub Actions. The following slide shows some screenshots of where we have got to:
https://docs.google.com/presentation/d/1pbpgovqiTUe9fwkJchL3SJ6txm6O7QHUTuc_0k-iZxY/edit#slide=id.gfa7382eed9_0_138

From our investigation, we have found that the DDE provides:

  *   Easy UI for editing and creating profiles
     *   Existing Schema.org and Bioschemas types can be extended and revised
     *   We have seeded the registry with the current Bioschemas profiles
  *   Property names, descriptions, and expected types come directly from the Schema.org definitions
  *   Cardinality can be captured
  *   Marginality levels can be captured using a radio button (this was added recently in response to a feature request from us)
  *   Controlled vocabularies: supported in the validation editor, provided the ontology terms are in the EBI Ontology Lookup Service (OLS)

We have found the following limitations of the DDE:

  *   Additional descriptions/explanations cannot be added: the DDE team recognises this as useful, and a feature request has been submitted
  *   Examples cannot be included in the UI: it may be possible to encode them directly in the JSON-LD file by hand
  *   For new properties and types, there is no way to record a formal mapping to the term(s) that informed the design
     *   Note that we do not have this functionality in the current Bioschemas approach either
  *   Unclear which version of Schema.org is currently loaded in DDE
     *   We have raised some issues on their tracker about this
     *   Current Bioschemas approach does not capture this either
  *   Profile metadata is not visible or editable in the DDE: this would need to be hand-crafted in the JSON-LD

Ahead of the Steering Council meeting to discuss this in early December, we would like to gather community feedback on the proposed move to using the DDE as the Bioschemas profile editor, rather than the existing GSheet approach. In your opinion, are any of the limitations a showstopper for the proposed move to DDE? Are there other things/services we should be considering?

Many thanks

Alasdair (on behalf of the Steering Council)


Using DDE to Edit Bioschemas Profiles

To update an existing profile, you need to log in to the DDE with a GitHub account. You then use the Schema Playground to select the profile you are updating. This gives you a graphical user interface for editing the existing profile. Once you are happy with your edits, you save them back to the Bioschemas Specification repository on GitHub. A screenshot of the interface is available at the following link:
https://docs.google.com/document/d/1VB6qZOgtDfy_bbs-dDGyNMRyNDw5N-9JV4lekICUNQo/edit#heading=h.av3k32co4ye0
(We’ll need to supply a tutorial on how to do this.)

Required Infrastructure/Workflow

Normative forms of the profiles would be stored as JSON-Schema in the Bioschemas Specification repository<https://github.com/BioSchemas/specifications/>. The following link shows the ChemicalSubstance 0.4-RELEASE profile in the JSON-LD representation of the JSON-Schema that the DDE uses:
https://github.com/BioSchemas/specifications/blob/master/ChemicalSubstance/jsonld/ChemicalSubstance_v0.4-RELEASE.json

To use the individual profiles in the DDE, they are merged into a single file. The scripts for doing this are in the Bioschemas DDE repository<https://github.com/BioSchemas/bioschemas-dde>, and the profile versions that are merged are controlled by the specifications_list.txt<https://github.com/BioSchemas/bioschemas-dde/blob/main/specifications_list.txt> file. This process can be automated with a GitHub Action that fires on commit.
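
To give a feel for the merge step, here is a minimal Python sketch. It is not the script in the Bioschemas DDE repository. It assumes that the list file holds one profile file path per line, and that each profile is a JSON-LD document with a dictionary-valued `@context` and an `@graph` array; the function names are hypothetical.

```python
import json
from pathlib import Path


def read_profile_list(list_file):
    """Return the non-empty, non-comment lines of the list file.

    Assumption: one profile path per line, '#' starts a comment line.
    """
    lines = Path(list_file).read_text(encoding="utf-8").splitlines()
    stripped = (line.strip() for line in lines)
    return [line for line in stripped if line and not line.startswith("#")]


def merge_profiles(profiles):
    """Combine several JSON-LD documents into a single document.

    Assumption: each document has a dict '@context' and a list '@graph'.
    Contexts are merged key by key; graph nodes are concatenated, and a
    node whose '@id' has already been seen is skipped.
    """
    merged = {"@context": {}, "@graph": []}
    seen_ids = set()
    for doc in profiles:
        merged["@context"].update(doc.get("@context", {}))
        for node in doc.get("@graph", []):
            node_id = node.get("@id")
            if node_id is not None:
                if node_id in seen_ids:
                    continue  # keep the first definition of each node
                seen_ids.add(node_id)
            merged["@graph"].append(node)
    return merged


def merge_from_list(list_file, base_dir="."):
    """Load every profile named in the list file and merge them."""
    docs = [
        json.loads((Path(base_dir) / name).read_text(encoding="utf-8"))
        for name in read_profile_list(list_file)
    ]
    return merge_profiles(docs)
```

A GitHub Action triggered on commit could call something like `merge_from_list("specifications_list.txt")` and write the result out with `json.dump`.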

To display a profile on the Bioschemas website, we need to strip out some characters from the JSON-LD file, e.g. `@` signs, and push the updated version of the JSON-LD file into the website repository. This should be possible through a GitHub Action. The screenshot below shows that we can strip out and display values from the resulting JSON file. We still need to fully develop the display template.
https://docs.google.com/document/d/1VB6qZOgtDfy_bbs-dDGyNMRyNDw5N-9JV4lekICUNQo/edit#heading=h.z3oj4x13fqiy
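
For illustration only, the character-stripping step could look something like the Python sketch below. It assumes that the only change needed is removing the leading `@` from JSON-LD keys (`@context`, `@graph`, `@id`, `@type`, ...); the function name is hypothetical and this is not the code behind the screenshot.

```python
def strip_at_signs(value):
    """Return a copy of a parsed JSON-LD value with '@' removed from keys.

    Assumption: only a single leading '@' on dictionary keys needs to go.
    String values are left untouched. If a dictionary holds both '@type'
    and 'type', the later key overwrites the earlier one.
    """
    if isinstance(value, dict):
        return {
            (key[1:] if key.startswith("@") else key): strip_at_signs(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [strip_at_signs(item) for item in value]
    return value
```

The input would be the result of `json.load` on a profile file, and the output can be written back with `json.dump` before it is pushed to the website repository. The original document is not modified.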

--
Alasdair J G Gray
Associate Professor in Computer Science,
School of Mathematical and Computer Sciences
Heriot-Watt University, Edinburgh, UK.

Email: A.J.G.Gray@hw.ac.uk<mailto:A.J.G.Gray@hw.ac.uk>
Web: http://www.macs.hw.ac.uk/~ajg33
ORCID: http://orcid.org/0000-0002-5711-4872
Office: Earl Mountbatten Building 1.39
Twitter: @gray_alasdair



Received on Tuesday, 23 November 2021 13:40:54 UTC